Expertise: Bioinformatics, Computer Science, Data Management, Genetics, Genomics, Machine Learning, Metagenomics, NGS, Scientific workflow development, Software Engineering
Tools: Databases, Galaxy, Genomics, Jupyter notebook, Machine Learning, Nextflow, nf-core, PCR, Perl, Python, R, rtPCR, Snakemake, Transcriptomics, Virology, Web, Web services, Workflows
Dad, husband and PhD. Scientist, technologist and engineer. Bibliophile. Philomath. Passionate about science, medicine, research, computing and all things geeky!
Teams: MAB - ATGC
Organizations: Centre National de la Recherche Scientifique (CNRS)

Roles: Project Coordinator
Expertise: Bioinformatics, Genomics, algorithm, Machine Learning, Metagenomics, NGS, Computer Science
Tools: Transcriptomics, Genomics, Python, C/C++, Web services, Workflows
Workflow for nanopore read quality control and contamination filtering.
- FastQC before filtering (read quality control)
- Kraken2 taxonomic read classification
- Minimap2 read filtering based on given references
- FastQC after filtering (read quality control)
All tool CWL files and other workflows can be found here:
Tools: https://git.wur.nl/unlock/cwl/-/tree/master/cwl
Workflows: https://git.wur.nl/unlock/cwl/-/tree/master/cwl/workflows
WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default ...
Type: Common Workflow Language
Creators: Bart Nijsse, Jasper Koehorst, Germán Royval
Submitter: Bart Nijsse
Workflow for sequencing with ONT Nanopore data, from basecalled reads to (meta)assembly and binning
- Workflow Nanopore Quality
- Kraken2 taxonomic classification of FASTQ reads
- Flye (de-novo assembly)
- Medaka (assembly polishing)
- metaQUAST (assembly quality reports)
When Illumina reads are provided:
- Workflow Illumina Quality: https://workflowhub.eu/workflows/336?version=1
- Assembly polishing with Pilon
- Workflow binning: https://workflowhub.eu/workflows/64?version=11
- Metabat2 ...
Type: Common Workflow Language
Creators: Bart Nijsse, Jasper Koehorst, Germán Royval
Submitter: Jasper Koehorst
Workflow for Illumina paired read quality control, trimming and filtering. Multiple paired datasets will be merged into a single paired dataset. Summary:
- FastQC on raw data files
- fastp for read quality trimming
- BBduk for phiX and (optional) rRNA filtering
- Kraken2 for taxonomic classification of reads (optional)
- BBmap for (contamination) filtering using given references (optional)
- FastQC on filtered (merged) data
All tool CWL files and other workflows can be found here: Tools: ...
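The merging of multiple paired datasets into one paired dataset, as described for the Illumina quality workflow, can be pictured with a minimal Python sketch. This is not the workflow's CWL implementation: the helper name `merge_paired_fastq` and all file names are hypothetical, and the sketch assumes that each input file ends with a newline (or is a complete gzip file) and that forward and reverse files are supplied as matching pairs.

```python
import shutil
from pathlib import Path


def merge_paired_fastq(pairs, out_forward, out_reverse):
    """Concatenate several (forward, reverse) FASTQ pairs into one pair.

    Hypothetical helper for illustration only. Files are appended in the
    order given, so the n-th read of the merged forward file still matches
    the n-th read of the merged reverse file.
    """
    with open(out_forward, "wb") as fwd_out, open(out_reverse, "wb") as rev_out:
        for forward, reverse in pairs:
            # Byte-wise append; also valid for gzip, which allows
            # concatenated members in a single file.
            with open(forward, "rb") as src:
                shutil.copyfileobj(src, fwd_out)
            with open(reverse, "rb") as src:
                shutil.copyfileobj(src, rev_out)
    return Path(out_forward), Path(out_reverse)
```

A call such as `merge_paired_fastq([("a_1.fq", "a_2.fq"), ("b_1.fq", "b_2.fq")], "merged_1.fq", "merged_2.fq")` would produce one forward and one reverse file. Keeping both files in the same pair order is the design point, since downstream paired-end tools generally rely on read order to match mates.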
Bootstrapping-for-BQSR @ NCI-Gadi is a pipeline for bootstrapping a variant resource to enable GATK base quality score recalibration (BQSR) for non-model organisms that lack a publicly available variant resource. This implementation is optimised for the National Computational Infrastructure's Gadi HPC. Multiple rounds of bootstrapping can be performed. Users can use Fastq-to-bam @ NCI-Gadi and Germline-ShortV @ NCI-Gadi to ...
Local Cromwell implementation of GATK4 germline variant calling pipeline
See the GATK website for more information on this toolset
Assumptions
- Using hg38 human reference genome build
- Running 'locally' i.e. not using HPC/SLURM scheduling, or containers. This repo was specifically tested on Pawsey Nimbus 16 CPU, 64GB RAM virtual machine, primarily running in the /data volume storage partition.
- Starting from short-read Illumina paired-end fastq ...
Fastq-to-BAM @ NCI-Gadi is a genome alignment workflow that takes raw FASTQ files, aligns them to a reference genome and outputs analysis-ready BAM files. This workflow is designed for the National Computational Infrastructure's (NCI) Gadi supercomputer, leveraging multiple nodes on NCI Gadi to run all stages of the workflow in parallel, either massively parallel using the scatter-gather approach or parallel by sample. It consists of a number of stages and follows the Broad Institute's best practice ...
Type: Shell Script
Creators: Tracy Chew, Rosemarie Sadsad, Georgina Samaha, Cali Willet, Andrey Bliznyuk, Ben Menadue
Submitter: Georgina Samaha
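The scatter-gather approach mentioned for Fastq-to-BAM @ NCI-Gadi can be illustrated with a small Python sketch. It is only a single-machine analogy under stated assumptions: the function names (`scatter`, `process_chunk`, `scatter_gather`) are hypothetical, threads stand in for the HPC nodes, and the per-chunk task is a placeholder rather than a real alignment step.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Placeholder per-chunk task (stands in for processing a block of reads)."""
    return [read.upper() for read in chunk]


def scatter_gather(items, n_chunks, worker=process_chunk):
    """Run worker on each chunk in parallel, then merge results in input order."""
    chunks = scatter(items, n_chunks)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        # Executor.map returns results in the order the chunks were submitted.
        results = list(pool.map(worker, chunks))
    return [item for chunk_result in results for item in chunk_result]
```

For example, `scatter_gather(["acgt", "ttga", "ccat"], 2)` splits the three inputs into two chunks, processes them concurrently and returns the merged results in the original order. The "parallel by sample" alternative described above corresponds to the special case where each chunk is one sample.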
SLURM HPC Cromwell implementation of GATK4 germline variant calling pipeline
See the GATK website for more information on this toolset
Assumptions
- Using hg38 human reference genome build
- Running using HPC/SLURM scheduling. This repo was specifically tested on Pawsey Zeus machine, primarily running in the /scratch partition.
- Starting from short-read Illumina paired-end fastq files as input
Dependencies
The following versions have been ...
Germline-ShortV @ NCI-Gadi is an implementation of the Broad Institute's best practice workflow for germline short variant discovery. This implementation is optimised for the National Computational Infrastructure's Gadi HPC, utilising scatter-gather parallelism to enable use of multiple nodes with high CPU or memory efficiency. This workflow requires sample BAM files, which can be generated using the Fastq-to-bam @ NCI-Gadi pipeline. Germline-ShortV can be applied ...
Type: Shell Script
Creators: Rosemarie Sadsad, Georgina Samaha, Tracy Chew, Cali Willet
Submitter: Tracy Chew
ORSON combines state-of-the-art tools for annotation processes within a Nextflow pipeline: sequence similarity search (PLAST, BLAST or Diamond), functional annotation retrieval (BeeDeeM) and functional prediction (InterProScan). When required, BUSCO completeness evaluation and eggNOG Orthogroup annotation can be activated. While ORSON results can be analyzed on the command line, they are also compatible with the BlastViewer and Blast2GO graphical tools.
Type: Nextflow
Creators: Cyril Noel, Alexandre Cormier, Patrick Durand, Laura Leroi, Pierre Cuzin
Submitter: Patrick Durand