Workflow for Illumina Quality Control and Filtering
Version 1

Workflow Type: Common Workflow Language
Stable

Multiple paired datasets are merged into a single paired dataset.

Summary:

  • FastQC on raw data files
  • fastp for read quality trimming
  • BBDuk for PhiX and (optional) rRNA filtering
  • Kraken2 for taxonomic classification of reads (optional)
  • BBMap for (contamination) filtering using given references (optional)
  • FastQC on filtered (merged) data

Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default

All tool CWL files and other workflows can be found here:
Tools: https://gitlab.com/m-unlock/cwl
Workflows: https://gitlab.com/m-unlock/cwl/workflows

How to set up and use an UNLOCK workflow:
https://m-unlock.gitlab.io/docs/setup/setup.html

Inputs

ID | Name | Description | Type
identifier | identifier used | Identifier for this dataset used in this workflow | string
threads | Number of threads | Number of threads to use for computational processes | int?
memory | Maximum memory in MB | Maximum memory usage in megabytes | int?
forward_reads | Forward reads | Local forward sequence FASTQ file(s) | File[]
reverse_reads | Reverse reads | Local reverse sequence FASTQ file(s) | File[]
skip_fastqc_before | Skip FastQC before | Skip FastQC analyses of raw input data (default false) | boolean?
filter_rrna | filter rRNA | Optionally remove rRNA sequences from the reads (default false) | boolean?
filter_references | Filter reference file(s) | Reference FASTA file(s) for filtering | File[]?
deduplicate | Deduplicate reads | Remove exact duplicate reads with fastp | boolean?
kraken2_confidence | Kraken2 confidence threshold | Confidence score threshold (default 0.0); must be in the range [0, 1] | float?
kraken2_database | Kraken2 database | Kraken2 database location; multiple databases are possible | Directory[]?
keep_reference_mapped_reads | Keep mapped reads | Keep the reads that map to the given reference (default false) | boolean?
prepare_reference | Prepare references | Combine references into a single FASTA file with unique headers (default true). When false, a single reference FASTA file with unique headers is expected | boolean
step | Output Step number | Step number for output folder numbering (default 1) | int?
destination | Output Destination | Optional output destination, only used for cwl-prov reporting | string?
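
As a rough illustration of how these inputs fit together, the Python sketch below builds a job file for a CWL runner. Only the input IDs and types come from the table above. The helper names (`cwl_file`, `build_job`), the sample identifier, and the file names are hypothetical, and the check that forward and reverse lists have equal length is an assumption about paired data, not something the workflow is documented to enforce. CWL job files may be written as JSON, with `File` inputs given as objects carrying `class` and `path`.

```python
import json


def cwl_file(path):
    """Return a CWL File object for a local path."""
    return {"class": "File", "path": path}


def build_job(identifier, forward, reverse, kraken2_confidence=None, **optional):
    """Build a job dictionary keyed by the input IDs listed above.

    Hypothetical helper: the validation rules here are assumptions.
    """
    # Assumption: paired data means one reverse file per forward file.
    if len(forward) != len(reverse):
        raise ValueError("forward_reads and reverse_reads must have the same length")

    job = {
        "identifier": identifier,
        "forward_reads": [cwl_file(p) for p in forward],
        "reverse_reads": [cwl_file(p) for p in reverse],
        # prepare_reference is the only boolean input not marked optional.
        "prepare_reference": True,
    }

    if kraken2_confidence is not None:
        # The input description requires a value in [0, 1].
        if not 0.0 <= kraken2_confidence <= 1.0:
            raise ValueError("kraken2_confidence must be in the range [0, 1]")
        job["kraken2_confidence"] = kraken2_confidence

    # Remaining optional inputs (threads, memory, filter_rrna, ...) pass through.
    job.update(optional)
    return job


# Two paired datasets with hypothetical file names.
job = build_job(
    "sample_A",
    forward=["a_1.fq.gz", "b_1.fq.gz"],
    reverse=["a_2.fq.gz", "b_2.fq.gz"],
    kraken2_confidence=0.1,
    threads=4,
    filter_rrna=True,
)
job_json = json.dumps(job, indent=2)
print(job_json)
```

The printed JSON could be saved to a file and passed to a CWL runner together with the workflow's CWL file.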

Steps

ID | Name | Description
fastqc_illumina_before | FastQC before | Quality assessment and report of reads
fastq_merge_fwd | Merge forward reads | Merge multiple forward FASTQ reads into a single file
fastq_merge_rev | Merge reverse reads | Merge multiple reverse FASTQ reads into a single file
fastq_fwd_array_to_file | Fwd reads array to file | Convert a single-file forward reads array to a file object
fastq_rev_array_to_file | Rev reads array to file | Convert a single-file reverse reads array to a file object
fastp | fastp | Read quality filtering and (barcode) trimming
rrna_filter | rRNA filter (bbduk) | Filters rRNA sequences from reads using BBDuk
illumina_quality_kraken2 | Kraken2 | Taxonomic classification of FASTQ reads
illumina_quality_kraken2_krona | Krona | Visualization of Kraken2 classification with Krona
prepare_fasta_db | Prepare references | Prepare references as a single FASTA file with unique headers
reference_filter_illumina | Reference read mapping | Map reads against references using BBMap
phix_filter | PhiX filter (bbduk) | Filters Illumina spike-in PhiX sequences from reads using BBDuk
fastqc_illumina_after | FastQC after | Quality assessment and report of reads
reports_files_to_folder | Reports to folder | Preparation of fastp output files to a specific output folder
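
Several of these steps are optional or depend on the inputs. The Python sketch below guesses which steps would run for a given set of inputs. It is not taken from the workflow definition: the function `planned_steps` is hypothetical, and both the conditions and the ordering are assumptions inferred from the Summary and from the input and step descriptions (for example, that the merge steps apply when more than one file pair is supplied and the array-to-file steps apply otherwise).

```python
def planned_steps(n_pairs, skip_fastqc_before=False, filter_rrna=False,
                  kraken2_database=None, filter_references=None,
                  prepare_reference=True):
    """Return the step IDs expected to run, in an assumed order."""
    steps = []

    if not skip_fastqc_before:
        steps.append("fastqc_illumina_before")

    # Assumption: merging applies only when several paired datasets are given.
    if n_pairs > 1:
        steps += ["fastq_merge_fwd", "fastq_merge_rev"]
    else:
        steps += ["fastq_fwd_array_to_file", "fastq_rev_array_to_file"]

    steps.append("fastp")

    # rRNA filtering is optional; PhiX filtering is not marked optional.
    if filter_rrna:
        steps.append("rrna_filter")
    steps.append("phix_filter")

    # Assumption: Kraken2 and Krona run only when a database is supplied.
    if kraken2_database:
        steps += ["illumina_quality_kraken2", "illumina_quality_kraken2_krona"]

    # Assumption: reference filtering runs only when references are supplied.
    if filter_references:
        if prepare_reference:
            steps.append("prepare_fasta_db")
        steps.append("reference_filter_illumina")

    steps += ["fastqc_illumina_after", "reports_files_to_folder"]
    return steps


# A single paired dataset with all optional inputs left at their defaults.
for step_id in planned_steps(n_pairs=1):
    print(step_id)
```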

Outputs

ID | Name | Description | Type
reports_folder | Filtering reports folder | Folder containing all reports of filtering and quality control | Directory
QC_forward_reads | Filtered forward read | Filtered forward read | File
QC_reverse_reads | Filtered reverse read | Filtered reverse read | File
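
To show how these outputs might be consumed, the Python sketch below pulls the three output locations out of a runner's result object. It assumes the runner reports outputs as a JSON object of CWL `File` and `Directory` objects, as cwltool does on standard output. The helper `output_paths` and the example paths are hypothetical; only the output IDs come from the table above.

```python
import json

OUTPUT_IDS = ("reports_folder", "QC_forward_reads", "QC_reverse_reads")


def output_paths(cwl_output):
    """Map each workflow output ID to its path, falling back to its location."""
    paths = {}
    for output_id in OUTPUT_IDS:
        obj = cwl_output[output_id]
        # Runners may report "path", "location", or both.
        paths[output_id] = obj.get("path") or obj.get("location")
    return paths


# Hypothetical result object; real file names depend on the workflow run.
example_output = json.loads("""
{
  "reports_folder": {"class": "Directory", "location": "file:///tmp/out/reports"},
  "QC_forward_reads": {"class": "File", "path": "/tmp/out/sample_A_1.fq.gz"},
  "QC_reverse_reads": {"class": "File", "path": "/tmp/out/sample_A_2.fq.gz"}
}
""")

for output_id, path in output_paths(example_output).items():
    print(output_id, path)
```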

Version History

Version 1 (earliest) Created 21st Apr 2022 at 14:00 by Bart Nijsse

Initial commit



Created: 21st Apr 2022 at 14:00

Last updated: 2nd Feb 2023 at 15:13
