Work-in-progress

Metagenomics workflow, from raw Illumina reads (optionally with PacBio reads) to taxonomically classified bins.

Steps:

  • workflow_quality.cwl:

    • FastQC (quality control)
    • fastp (quality trimming)
    • BBMap (contamination filter)

  • SPAdes (Assembly)

  • QUAST (Assembly quality report)

  • BBMap (Read mapping to assembly)

  • MetaBat2 (binning)

  • CheckM (bin completeness and contamination)

  • GTDB-Tk (bin taxonomic classification)

Inputs

ID | Name | Description | Type
identifier | Identifier used | Identifier for this dataset used in this workflow | string
forward_reads | Forward reads | Local forward read sequence file(s) | File[]
reverse_reads | Reverse reads | Local reverse read sequence file(s) | File[]
threads | Number of threads | Number of threads to use for computational processes | int?
memory | Memory usage (MB) | Maximum memory usage in megabytes | int?
pacbio_reads | PacBio reads | Local file(s) with PacBio reads | File[]?
bbmap_reference | Contamination reference file | BBMap reference FASTA file for contamination filtering | string
run_gtdbtk | Run GTDB-Tk | Run GTDB-Tk taxonomic bin classification when true | boolean
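
To illustrate how these inputs fit together, the following minimal sketch builds a CWL input object in Python and writes it out as JSON (CWL runners accept input objects in JSON or YAML). Only the input IDs and types come from the table above. The file paths, the identifier value, the thread and memory numbers, and the helper names `cwl_file` and `build_job` are hypothetical placeholders, and the check that forward and reverse files pair up is an assumption about how the reads are supplied.

```python
import json


def cwl_file(path):
    """Return a CWL File object for a local path."""
    return {"class": "File", "path": path}


def build_job(identifier, forward, reverse, bbmap_reference, run_gtdbtk,
              threads=None, memory=None, pacbio=None):
    """Assemble an input object keyed by the input IDs listed in the table."""
    # Assumption: reads are paired, one reverse file per forward file.
    if len(forward) != len(reverse):
        raise ValueError("forward_reads and reverse_reads must have equal length")
    job = {
        "identifier": identifier,
        "forward_reads": [cwl_file(p) for p in forward],
        "reverse_reads": [cwl_file(p) for p in reverse],
        "bbmap_reference": bbmap_reference,  # typed as string, not File
        "run_gtdbtk": run_gtdbtk,
    }
    # Optional inputs (int? and File[]?) are omitted when not supplied.
    if threads is not None:
        job["threads"] = threads
    if memory is not None:
        job["memory"] = memory  # megabytes
    if pacbio:
        job["pacbio_reads"] = [cwl_file(p) for p in pacbio]
    return job


# Hypothetical example values, for illustration only.
job = build_job(
    identifier="sample01",
    forward=["reads/sample01_R1.fastq.gz"],
    reverse=["reads/sample01_R2.fastq.gz"],
    bbmap_reference="references/contamination.fasta",
    run_gtdbtk=False,
    threads=8,
    memory=16000,
)
print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and handed to a CWL runner together with the workflow file.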

Steps

ID | Name | Description
workflow_quality | Quality and filtering workflow | Quality assessment of Illumina reads with rRNA filtering option
workflow_spades | SPAdes assembly | Genome assembly using SPAdes with Illumina/PacBio reads
workflow_quast | QUAST workflow | Genome assembly quality assessment using QUAST
workflow_bbmap | BBMap read mapping | Illumina read mapping using BBMap
workflow_sam_to_sorted_bam | SAM conversion to sorted BAM | Conversion of the SAM file to a sorted, indexed BAM file
workflow_metabat2_contig_depths | Depth file from MetaBat2 | Execution of MetaBat2 to obtain the depth file used in the binning process
workflow_metabat2 | Binning process | Binning procedure using MetaBat2
workflow_checkm | CheckM | CheckM bin quality assessment
workflow_getunbinned | Unbinned contigs | Get unbinned contigs FASTA
workflow_gtdbtk | GTDB-Tk | Taxonomic assignment of bins with GTDB-Tk
workflow_compress_gtdbtk | Compress GTDB-Tk | Compress GTDB-Tk output folder
compress_spades | SPAdes compressed | Compress the large SPAdes files
spades_files_to_folder | SPAdes output | Preparation of SPAdes output files to a specific output folder
quast_files_to_folder | QUAST output | Preparation of QUAST output files to a specific output folder
sorted_bam_files_to_folder | BAM output | Preparation of BAM output files to a specific output folder
metabat_files_to_folder | MetaBat2 output | Preparation of MetaBat2 output files + unbinned contigs to a specific output folder
checkm_files_to_folder | CheckM output | Preparation of CheckM output files to a specific output folder
gtdbtk_files_to_folder | GTDB-Tk output | Preparation of GTDB-Tk output files to a specific output folder
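
One way to picture how the analysis steps chain together is as a dependency graph. The sketch below orders them with Python's standard `graphlib` module (Python 3.9 or later). The step IDs are taken from the table above, but the dependency edges are assumptions inferred from the step descriptions and have not been read from the CWL file itself. The compression and `*_files_to_folder` steps are left out for brevity.

```python
from graphlib import TopologicalSorter

# Maps each step to the steps it is ASSUMED to depend on.
# These edges are inferred from the descriptions, not from the CWL source.
ASSUMED_DEPENDENCIES = {
    "workflow_quality": set(),
    "workflow_spades": {"workflow_quality"},
    "workflow_quast": {"workflow_spades"},
    "workflow_bbmap": {"workflow_quality", "workflow_spades"},
    "workflow_sam_to_sorted_bam": {"workflow_bbmap"},
    "workflow_metabat2_contig_depths": {"workflow_sam_to_sorted_bam"},
    "workflow_metabat2": {"workflow_spades", "workflow_metabat2_contig_depths"},
    "workflow_checkm": {"workflow_metabat2"},
    "workflow_getunbinned": {"workflow_spades", "workflow_metabat2"},
    "workflow_gtdbtk": {"workflow_metabat2"},
    "workflow_compress_gtdbtk": {"workflow_gtdbtk"},
}


def execution_order(dependencies):
    """Return one valid order in which every step follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(ASSUMED_DEPENDENCIES)
for position, step in enumerate(order, start=1):
    print(position, step)
```

Steps with no path between them (for example `workflow_quast` and `workflow_bbmap` under these assumed edges) may appear in either relative order, which reflects that a CWL runner is free to schedule independent steps in parallel.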

Outputs

ID | Name | Description | Type
fastqc_output | FastQC | Quality report produced by FastQC | Directory
filter_output | Filtered reads | Output folder with the filtered reads | Directory
spades_output | SPAdes | Metagenome assembly output by SPAdes | Directory
quast_output | QUAST | QUAST analysis output folder | Directory
bam_output | BAM files | Mapping results in indexed BAM format | Directory
metabat2_output | MetaBat2 | MetaBat2 output directory | Directory
checkm_output | CheckM | CheckM output directory | Directory
gtdbtk_output | GTDB-Tk | GTDB-Tk output directory | Directory
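
After a run, a CWL runner reports its results as a JSON output object keyed by output ID. The sketch below checks such an object against the eight output IDs listed above and reports any that are absent or not of class `Directory`. The example object and its `location` value are hypothetical. Whether `gtdbtk_output` is empty when `run_gtdbtk` is false is not stated here, so the check simply reports it as missing in that case.

```python
# Output IDs as listed in the Outputs table.
EXPECTED_OUTPUTS = [
    "fastqc_output",
    "filter_output",
    "spades_output",
    "quast_output",
    "bam_output",
    "metabat2_output",
    "checkm_output",
    "gtdbtk_output",
]


def missing_outputs(output_object):
    """Return expected output IDs that are absent or not a CWL Directory."""
    missing = []
    for output_id in EXPECTED_OUTPUTS:
        value = output_object.get(output_id)
        if not isinstance(value, dict) or value.get("class") != "Directory":
            missing.append(output_id)
    return missing


# Hypothetical, heavily truncated output object for illustration only.
example = {
    "fastqc_output": {"class": "Directory", "location": "file:///tmp/run/fastqc"},
    "gtdbtk_output": None,
}
print(missing_outputs(example))
```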

Created: 15th Oct 2020 at 14:55

Last updated: 7th Jun 2021 at 18:35

Last used: 24th Jun 2021 at 00:44

Version History

Version 10 (latest) Created 7th Jun 2021 at 18:34 by Jasper Koehorst

Version 9 Created 1st Jun 2021 at 11:43 by Jasper Koehorst

Version 8 Created 6th May 2021 at 07:03 by Jasper Koehorst

Version 7 Created 8th Jan 2021 at 10:15 by Jasper Koehorst
