(Hybrid) Metagenomics workflow
Version 1

Workflow Type: Common Workflow Language
Work-in-progress

Workflow for (hybrid) metagenomic assembly and binning + GEMs

Accepts both Illumina and long reads (ONT/PacBio)

Binning workflow (optional): https://workflowhub.eu/workflows/64?version=11

  • MetaBAT2/MaxBin2/SemiBin
  • DAS Tool
  • CheckM
  • BUSCO
  • GTDB-Tk

Genome-scale metabolic models workflow (optional): https://workflowhub.eu/workflows/372

  • CarveMe (GEM generation)
  • MEMOTE (GEM test suite)
  • SMETANA (Species METabolic interaction ANAlysis)

Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default

All tool CWL files and other workflows can be found here:
https://gitlab.com/m-unlock/cwl

How to set up and use an UNLOCK workflow:
https://m-unlock.gitlab.io/docs/setup/setup.html
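
The linked setup guide is the authoritative reference. As rough orientation only: a CWL workflow is typically launched by handing a workflow file and a job (inputs) file to a CWL runner. The sketch below assumes cwltool as the runner and only builds the command line without executing it. The workflow file name, job file name, and output directory are hypothetical placeholders, not names taken from this page.

```python
import shlex
from pathlib import Path


def build_cwltool_command(workflow_cwl, job_file, outdir="results"):
    """Return the argument list for a cwltool run (nothing is executed here)."""
    return [
        "cwltool",
        "--outdir", str(Path(outdir)),  # directory where cwltool places final outputs
        str(workflow_cwl),              # hypothetical path to the workflow CWL file
        str(job_file),                  # job file holding the workflow inputs
    ]


# Hypothetical file names, for illustration only.
cmd = build_cwltool_command("workflow_metagenomics_assembly.cwl", "job.json")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the runner and the workflow files are installed as described in the setup guide.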

Inputs

| ID | Name | Description | Type |
| --- | --- | --- | --- |
| identifier | Identifier used | Identifier for this dataset used in this workflow | string |
| illumina_forward_reads | Forward reads | Forward sequence file path | File[] |
| illumina_reverse_reads | Reverse reads | Reverse sequence file path | File[] |
| pacbio_reads | PacBio reads | File with PacBio reads locally | File[]? |
| nanopore_reads | Nanopore reads | File with Nanopore reads locally | File[]? |
| filter_references | Contamination reference file | BBMap reference FASTA file paths for contamination filtering | File[]? |
| use_reference_mapped_reads | Keep mapped reads | Continue with reads mapped to the given reference | boolean |
| keep_filtered_reads | Keep filtered reads | Keep filtered reads in the final output | boolean |
| deduplicate | Deduplicate reads | Remove exact duplicate reads with fastp | boolean? |
| kraken_database | Kraken2 database | Absolute path to the Kraken2 database location | Directory[]? |
| gtdbtk_data | GTDB-Tk data directory | Directory containing the GTDB-Tk repository | Directory? |
| busco_data | BUSCO dataset | Path to the BUSCO dataset download location | Directory? |
| ont_basecall_model | ONT basecalling model | Basecalling model used with Guppy (default: r941_min_high). Available: r941_trans, r941_flip213, r941_flip235, r941_min_fast, r941_min_high, r941_prom_fast, r941_prom_high. (required) | string? |
| pilon_fixlist | Pilon fix list | A comma-separated list of categories of issues to try to fix | string |
| metagenome | When working with metagenomes | Metagenome option for assemblers | boolean? |
| run_spades | Use SPAdes | Run with SPAdes assembler | boolean? |
| run_flye | Use Flye | Run with Flye assembler | boolean? |
| run_pilon | Use Pilon | Run with Pilon Illumina assembly polishing | boolean? |
| binning | Run binning workflow | Run with contig binning workflow | boolean? |
| run_GEM | Run GEM workflow | Run the community genome-scale metabolic models workflow on bins | boolean? |
| run_smetana | Run SMETANA | Run SMETANA (Species METabolic interaction ANAlysis) | boolean? |
| threads | Number of threads | Number of threads to use for computational processes | int? |
| memory | Memory usage (MB) | Maximum memory usage in megabytes | int? |
| destination | Output Destination (prov only) | Not used in this workflow. Output destination used for cwl-prov reporting only. | string? |
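
The inputs listed above are normally supplied to a CWL runner as a job file (JSON or YAML). The following is a minimal sketch of assembling such a job object in Python. The input IDs come from the table; every file path, the sample name, and the chosen values (including `"all"` for `pilon_fixlist`) are assumptions made for illustration. Inputs whose type has no trailing `?` are treated here as ones that should be provided, although the workflow definition may already give some of them defaults.

```python
import json


def cwl_file(path):
    """CWL job-file representation of a File input."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """CWL job-file representation of a Directory input."""
    return {"class": "Directory", "path": path}


# Input IDs whose type in the table has no trailing "?".
REQUIRED = (
    "identifier",
    "illumina_forward_reads",
    "illumina_reverse_reads",
    "use_reference_mapped_reads",
    "keep_filtered_reads",
    "pilon_fixlist",
)


def build_job(identifier, forward, reverse, **optional):
    """Build a job object; extra keyword arguments are optional inputs."""
    job = {
        "identifier": identifier,
        "illumina_forward_reads": [cwl_file(p) for p in forward],
        "illumina_reverse_reads": [cwl_file(p) for p in reverse],
        "use_reference_mapped_reads": False,  # assumed value
        "keep_filtered_reads": False,         # assumed value
        "pilon_fixlist": "all",               # assumed value
    }
    job.update(optional)
    return job


def missing_required(job):
    """Return the non-optional input IDs that the job object lacks."""
    return [key for key in REQUIRED if key not in job]


# Hypothetical sample name and paths.
job = build_job(
    "sample1",
    ["reads/sample1_R1.fastq.gz"],
    ["reads/sample1_R2.fastq.gz"],
    nanopore_reads=[cwl_file("reads/sample1_ont.fastq.gz")],
    kraken_database=[cwl_dir("/data/kraken2_db")],
    run_spades=True,
    run_flye=True,
    threads=8,
    memory=16000,
)
print(json.dumps(job, indent=2))
```

Writing the printed JSON to a file gives a job file that a CWL runner can read alongside the workflow.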

Steps

| ID | Name | Description |
| --- | --- | --- |
| workflow_quality_illumina | Quality and filtering workflow | Quality assessment of Illumina reads with rRNA filtering option |
| workflow_quality_nanopore | Nanopore quality and filtering workflow | Quality and filtering workflow for Nanopore reads |
| nanopore_kraken2 | Kraken2 Nanopore | Taxonomic classification of Nanopore FASTQ reads |
| illumina_kraken2 | Kraken2 Illumina | Taxonomic classification of Illumina FASTQ reads |
| kraken2_compress | Compress Kraken2 | Compress large Kraken2 report file |
| kraken2_krona | Krona Kraken2 | Visualization of Kraken2 results with Krona |
| spades | SPAdes assembly | Genome assembly using SPAdes with Illumina/PacBio reads |
| compress_spades | SPAdes compressed | Compress the large SPAdes assembly output files |
| flye | Nanopore Flye assembly | De novo assembly of single-molecule reads with Flye |
| medaka | Medaka polishing of assembly | Medaka for polishing of assembled genome |
| metaquast_medaka | Assembly evaluation | Evaluation of polished assembly with metaQUAST |
| workflow_pilon | Pilon workflow | Illumina reads assembly polishing with Pilon |
| metaquast_pilon | Illumina assembly evaluation | Evaluation of the Pilon-polished assembly with metaQUAST |
| bbmap | BBMap read mapping | Illumina read mapping using BBMap on assembled contigs |
| sam_to_sorted_bam | SAM conversion to sorted BAM | SAM file conversion to a sorted, indexed BAM file |
| contig_read_counts | Samtools idxstats | Reports alignment summary statistics |
| workflow_binning | Binning workflow | Binning workflow to create bins |
| workflow_GEM | GEM workflow | CarveMe community genome-scale metabolic models workflow from bins |
| keep_readfilter_files_to_folder | Read filtering output folder | Preparation of read filtering output files to a specific output folder |
| readfilter_files_to_folder | Read filtering output folder | Preparation of read filtering output files to a specific output folder |
| kraken2_files_to_folder | Kraken2 output folder | Preparation of Kraken2 output files to a specific output folder |
| spades_files_to_folder | SPAdes output to folder | Preparation of SPAdes output files to a specific output folder |
| flye_files_to_folder | Flye output folder | Preparation of Flye output files to a specific output folder |
| metaquast_medaka_files_to_folder | Nanopore metaQUAST output folder | Preparation of metaQUAST output files to a specific output folder |
| medaka_files_to_folder | Medaka output folder | Preparation of Medaka output files to a specific output folder |
| metaquast_pilon_files_to_folder | Illumina metaQUAST output folder | Preparation of metaQUAST output files to a specific output folder |
| pilon_files_to_folder | Pilon output folder | Preparation of Pilon output files to a specific output folder |
| assembly_files_to_folder | Assembly output folder | Preparation of assembly output files to a specific output folder |
| binning_files_to_folder | Binning output to folder | Preparation of binning output files and folders to a specific output folder |
| GEM_files_to_folder | GEM workflow output to folder | Preparation of GEM workflow output files and folders to a specific output folder |

Outputs

| ID | Name | Description | Type |
| --- | --- | --- | --- |
| read_filtering_output_keep | Read filtering output | Read filtering stats + filtered reads | Directory? |
| read_filtering_output | Read filtering output | Read filtering stats + filtered reads | Directory? |
| kraken2_output | Kraken2 reports | Kraken2 taxonomic classification reports | Directory? |
| assembly_output | Assembly output | Output from different assembly steps | Directory |
| binning_output | Binning output | Binning output folders | Directory? |
| gem_output | Community GEM output | Community GEM output folder | Directory? |
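
Most outputs above are optional (`Directory?`), so a given run may leave some of them empty. CWL runners such as cwltool report the results as a JSON output object in which an optional output that was not produced is `null`. The sketch below summarises such an object. The output IDs are taken from the table, while the example object and its `location` value are made up for illustration.

```python
import json

# Output IDs as listed in the outputs table.
OUTPUT_IDS = (
    "read_filtering_output_keep",
    "read_filtering_output",
    "kraken2_output",
    "assembly_output",
    "binning_output",
    "gem_output",
)


def produced_outputs(output_object):
    """Map each output ID to its directory location, or None if not produced."""
    summary = {}
    for output_id in OUTPUT_IDS:
        entry = output_object.get(output_id)
        if entry is None:
            summary[output_id] = None
        else:
            # Directory objects carry a "location" URI and/or a local "path".
            summary[output_id] = entry.get("location") or entry.get("path")
    return summary


# Hypothetical output object from a run with binning and GEMs switched off.
example = json.loads("""
{
  "assembly_output": {"class": "Directory", "location": "file:///tmp/results/assembly"},
  "kraken2_output": {"class": "Directory", "location": "file:///tmp/results/kraken2"},
  "binning_output": null,
  "gem_output": null
}
""")

summary = produced_outputs(example)
for output_id, location in summary.items():
    print(f"{output_id}: {location if location else 'not produced'}")
```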

Version History

Version 1 (earliest) Created 14th Jun 2022 at 09:14 by Bart Nijsse

Initial commit


Created: 14th Jun 2022 at 09:14

Last updated: 7th Apr 2023 at 15:10
