(Hybrid) Metagenomics workflow
Version 1

Workflow Type: Common Workflow Language
Work-in-progress

Workflow for (hybrid) metagenomic assembly and binning + GEMs

Accepts both Illumina and long reads (ONT/PacBio)

Binning workflow https://workflowhub.eu/workflows/64?version=11 (optional)

  • Metabat2/MaxBin2/SemiBin
  • DAS Tool
  • CheckM
  • BUSCO
  • GTDB-Tk

Genome-scale metabolic models workflow https://workflowhub.eu/workflows/372 (optional)

  • CarveMe (GEM generation)
  • MEMOTE (GEM test suite)
  • SMETANA (Species METabolic interaction ANAlysis)

Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default

All tool CWL files and other workflows can be found here:
https://gitlab.com/m-unlock/cwl

How to set up and use an UNLOCK workflow:
https://m-unlock.gitlab.io/docs/setup/setup.html


Inputs

ID Name Description Type
identifier Identifier used Identifier for this dataset used in this workflow
  • string
illumina_forward_reads Forward reads Forward sequence file path
  • File[]
illumina_reverse_reads Reverse reads Reverse sequence file path
  • File[]
pacbio_reads PacBio reads File with PacBio reads locally
  • File[]?
nanopore_reads Nanopore reads File with Nanopore reads locally
  • File[]?
filter_references Contamination reference file bbmap reference fasta file paths for contamination filtering
  • File[]?
use_reference_mapped_reads Keep mapped reads Continue with reads mapped to the given reference
  • boolean
keep_filtered_reads Keep filtered reads Keep filtered reads in the final output
  • boolean
deduplicate Deduplicate reads Remove exact duplicate reads with fastp
  • boolean?
kraken_database Kraken2 database Absolute path to the Kraken2 database location
  • Directory[]?
gtdbtk_data gtdbtk data directory Directory containing the GTDBTK repository
  • Directory?
busco_data BUSCO dataset Path to the BUSCO dataset download location
  • Directory?
ont_basecall_model ONT Basecalling model Basecalling model used with Guppy. Default: r941_min_high. Available: r941_trans, r941_flip213, r941_flip235, r941_min_fast, r941_min_high, r941_prom_fast, r941_prom_high. (required)
  • string?
pilon_fixlist Pilon fix list A comma-separated list of categories of issues to try to fix
  • string
metagenome When working with metagenomes Metagenome option for assemblers
  • boolean?
run_spades Use SPAdes Run with SPAdes assembler
  • boolean?
run_flye Use Flye Run with Flye assembler
  • boolean?
run_pilon Use Pilon Run Pilon assembly polishing with Illumina reads
  • boolean?
binning Run binning workflow Run with contig binning workflow
  • boolean?
run_GEM Run GEM workflow Run the community genome-scale metabolic models workflow on bins
  • boolean?
run_smetana Run SMETANA Run SMETANA (Species METabolic interaction ANAlysis)
  • boolean?
threads Number of threads Number of threads to use for computational processes
  • int?
memory Memory usage (MB) Maximum memory usage in megabytes
  • int?
destination Output Destination (prov only) Not used in this workflow. Output destination used for cwl-prov reporting only.
  • string?
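
The inputs listed above map onto a CWL job file. As a minimal sketch, the Python below builds such a job as JSON, which CWL runners accept alongside YAML. The sample identifier, all file paths, the boolean values, and the pilon_fixlist value are illustrative assumptions, and the helpers cwl_file, cwl_dir and build_job are hypothetical names that are not part of the workflow.

```python
import json


def cwl_file(path):
    """Wrap a path as a CWL File object."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """Wrap a path as a CWL Directory object."""
    return {"class": "Directory", "path": path}


def build_job(identifier, forward, reverse, nanopore=None, kraken_dbs=None, threads=None):
    """Build a job dict covering the non-optional inputs plus a few optional ones."""
    job = {
        "identifier": identifier,
        "illumina_forward_reads": [cwl_file(p) for p in forward],
        "illumina_reverse_reads": [cwl_file(p) for p in reverse],
        # Illustrative choices only; pick values that fit your own run.
        "use_reference_mapped_reads": False,
        "keep_filtered_reads": False,
        # Assumption: "all" is used here as an example Pilon fix category.
        "pilon_fixlist": "all",
    }
    # Optional inputs are added only when a value is supplied.
    if nanopore:
        job["nanopore_reads"] = [cwl_file(p) for p in nanopore]
    if kraken_dbs:
        job["kraken_database"] = [cwl_dir(p) for p in kraken_dbs]
    if threads is not None:
        job["threads"] = threads
    return job


# Hypothetical sample name and paths, for illustration only.
job = build_job(
    identifier="sample1",
    forward=["reads/sample1_R1.fastq.gz"],
    reverse=["reads/sample1_R2.fastq.gz"],
    nanopore=["reads/sample1_ont.fastq.gz"],
    threads=8,
)
job_text = json.dumps(job, indent=2)
print(job_text)
```

Saving job_text to a file gives a job document that can be passed to a CWL runner together with the workflow file. Which inputs carry defaults inside the workflow is not shown on this page, so the sketch sets every input whose type is not marked optional.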

Steps

ID Name Description
workflow_quality_illumina Quality and filtering workflow Quality assessment of illumina reads with rRNA filtering option
workflow_quality_nanopore Nanopore quality and filtering workflow Quality and filtering workflow for nanopore reads
nanopore_kraken2 Kraken2 Nanopore Taxonomic classification of nanopore FASTQ reads
illumina_kraken2 Kraken2 Illumina Taxonomic classification of illumina FASTQ reads
kraken2_compress Compress kraken2 Compress large kraken2 report file
kraken2_krona Krona Kraken2 Visualization of kraken2 with Krona
spades SPAdes assembly Genome assembly using SPAdes with Illumina/PacBio reads
compress_spades SPAdes compressed Compress the large Spades assembly output files
flye Nanopore Flye assembly De novo assembly of single-molecule reads with Flye
medaka Medaka polishing of assembly Medaka for polishing of assembled genome
metaquast_medaka Assembly evaluation Evaluation of polished assembly with metaQUAST
workflow_pilon Pilon workflow Illumina reads assembly polishing with Pilon
metaquast_pilon Illumina assembly evaluation Illumina evaluation of pilon polished assembly with metaQUAST
bbmap BBmap read mapping Illumina read mapping using BBmap on assembled contigs
sam_to_sorted_bam sam conversion to sorted bam Sam file conversion to a sorted indexed bam file
contig_read_counts Samtools idxstats Reports alignment summary statistics
workflow_binning Binning workflow Binning workflow to create bins
workflow_GEM GEM workflow CarveMe community genome-scale metabolic models workflow from bins
keep_readfilter_files_to_folder Read filtering output folder Preparation of read filtering output files to a specific output folder
readfilter_files_to_folder Read filtering output folder Preparation of read filtering output files to a specific output folder
kraken2_files_to_folder Kraken2 output folder Preparation of Kraken2 output files to a specific output folder
spades_files_to_folder SPAdes output to folder Preparation of SPAdes output files to a specific output folder
flye_files_to_folder Flye output folder Preparation of Flye output files to a specific output folder
metaquast_medaka_files_to_folder Nanopore metaQUAST output folder Preparation of metaQUAST output files to a specific output folder
medaka_files_to_folder Medaka output folder Preparation of Medaka output files to a specific output folder
metaquast_pilon_files_to_folder Illumina metaQUAST output folder Preparation of QUAST output files to a specific output folder
pilon_files_to_folder Pilon output folder Preparation of pilon output files to a specific output folder
assembly_files_to_folder Assembly output folder Preparation of assembly output files to a specific output folder
binning_files_to_folder Binning output to folder Preparation of binning output files and folders to a specific output folder
GEM_files_to_folder GEM workflow output to folder Preparation of GEM workflow output files and folders to a specific output folder
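
Several steps above depend on others by description: the GEM workflow builds models "from bins", SMETANA is listed as part of the GEM workflow, and Flye assembles single-molecule reads. The sketch below is a hypothetical pre-flight check (check_flags is not part of the workflow) that flags job settings which look inconsistent with those descriptions. Because the workflow's internal defaults are not shown on this page, it only reacts to values that are set explicitly.

```python
def check_flags(job):
    """Return warnings for explicitly set flags that look inconsistent.

    Assumption: dependencies are inferred from the step descriptions,
    not from the workflow source, so treat the result as advisory.
    """
    warnings = []
    # The GEM step is described as working on bins.
    if job.get("run_GEM") is True and job.get("binning") is False:
        warnings.append("run_GEM is true but binning is false")
    # SMETANA is listed under the GEM workflow.
    if job.get("run_smetana") is True and job.get("run_GEM") is False:
        warnings.append("run_smetana is true but run_GEM is false")
    # Flye is a long-read assembler, so it needs long reads as input.
    has_long_reads = bool(job.get("nanopore_reads") or job.get("pacbio_reads"))
    if job.get("run_flye") is True and not has_long_reads:
        warnings.append("run_flye is true but no long reads are given")
    return warnings


# Illustrative job fragment with a deliberately inconsistent combination.
example = {"run_GEM": True, "binning": False, "run_flye": True}
for message in check_flags(example):
    print("warning:", message)
```

Unset flags are left alone on purpose, so a job that relies on the workflow's own defaults produces no warnings here.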

Outputs

ID Name Description Type
read_filtering_output_keep Read filtering output Read filtering stats + filtered reads
  • Directory?
read_filtering_output Read filtering output Read filtering stats + filtered reads
  • Directory?
kraken2_output Kraken2 reports Kraken2 taxonomic classification reports
  • Directory?
assembly_output Assembly output Output from different assembly steps
  • Directory
binning_output Binning output Binning output folders
  • Directory?
gem_output Community GEM output Community GEM output folder
  • Directory?

Version History

Version 1 (earliest) Created 14th Jun 2022 at 09:14 by Bart Nijsse

Initial commit


Last updated: 7th Apr 2023 at 15:10