Workflow Type: Common Workflow Language
Work-in-progress

Workflow for Metagenomics from raw reads to annotated bins.
Summary

  • MetaBAT2 (binning)
  • CheckM (bin completeness and contamination)
  • GTDB-Tk (bin taxonomic classification)
  • BUSCO (bin completeness)

All tool CWL files and other workflows can be found here:
Tools: https://git.wur.nl/unlock/cwl/-/tree/master/cwl
Workflows: https://git.wur.nl/unlock/cwl/-/tree/master/cwl/workflows

The dependencies are accessible from https://unlock-icat.irods.surfsara.nl (anonymous, anonymous)
and/or
can be installed using the conda / pip environments shown in https://git.wur.nl/unlock/docker/-/blob/master/kubernetes/scripts/setup.sh

Inputs

ID | Name | Description | Type
identifier | Identifier used | Identifier for this dataset used in this workflow | string
assembly | Assembly fasta | Assembly in fasta format | File
bam_file | Bam file | Mapping file in sorted bam format containing reads mapped to the assembly | File
threads | number of threads | Number of threads to use for computational processes | int?
memory | memory usage (mb) | Maximum memory usage in megabytes | int?
run_gtdbtk | Run GTDB-Tk | Run GTDB-Tk taxonomic bin classification when true | boolean
busco_dataset | BUSCO dataset | Path to the BUSCO dataset download location | string
step | CWL base step number | Step number for order of steps | int?
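
The inputs above map onto a CWL job (input object) file. The following is a minimal sketch in Python that builds such an object and serialises it as JSON, which CWL runners accept alongside YAML. The helper name `build_job` and every value passed to it (identifier, file paths, BUSCO dataset location, thread and memory numbers) are hypothetical placeholders, not values taken from the workflow.

```python
import json


def build_job(identifier, assembly, bam_file, busco_dataset,
              run_gtdbtk=False, threads=None, memory=None, step=None):
    """Build a CWL input object for the inputs listed in the table above."""
    job = {
        "identifier": identifier,
        # CWL File inputs are objects with a class and a path
        "assembly": {"class": "File", "path": assembly},
        "bam_file": {"class": "File", "path": bam_file},
        "run_gtdbtk": run_gtdbtk,
        "busco_dataset": busco_dataset,
    }
    # Optional inputs (type `int?`) are only written when a value is given
    for key, value in (("threads", threads), ("memory", memory), ("step", step)):
        if value is not None:
            job[key] = value
    return job


# Hypothetical example values, for illustration only
job = build_job(
    identifier="sample_01",
    assembly="/data/sample_01/assembly.fasta",
    bam_file="/data/sample_01/assembly.sorted.bam",
    busco_dataset="/data/busco_downloads",
    run_gtdbtk=True,
    threads=8,
    memory=16000,
)
job_json = json.dumps(job, indent=2)
print(job_json)
```

The printed JSON can be saved to a file and passed to a CWL runner as the job file next to the workflow's CWL file.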

Steps

ID | Name | Description
metabat2_contig_depths | contig depths | MetabatContigDepths to obtain the depth file used in the MetaBAT2 binning process
contig_read_counts | samtools idxstats | Reports alignment summary statistics per contig
assembly_read_counts | samtools flagstat | Reports alignment summary statistics for the whole assembly
metabat2 | MetaBAT2 binning | Binning procedure using MetaBAT2
aggregate_bin_depths | Depths per bin | Depths per bin
bins_stats | Bin assembly stats | Table of all bins and their assembly statistics like N50
bin_readstats | Bin and assembly read stats | Table of general bin and assembly read mapping stats
checkm | CheckM | CheckM bin quality assessment
busco | BUSCO | BUSCO assembly completeness workflow
merge_busco_summaries | Merge BUSCO summaries | n/a
gtdbtk | GTDB-Tk | Taxonomic assignment of bins with GTDB-Tk
compress_gtdbtk | Compress GTDB-Tk | Compress GTDB-Tk output folder
metabat_files_to_folder | MetaBAT2 output folder | Preparation of MetaBAT2 output files + unbinned contigs to a specific output folder
checkm_files_to_folder | CheckM output | Preparation of CheckM output files to a specific output folder
busco_files_to_folder | BUSCO output folder | Preparation of BUSCO output files to a specific output folder
gtdbtk_files_to_folder | GTDB-Tk output folder | Preparation of GTDB-Tk output files to a specific output folder
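
The bins_stats step reports assembly statistics such as N50. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 definition: the length of the contig at which the contigs, sorted from longest to shortest, first cover at least half of the total length. This is an illustration only, not the workflow's own implementation; the function name `n50` and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths for a single bin
bin_contig_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(bin_contig_lengths))
```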

Outputs

ID | Name | Description | Type
metabat2_output | MetaBAT2 | MetaBAT2 output directory | Directory
checkm_output | CheckM | CheckM output directory | Directory
busco_output | BUSCO | BUSCO output directory | Directory
gtdbtk_output | GTDB-Tk | GTDB-Tk output directory | Directory?
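
Because `gtdbtk_output` has the optional type `Directory?`, it can be absent from a run, presumably when `run_gtdbtk` is false. The sketch below shows one way to handle that when reading the output object that a CWL runner reports as JSON. The sample object, its locations, and the helper name `present_directories` are hypothetical.

```python
import json


def present_directories(outputs):
    """Map each output ID to its location, skipping outputs that are null."""
    return {
        output_id: value.get("location") or value.get("path")
        for output_id, value in outputs.items()
        if value is not None
    }


# Hypothetical output object from a run without GTDB-Tk
sample = json.loads("""
{
  "metabat2_output": {"class": "Directory", "location": "file:///tmp/example/metabat2"},
  "checkm_output": {"class": "Directory", "location": "file:///tmp/example/checkm"},
  "busco_output": {"class": "Directory", "location": "file:///tmp/example/busco"},
  "gtdbtk_output": null
}
""")

dirs = present_directories(sample)
for output_id, location in dirs.items():
    print(f"{output_id}: {location}")
```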

Version History

Version 11 (latest) Created 18th Oct 2021 at 10:49 by Jasper Koehorst

Added more binning and assembly reports


Open master 0047812

Version 10 Created 7th Jun 2021 at 18:34 by Jasper Koehorst

No revision comments

Frozen master c2519b1

Version 9 Created 1st Jun 2021 at 11:43 by Jasper Koehorst

No revision comments

Frozen master d6fcbfa

Version 8 Created 6th May 2021 at 07:03 by Jasper Koehorst

No revision comments

Frozen master 0660405

Version 7 Created 8th Jan 2021 at 10:15 by Jasper Koehorst

No revision comments

Frozen master f3919f2
Activity

Created: 15th Oct 2020 at 14:55

Last updated: 29th Apr 2022 at 09:45

Last used: 12th Aug 2022 at 02:42
