Workflow for Metagenomics from raw reads to annotated bins.
Summary
- MetaBAT2 (binning)
- CheckM (bin completeness and contamination)
- GTDB-Tk (bin taxonomic classification)
- BUSCO (bin completeness)
All tool CWL files and other workflows can be found here:
Tools: https://git.wur.nl/unlock/cwl/-/tree/master/cwl
Workflows: https://git.wur.nl/unlock/cwl/-/tree/master/cwl/workflows
The dependencies are available from https://unlock-icat.irods.surfsara.nl (anonymous, anonymous)
and/or
through the conda / pip environments shown in https://git.wur.nl/unlock/docker/-/blob/master/kubernetes/scripts/setup.sh
Inputs
ID | Name | Description | Type |
---|---|---|---|
identifier | Identifier used | Identifier for this dataset used in this workflow | |
assembly | Assembly fasta | Assembly in fasta format | |
bam_file | Bam file | Mapping file in sorted bam format containing reads mapped to the assembly | |
threads | number of threads | Number of threads to use for computational processes | |
memory | memory usage (mb) | Maximum memory usage in megabytes | |
run_gtdbtk | Run GTDB-Tk | Run GTDB-Tk taxonomic bin classification when true | |
busco_dataset | BUSCO dataset | Path to the BUSCO dataset download location | |
step | CWL base step number | Step number for order of steps | |
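As a rough illustration of how the inputs above could be supplied, the following sketch builds a job-input mapping and prints it as JSON, which CWL runners accept as a job file. The input types are not stated on this page, so the File objects for `assembly` and `bam_file`, the plain string for `busco_dataset`, the default values, and all example paths are assumptions; `build_job` is a hypothetical helper, not part of the workflow.

```python
import json


def build_job(identifier, assembly, bam_file, busco_dataset,
              threads=2, memory=4000, run_gtdbtk=False, step=1):
    """Return a CWL job-input mapping keyed by the input IDs listed above."""
    return {
        "identifier": identifier,
        # Assumption: file inputs are passed as CWL File objects.
        "assembly": {"class": "File", "path": assembly},
        "bam_file": {"class": "File", "path": bam_file},
        "threads": threads,
        "memory": memory,  # maximum memory in megabytes
        "run_gtdbtk": run_gtdbtk,
        # Assumption: the BUSCO dataset location is a plain string path.
        "busco_dataset": busco_dataset,
        "step": step,
    }


# Hypothetical example values; replace with real paths.
job = build_job(
    "sample1",
    "assembly.fasta",
    "mapped.sorted.bam",
    "/data/busco_downloads",
    threads=8,
    memory=16000,
    run_gtdbtk=True,
)
print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and passed to a CWL runner together with the workflow file; check the workflow's CWL definition for the actual input types before relying on this layout.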
Steps
ID | Name | Description |
---|---|---|
metabat2_contig_depths | contig depths | MetabatContigDepths to obtain the depth file used in the MetaBat2 binning process |
contig_read_counts | samtools idxstats | Reports per-contig mapped and unmapped read counts |
assembly_read_counts | samtools flagstat | Reports alignment summary statistics |
metabat2 | MetaBAT2 binning | Binning procedure using MetaBAT2 |
aggregate_bin_depths | Depths per bin | Depths per bin |
bins_stats | Bin assembly stats | Table of all bins and their assembly statistics like N50 |
bin_readstats | Bin and assembly read stats | Table of general bin and assembly read mapping stats |
checkm | CheckM | CheckM bin quality assessment |
busco | BUSCO | BUSCO assembly completeness workflow |
merge_busco_summaries | Merge BUSCO summaries | n/a |
gtdbtk | GTDB-Tk | Taxonomic assignment of bins with GTDB-Tk |
compress_gtdbtk | Compress GTDB-Tk | Compress GTDB-Tk output folder |
metabat_files_to_folder | MetaBat2 output folder | Preparation of MetaBat2 output files + unbinned contigs to a specific output folder |
checkm_files_to_folder | CheckM output | Preparation of CheckM output files to a specific output folder |
busco_files_to_folder | BUSCO output folder | Preparation of BUSCO output files to a specific output folder |
gtdbtk_files_to_folder | GTDB-Tk output folder | Preparation of GTDB-Tk output files to a specific output folder |
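The bins_stats step reports assembly statistics such as N50 for each bin. As a minimal sketch of what that statistic means, and not the workflow's own implementation, the code below computes N50 from contig lengths: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The helper names `n50` and `fasta_lengths` are hypothetical.

```python
def fasta_lengths(lines):
    """Yield the length of each sequence from an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Toy bin with two contigs of 6 bp and 2 bp.
toy_bin = [">contig_1", "ACGT", "AC", ">contig_2", "GG"]
print(n50(fasta_lengths(toy_bin)))
```

For a real bin, the lines of the bin's FASTA file can be passed to `fasta_lengths` directly, for example by iterating over an open file handle.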
Outputs
ID | Name | Description | Type |
---|---|---|---|
metabat2_output | MetaBAT2 | MetaBAT2 output directory | |
checkm_output | CheckM | CheckM output directory | |
busco_output | BUSCO | BUSCO output directory | |
gtdbtk_output | GTDB-Tk | GTDB-Tk output directory | |
Version History
Version 11 (latest): created 18th Oct 2021 at 10:49 by Jasper Koehorst (Open, master, 0047812). Added more binning and assembly reports.
Version 10: created 7th Jun 2021 at 18:34 by Jasper Koehorst (Frozen, master, c2519b1).
Version 9: created 1st Jun 2021 at 11:43 by Jasper Koehorst (Frozen, master, d6fcbfa).
Version 8: created 6th May 2021 at 07:03 by Jasper Koehorst (Frozen, master, 0660405).
Version 7: created 8th Jan 2021 at 10:15 by Jasper Koehorst (Frozen, master, f3919f2).

Created: 15th Oct 2020 at 14:55
Last updated: 29th Apr 2022 at 09:45
Last used: 12th Aug 2022 at 02:42