Workflow Type: Common Workflow Language
Work-in-progress
Workflow for Metagenomics binning from assembly
Minimal inputs are: an identifier, an assembly (fasta) and an associated sorted BAM file
Summary
- MetaBAT2 (binning)
- MaxBin2 (binning)
- SemiBin (binning)
- DAS Tool (bin merging)
- EukRep (eukaryotic classification)
- CheckM (bin completeness and contamination)
- BUSCO (bin completeness)
- GTDB-Tk (bin taxonomic classification)
Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default
All tool CWL files and other workflows can be found here:
Tools: https://gitlab.com/m-unlock/cwl
Workflows: https://gitlab.com/m-unlock/cwl/workflows
How to set up and use an UNLOCK workflow:
https://m-unlock.gitlab.io/docs/setup/setup.html
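As a rough illustration of what running a CWL workflow looks like, the sketch below assembles a command line for `cwltool`, the CWL reference runner. It is only an assumption that this runner is used here; the setup documentation linked above is authoritative and may prescribe a different runner or extra options. The workflow file name, job file name and output directory are hypothetical placeholders.

```python
import shlex


def cwltool_command(workflow_cwl, job_file, outdir=None):
    """Build a cwltool command line as a list of arguments.

    workflow_cwl and job_file are paths chosen by the user; the names
    used in the example below are hypothetical placeholders.
    """
    cmd = ["cwltool"]
    if outdir is not None:
        # --outdir tells cwltool where to place the final outputs.
        cmd += ["--outdir", outdir]
    cmd += [workflow_cwl, job_file]
    return cmd


# Hypothetical file names, for illustration only.
command = cwltool_command(
    "workflow_metagenomics_binning.cwl", "job.json", outdir="binning_results"
)
print(shlex.join(command))
```

The list form can be passed directly to `subprocess.run(command, check=True)` once the placeholder names are replaced with real paths.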
Inputs
ID | Name | Description | Type |
---|---|---|---|
identifier | Identifier used | Identifier for this dataset used in this workflow | |
assembly | Assembly fasta | Assembly in fasta format | |
bam_file | Bam file | Mapping file in sorted bam format containing reads mapped to the assembly | |
threads | Threads | Number of threads to use for computational processes | |
memory | Memory usage (MB) | Maximum memory usage in megabytes | |
gtdbtk_data | gtdbtk data directory | Directory containing the GTDB database. When none is given, GTDB-Tk will be skipped. | |
busco_data | BUSCO dataset | Directory containing the BUSCO dataset. | |
run_semibin | Run SemiBin | Run with SemiBin binner | |
semibin_environment | SemiBin Environment | SemiBin built-in models (human_gut/dog_gut/ocean/soil/cat_gut/human_oral/mouse_gut/pig_gut/built_environment/wastewater/global/chicken_caecum) | |
sub_workflow | Sub workflow Run | Use this when you need the output bins as File[] for subsequent analysis workflow steps in another workflow. | |
step | CWL base step number | Step number for order of steps | |
destination | Output destination (not used in the workflow itself) | Optional output destination path for cwl-prov reporting. | |
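The inputs above are normally supplied to a CWL runner as a job file in YAML or JSON. The following is a minimal sketch of building such a job object in Python. The Type column is empty in this listing, so the types used here are assumptions read off the descriptions: `assembly` and `bam_file` as CWL `File` objects, `gtdbtk_data` and `busco_data` as `Directory` objects, and `threads` and `memory` as integers. The default values and file names are hypothetical.

```python
import json


def cwl_file(path):
    """CWL input object for a single file."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """CWL input object for a directory."""
    return {"class": "Directory", "path": path}


def build_job(identifier, assembly, bam_file, threads=2, memory=4000,
              gtdbtk_data=None, busco_data=None,
              run_semibin=False, semibin_environment=None):
    """Assemble a job object; input types are assumed, not confirmed."""
    job = {
        "identifier": identifier,
        "assembly": cwl_file(assembly),
        "bam_file": cwl_file(bam_file),
        "threads": threads,    # hypothetical default
        "memory": memory,      # megabytes, hypothetical default
        "run_semibin": run_semibin,
    }
    # Optional inputs are left out entirely when not provided; per the
    # table, GTDB-Tk is skipped when no database directory is given.
    if gtdbtk_data is not None:
        job["gtdbtk_data"] = cwl_dir(gtdbtk_data)
    if busco_data is not None:
        job["busco_data"] = cwl_dir(busco_data)
    if semibin_environment is not None:
        job["semibin_environment"] = semibin_environment
    return job


# Hypothetical file names, for illustration only.
job = build_job("sample1", "assembly.fasta", "assembly.sorted.bam",
                threads=8, memory=16000)
print(json.dumps(job, indent=2))
```

Writing the printed JSON to a file gives a job file that can be passed to a CWL runner alongside the workflow, once the assumed types have been checked against the workflow definition.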
Steps
ID | Name | Description |
---|---|---|
metabat2_contig_depths | contig depths | MetabatContigDepths to obtain the depth file used in the MetaBAT2 and SemiBin binning process |
eukrep | EukRep | EukRep, eukaryotic sequence classification |
eukrep_stats | EukRep stats | EukRep fasta statistics |
metabat2 | MetaBAT2 binning | Binning procedure using MetaBAT2 |
metabat2_filter_bins | Keep MetaBAT2 genome bins | Only keep genome bin fasta files (exclude e.g. TooShort.fa) |
metabat2_contig2bin | MetaBAT2 to contig to bins | List the contigs and their corresponding bin. |
maxbin2 | MaxBin2 binning | Binning procedure using MaxBin2 |
maxbin2_to_folder | MaxBin2 bins to folder | Create folder with MaxBin2 bins |
maxbin2_contig2bin | MaxBin2 to contig to bins | List the contigs and their corresponding bin. |
semibin | SemiBin binning | Binning procedure using SemiBin |
semibin_contig2bin | SemiBin to contig to bins | List the contigs and their corresponding bin. |
das_tool | DAS Tool integrate predictions from multiple binning tools | DAS Tool |
das_tool_bins | Bin dir to files[] | DAS Tool bins folder to File array for further analysis |
remove_unbinned | Remove unbinned | Remove the unbinned fasta from the bin directory so that it is not analysed by subsequent tools. |
checkm | CheckM | CheckM bin quality assessment |
busco | BUSCO | BUSCO assembly completeness workflow |
gtdbtk | GTDB-Tk | Taxonomic assignment of bins with GTDB-Tk |
compress_gtdbtk | Compress GTDB-Tk | Compress GTDB-Tk output folder |
aggregate_bin_depths | Depths per bin | Depths per bin |
bins_summary | Bins summary | Table of all bins and their statistics like size, contigs, completeness etc |
bin_readstats | Bin and assembly read stats | Table of general bin and assembly read mapping statistics |
metabat2_files_to_folder | MetaBAT2 output folder | Preparation of MetaBAT2 output files + unbinned contigs to a specific output folder |
maxbin2_files_to_folder | MaxBin2 output folder | Preparation of maxbin2 output files to a specific output folder. |
semibin_files_to_folder | SemiBin output folder | Preparation of SemiBin output files to a specific output folder. |
das_tool_files_to_folder | DAS Tool output folder | Preparation of DAS Tool output files to a specific output folder. |
checkm_files_to_folder | CheckM output | Preparation of CheckM output files to a specific output folder |
busco_files_to_folder | BUSCO output folder | Preparation of BUSCO output files to a specific output folder |
gtdbtk_files_to_folder | GTDB-Tk output folder | Preparation of GTDB-Tk output files to a specific output folder |
output_bin_files | Bin files | Bin files for subsequent workflow runs when sub_workflow = true |
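The `*_contig2bin` steps each list contigs with their corresponding bin, which is the kind of table DAS Tool takes as input (tab-separated contig ID and bin ID). The sketch below shows one way such a table could be derived from a directory of bin fasta files. It is an illustration of the idea, not the workflow's own tool; the function names and the `.fa` extension are assumptions.

```python
from pathlib import Path


def contig2bin_rows(bin_dir, extension=".fa"):
    """Return (contig_id, bin_name) pairs for every fasta file in bin_dir.

    The bin name is the file name without the extension; the contig ID
    is the first whitespace-delimited token of each fasta header.
    """
    rows = []
    for fasta in sorted(Path(bin_dir).glob(f"*{extension}")):
        bin_name = fasta.name[: -len(extension)]
        with fasta.open() as handle:
            for line in handle:
                if line.startswith(">"):
                    fields = line[1:].split()
                    if fields:  # skip malformed, empty headers
                        rows.append((fields[0], bin_name))
    return rows


def write_contig2bin(rows, out_path):
    """Write the pairs as a tab-separated table, one contig per line."""
    with open(out_path, "w") as out:
        for contig_id, bin_name in rows:
            out.write(f"{contig_id}\t{bin_name}\n")
```

For example, `write_contig2bin(contig2bin_rows("metabat2_bins"), "metabat2_contig2bin.tsv")` would produce one table per binner, where both names are hypothetical.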
Outputs
ID | Name | Description | Type |
---|---|---|---|
bins | Bin files | Bin files in fasta format. To be used in other workflows. | |
metabat2_output | MetaBAT2 | MetaBAT2 output directory | |
maxbin2_output | MaxBin2 | MaxBin2 output directory | |
semibin_output | SemiBin | SemiBin output directory | |
das_tool_output | DAS Tool | DAS Tool output directory | |
checkm_output | CheckM | CheckM output directory | |
busco_output | BUSCO | BUSCO output directory | |
gtdbtk_output | GTDB-Tk | GTDB-Tk output directory | |
bins_summary_table | Bins summary | Summary of info about the bins | |
bins_read_stats | Assembly/Bin read stats | General assembly and bin coverage | |
eukrep_fasta | EukRep fasta | EukRep eukaryotic classified contigs | |
eukrep_stats_file | EukRep stats | EukRep fasta statistics | |
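The `bins` output is a set of fasta files, and the bins summary reports per-bin statistics such as size and number of contigs. As a minimal sketch of how those two values can be computed from a bin fasta file (not the workflow's own implementation, and with hypothetical function names):

```python
from pathlib import Path


def bin_stats(fasta_path):
    """Count contigs and total sequence length for one bin fasta file."""
    contigs = 0
    size = 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                contigs += 1       # each header starts a new contig
            else:
                size += len(line)  # sequence lines add to the bin size
    return {"bin": Path(fasta_path).name, "contigs": contigs, "size": size}


def summarize_bins(fasta_paths):
    """Return one stats record per bin, ordered by path."""
    return [bin_stats(path) for path in sorted(fasta_paths)]
```

Completeness and contamination values come from CheckM and BUSCO in this workflow and are not reproduced by this sketch.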
Version History
Version 11 (latest), created 18th Oct 2021 at 10:49 by Jasper Koehorst. Added more binning and assembly reports. Open, master, d4c912c
Version 10, created 7th Jun 2021 at 18:34 by Jasper Koehorst. No revision comments. Frozen, master, c2519b1
Version 9, created 1st Jun 2021 at 11:43 by Jasper Koehorst. No revision comments. Frozen, master, d6fcbfa
Version 8, created 6th May 2021 at 07:03 by Jasper Koehorst. No revision comments. Frozen, master, 0660405
Version 7, created 8th Jan 2021 at 10:15 by Jasper Koehorst. No revision comments. Frozen, master, f3919f2
Activity
Views: 10006 Downloads: 1086
Created: 15th Oct 2020 at 14:55
Last updated: 2nd Nov 2022 at 15:29