MGnify - raw-reads analysis pipeline
Version 1

Workflow Type: Common Workflow Language
Stable

MGnify (http://www.ebi.ac.uk/metagenomics) provides a free-to-use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past two years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored to the input data and formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.

Documentation: https://docs.mgnify.org/en/latest/analysis.html#raw-reads-analysis-pipeline

Inputs

ID Name Description Type
single_reads n/a n/a File?
forward_reads n/a n/a File?
reverse_reads n/a n/a File?
qc_min_length n/a n/a int
ssu_db n/a n/a File
lsu_db n/a n/a File
ssu_tax n/a n/a string
lsu_tax n/a n/a string
ssu_otus n/a n/a string
lsu_otus n/a n/a string
rfam_models n/a n/a string[]
rfam_model_clans n/a n/a string
other_ncRNA_models n/a n/a string[]
ssu_label n/a n/a string
lsu_label n/a n/a string
5s_pattern n/a n/a string
5.8s_pattern n/a n/a string
CGC_config n/a n/a string
CGC_postfixes n/a n/a string[]
cgc_chunk_size n/a n/a int
protein_chunk_size_hmm n/a n/a int
protein_chunk_size_IPS n/a n/a int
func_ann_names_ips n/a n/a string
func_ann_names_hmmer n/a n/a string
HMM_gathering_bit_score n/a n/a boolean
HMM_omit_alignment n/a n/a boolean
HMM_name_database n/a n/a string
hmmsearch_header n/a n/a string
EggNOG_db n/a n/a string?
EggNOG_diamond_db n/a n/a string?
EggNOG_data_dir n/a n/a string?
InterProScan_databases n/a n/a string
InterProScan_applications n/a n/a string[]
InterProScan_outputFormat n/a n/a string[]
ips_header n/a n/a string
ko_file n/a n/a string
go_config n/a n/a string
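
Because the workflow declares 37 typed inputs, a job file is easy to get wrong. The following is a minimal sketch, not part of the MGnify pipeline, of a pre-flight check that compares a job object against the input IDs and types listed above. The function names check_job and _matches are hypothetical. The sketch assumes the usual CWL job-file conventions: a File value is an object with "class": "File", a trailing "?" marks an optional input, and "[]" marks an array.

```python
# Declared input types, copied from the Inputs table of this workflow.
INPUT_TYPES = {
    "single_reads": "File?",
    "forward_reads": "File?",
    "reverse_reads": "File?",
    "qc_min_length": "int",
    "ssu_db": "File",
    "lsu_db": "File",
    "ssu_tax": "string",
    "lsu_tax": "string",
    "ssu_otus": "string",
    "lsu_otus": "string",
    "rfam_models": "string[]",
    "rfam_model_clans": "string",
    "other_ncRNA_models": "string[]",
    "ssu_label": "string",
    "lsu_label": "string",
    "5s_pattern": "string",
    "5.8s_pattern": "string",
    "CGC_config": "string",
    "CGC_postfixes": "string[]",
    "cgc_chunk_size": "int",
    "protein_chunk_size_hmm": "int",
    "protein_chunk_size_IPS": "int",
    "func_ann_names_ips": "string",
    "func_ann_names_hmmer": "string",
    "HMM_gathering_bit_score": "boolean",
    "HMM_omit_alignment": "boolean",
    "HMM_name_database": "string",
    "hmmsearch_header": "string",
    "EggNOG_db": "string?",
    "EggNOG_diamond_db": "string?",
    "EggNOG_data_dir": "string?",
    "InterProScan_databases": "string",
    "InterProScan_applications": "string[]",
    "InterProScan_outputFormat": "string[]",
    "ips_header": "string",
    "ko_file": "string",
    "go_config": "string",
}


def _matches(value, base):
    """Return True if a single value fits a non-array CWL base type."""
    if base == "File":
        # Assumption: job files describe files as {"class": "File", ...}.
        return isinstance(value, dict) and value.get("class") == "File"
    if base == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if base == "boolean":
        return isinstance(value, bool)
    if base == "string":
        return isinstance(value, str)
    return False


def check_job(job):
    """Return a list of human-readable problems found in a job object."""
    problems = []
    for name, declared in INPUT_TYPES.items():
        optional = declared.endswith("?")
        base = declared.rstrip("?")
        value = job.get(name)
        if value is None:
            if not optional:
                problems.append(f"{name}: required input is missing")
            continue
        if base.endswith("[]"):
            item = base[:-2]
            ok = isinstance(value, list) and all(_matches(v, item) for v in value)
        else:
            ok = _matches(value, base)
        if not ok:
            problems.append(f"{name}: expected {declared}")
    for name in job:
        if name not in INPUT_TYPES:
            problems.append(f"{name}: not a declared input")
    return problems
```

Calling check_job on a parsed job file returns an empty list when every required input is present and every supplied value has the declared type. The check covers types only: it does not verify that files exist or that string values point to valid databases or configuration files.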

Steps

ID Name Description
before-qc n/a n/a
after-qc n/a n/a
touch_file_flag n/a n/a
touch_no_cds_flag n/a n/a

Outputs

ID Name Description Type
qc-statistics n/a n/a Directory
qc_summary n/a n/a File
qc-status n/a n/a File
hashsum_paired n/a n/a File[]?
hashsum_single n/a n/a File?
fastp_filtering_json_report n/a n/a File?
sequence-categorisation_folder n/a n/a Directory?
taxonomy-summary_folder n/a n/a Directory?
rna-count n/a n/a File?
motus_output n/a n/a File?
compressed_files n/a n/a File[]
functional_annotation_folder n/a n/a Directory?
stats n/a n/a Directory?
chunking_nucleotides n/a n/a File[]?
chunking_proteins n/a n/a File[]?
completed_flag_file n/a n/a File?
no_cds_flag_file n/a n/a File?
no_tax_flag_file n/a n/a File?
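
Most of the outputs above are optional (their type ends in "?"), so a finished run may legitimately omit them. The sketch below is an illustration rather than part of the pipeline. It assumes the CWL runner's output object has been saved as JSON, as cwltool prints it to standard output, and that outputs which were not produced appear as null. The helper names load_output_object and split_optional_outputs are hypothetical.

```python
import json

# Optional outputs, copied from the Outputs table of this workflow.
OPTIONAL_OUTPUTS = [
    "hashsum_paired",
    "hashsum_single",
    "fastp_filtering_json_report",
    "sequence-categorisation_folder",
    "taxonomy-summary_folder",
    "rna-count",
    "motus_output",
    "functional_annotation_folder",
    "stats",
    "chunking_nucleotides",
    "chunking_proteins",
    "completed_flag_file",
    "no_cds_flag_file",
    "no_tax_flag_file",
]


def load_output_object(path):
    """Read a saved CWL output object (JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def split_optional_outputs(output_object):
    """Split the optional outputs into (produced, absent) name lists."""
    produced = [n for n in OPTIONAL_OUTPUTS if output_object.get(n) is not None]
    absent = [n for n in OPTIONAL_OUTPUTS if output_object.get(n) is None]
    return produced, absent
```

For example, split_optional_outputs(load_output_object("outputs.json")), where outputs.json is a hypothetical file name, reports which of the optional results and flag files a given run actually produced.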

Version History

v5.0.7 (earliest) Created 7th Jun 2022 at 09:40 by Martin Beracochea

Fix collect_scripts.py


Frozen v5.0.7 981aafc
Additional credit

Alex L Mitchell, Alexandre Almeida, Martin Beracochea, Miguel Boland, Josephine Burgin, Guy Cochrane, Michael R Crusoe, Varsha Kale, Simon C Potter, Lorna J Richardson, Ekaterina Sakharova, Maxim Scheremetjew, Anton Korobeynikov, Alex Shlemov, Olga Kunyavskaya, Alla Lapidus, Robert D Finn

Citation
Sakharova, E., Kale, V., & Beracochea, M. (2022). MGnify - raw-reads analysis pipeline. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.362.1

Total size: 367 MB