MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
Version 1

Workflow Type: Nextflow

Assembly and quantification of metatranscriptomes using metagenome data.

Version: see VERSION

Introduction

MetaGT is a bioinformatics pipeline for improving and quantifying metatranscriptome assemblies using metagenome data. The pipeline supports Illumina sequencing data as well as complete metagenome and metatranscriptome assemblies. It aligns the metatranscriptome assembly to the metagenome assembly and then extracts the CDSs that are covered by transcripts.
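The idea of a CDS being "covered by transcripts" can be illustrated with a small interval calculation. The sketch below is not MetaGT's own code: the function names, the half-open coordinate convention, and the `min_fraction` threshold are assumptions made for illustration only. It computes the fraction of a CDS that lies under the union of transcript alignments on the same contig and keeps the CDSs above the threshold.

```python
from typing import Dict, List, Tuple

# Half-open interval [start, end) on a single contig (an assumed convention).
Interval = Tuple[int, int]


def covered_fraction(cds: Interval, alignments: List[Interval]) -> float:
    """Return the fraction of the CDS covered by the union of alignments."""
    cds_start, cds_end = cds
    if cds_end <= cds_start:
        raise ValueError("CDS interval must have positive length")

    # Clip each overlapping alignment to the CDS boundaries and sort by start.
    clipped = sorted(
        (max(start, cds_start), min(end, cds_end))
        for start, end in alignments
        if start < cds_end and end > cds_start
    )

    covered = 0
    current_end = cds_start
    for start, end in clipped:
        if end <= current_end:
            continue  # entirely inside a region that was already counted
        covered += end - max(start, current_end)
        current_end = end
    return covered / (cds_end - cds_start)


def select_covered(
    cdss: Dict[str, Interval],
    alignments: List[Interval],
    min_fraction: float = 0.5,  # hypothetical threshold, not a MetaGT default
) -> List[str]:
    """Return names of CDSs whose covered fraction reaches min_fraction."""
    return [
        name
        for name, interval in cdss.items()
        if covered_fraction(interval, alignments) >= min_fraction
    ]
```

For example, a CDS at positions 100 to 200 with alignments at 50 to 150 and 140 to 180 has 80 of its 100 bases covered, so it would be kept at the assumed threshold of 0.5.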

The pipeline is built using Nextflow, a workflow tool for running tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, which make installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process, which makes it much easier to maintain and update software dependencies.

Quick Start

  1. Install Nextflow

  2. Install Conda for full pipeline reproducibility

  3. Download the pipeline, e.g. by cloning the MetaGT GitHub repository:

    git clone git@github.com:ablab/metaGT.git
    
  4. Test it on a minimal dataset by running:

    nextflow run metaGT -profile test,conda
    
  5. Start running your own analysis!

    Typical command for analysis using reads:

    nextflow run metaGT -profile conda --dna_reads '*_R{1,2}.fastq.gz' --rna_reads '*_R{1,2}.fastq.gz'
    

    Typical command for analysis using multiple files with reads:

    nextflow run metaGT -profile conda --dna_reads '*.yaml' --rna_reads '*.yaml' --yaml
    

    Typical command for analysis using assemblies:

    nextflow run metaGT -profile conda --genome '*.fasta' --transcriptome '*.fasta'
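The read patterns in the commands above are quoted so that the shell passes them to the pipeline unexpanded. To see which files a pattern such as `*_R{1,2}.fastq.gz` is meant to match, the following sketch groups file names into forward/reverse pairs by their shared sample prefix. How MetaGT groups files internally is not described here, so the function name `pair_reads` and the regular expression are assumptions for illustration only.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches names like "sampleA_R1.fastq.gz" and captures the sample prefix
# and the mate number (1 or 2).
_PAIR_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<mate>[12])\.fastq\.gz$")


def pair_reads(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group file names into [R1, R2] pairs keyed by sample prefix.

    Files that do not match the pattern, and samples missing one mate,
    are left out of the result.
    """
    mates_by_sample: Dict[str, Dict[str, str]] = defaultdict(dict)
    for name in filenames:
        match = _PAIR_PATTERN.match(name)
        if match:
            mates_by_sample[match.group("sample")][match.group("mate")] = name

    return {
        sample: [mates["1"], mates["2"]]
        for sample, mates in sorted(mates_by_sample.items())
        if set(mates) == {"1", "2"}
    }
```

Running this on a directory listing before launching the pipeline is a quick way to spot samples with a missing mate file.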
    

Pipeline Summary

Optionally, if raw reads are used:

  • Sequencing quality control (FastQC)
  • Metagenome or metatranscriptome assembly (metaSPAdes, rnaSPAdes)

By default, the pipeline currently performs the following:

  • Annotation of the metagenome (Prokka)
  • Alignment of the metatranscriptome to the metagenome (minimap2)
  • Annotation of unaligned transcripts (TransDecoder)
  • Clustering of covered CDSs and CDSs from unaligned transcripts (MMseqs2)
  • Quantification of transcript abundances (kallisto)
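The quantification step reports transcript abundances, which are commonly expressed in transcripts per million (TPM). The sketch below shows only the standard TPM normalization from estimated counts and effective lengths. It does not reproduce how kallisto estimates those counts, and the function name and input dictionaries are hypothetical.

```python
from typing import Dict


def tpm(counts: Dict[str, float], eff_lengths: Dict[str, float]) -> Dict[str, float]:
    """Convert estimated counts to TPM using per-transcript effective lengths."""
    # Length-normalized rate for each transcript.
    rates = {name: counts[name] / eff_lengths[name] for name in counts}
    total = sum(rates.values())
    if total == 0:
        # No reads assigned anywhere: report zero abundance for all transcripts.
        return {name: 0.0 for name in counts}
    # Scale the rates so that they sum to one million.
    return {name: rate / total * 1e6 for name, rate in rates.items()}
```

Because TPM values always sum to one million within a sample, they describe relative abundance and are not directly comparable to raw counts.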

Citation

MetaGT was developed by Daria Shafranskaya and Andrey Prjibelski. If you use it in your research, please cite:

MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data

Feedback and bug reports

If you have any questions, please open an issue on our GitHub page.

Version History

main @ 70395bb (earliest) Created 12th Apr 2023 at 10:18 by Varsha Kale

fix conda

