Workflow (hybrid) metagenomic assembly and binning + GEMs
Accepts both Illumina and Long reads (ONT/PacBio)
- Workflow Illumina Quality: https://workflowhub.eu/workflows/336?version=1
- Workflow LongRead Quality: https://workflowhub.eu/workflows/337
- Kraken2 taxonomic classification of FASTQ reads
- SPAdes/Flye (Assembly)
- QUAST (Assembly quality report)
- Workflow binning (optional): https://workflowhub.eu/workflows/64?version=11
  - Metabat2/MaxBin2/SemiBin
  - DAS Tool
  - CheckM ...
Workflow for Illumina Quality Control and Filtering
Multiple paired datasets will be merged into a single paired dataset.
Summary:
- FastQC on raw data files
- fastp for read quality trimming
- BBDuk for phiX and (optional) rRNA filtering
- Kraken2 for taxonomic classification of reads (optional)
- BBMap for (contamination) filtering using given references (optional)
- FastQC on filtered (merged) data
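The merge of multiple paired datasets described above can be illustrated with a minimal Python sketch. This is not the workflow's own implementation; the function names `merge_paired_fastq` and `count_reads` and all file paths are hypothetical. It assumes uncompressed FASTQ files that end with a newline, and it relies on both mates being appended in the same order so that read pairing is preserved.

```python
import shutil
from pathlib import Path


def merge_paired_fastq(pairs, out_r1, out_r2):
    """Concatenate several (R1, R2) FASTQ pairs into one merged pair.

    Hypothetical helper: the forward and reverse files of each input pair
    are appended in the same order, so read i in the merged R1 file still
    matches read i in the merged R2 file.
    """
    with open(out_r1, "wb") as fwd, open(out_r2, "wb") as rev:
        for r1, r2 in pairs:
            for src, dst in ((r1, fwd), (r2, rev)):
                # Byte-wise copy; assumes each input ends with a newline.
                with open(src, "rb") as handle:
                    shutil.copyfileobj(handle, dst)
    return Path(out_r1), Path(out_r2)


def count_reads(path):
    """Count reads in an uncompressed FASTQ file with 4-line records."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle) // 4
```

A quick sanity check after merging is that `count_reads` returns the same number for both output files; a mismatch would indicate that the inputs were not properly paired.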
Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default ...
Bootstrapping-for-BQSR @ NCI-Gadi is a pipeline for bootstrapping a variant resource to enable GATK base quality score recalibration (BQSR) for non-model organisms that lack a publicly available variant resource. This implementation is optimised for the National Computational Infrastructure's Gadi HPC. Multiple rounds of bootstrapping can be performed. Users can use Fastq-to-bam @ NCI-Gadi and Germline-ShortV @ NCI-Gadi to ...
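The idea of repeated bootstrapping rounds can be sketched as a simple control loop. The Python below is an illustration only, not the pipeline's code (which is a set of shell scripts for Gadi). It assumes the general bootstrapping pattern: call variants on unrecalibrated reads, use them as known sites for recalibration, call again, and stop once successive call sets agree. The names `bootstrap_known_sites`, `call_variants` and `recalibrate`, the Jaccard-based stopping rule and the default threshold are all hypothetical; the two callables stand in for whatever variant-calling and BQSR steps are actually run.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def bootstrap_known_sites(bam, call_variants, recalibrate,
                          max_rounds=3, threshold=0.99):
    """Iteratively refine a known-sites set for BQSR (illustrative sketch).

    call_variants(bam) -> set of variant sites   (hypothetical callable)
    recalibrate(bam, known_sites) -> new bam     (hypothetical callable)
    """
    # Round 0: call variants on the unrecalibrated alignments.
    known_sites = call_variants(bam)
    rounds_run = 0
    for rounds_run in range(1, max_rounds + 1):
        # Recalibrate the original alignments with the current site set.
        recal_bam = recalibrate(bam, known_sites)
        new_sites = call_variants(recal_bam)
        # Assumed stopping rule: successive call sets are nearly identical.
        converged = jaccard(known_sites, new_sites) >= threshold
        known_sites = new_sites
        if converged:
            break
    return known_sites, rounds_run
```

Passing the calling and recalibration steps in as functions keeps the loop independent of any particular tool invocation; in practice each would wrap a job submission rather than run in-process.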
RNASeq-DE @ NCI-Gadi processes RNA sequencing data (single, paired and/or multiplexed) for differential expression (raw FASTQ to counts). This pipeline consists of multiple stages and is designed for the National Computational Infrastructure's (NCI) Gadi supercomputer, leveraging multiple nodes to run each stage in parallel.
Infrastructure_deployment_metadata: Gadi (NCI)
Flashlite-Trinity contains two workflows that run Trinity on the University of Queensland's HPC, Flashlite. Trinity performs de novo transcriptome assembly of RNA-seq data by combining three independent software modules (Inchworm, Chrysalis and Butterfly) to process RNA-seq reads. The algorithm can detect isoforms and handle paired-end reads, multiple insert sizes and strandedness. Users can run Flashlite-Trinity on single samples, or smaller samples requiring <500Gb ...
Type: Shell Script
Creators: Tracy Chew, Rosemarie Sadsad, Georgina Samaha, Cali Willet
Submitter: Tracy Chew