RNASeq-DE @ NCI-Gadi processes RNA sequencing data (single, paired and/or multiplexed) for differential expression, from raw FASTQ files to counts. The pipeline consists of multiple stages and is designed for the National Computational Infrastructure's (NCI) Gadi supercomputer, using multiple nodes to run each stage in parallel.
Infrastructure_deployment_metadata: Gadi (NCI)
Description: Trinity @ NCI-Gadi contains a staged Trinity workflow that can be run on the National Computational Infrastructure’s (NCI) Gadi supercomputer. Trinity performs de novo transcriptome assembly of RNA-seq data by combining three independent software modules (Inchworm, Chrysalis and Butterfly) to process RNA-seq reads. The algorithm can detect isoforms and handle paired-end reads, multiple insert sizes and strandedness. ...
A port of the Trinity RNA assembly pipeline (https://trinityrnaseq.github.io) that uses Nextflow to handle the underlying sub-tasks. This enables additional capabilities that make better use of HPC resources, such as packing tasks to fill up nodes and using node-local disks to improve I/O. By design, the pipeline separates the workflow logic (main file) from the cluster-specific configuration (config files), which improves portability.
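To make the idea of packing tasks to fill up nodes concrete, here is a minimal Python sketch. It is not taken from the pipeline and does not describe how Nextflow schedules work; it assumes a simple first-fit decreasing heuristic, and the function name `pack_tasks`, the task names and the core counts are all hypothetical.

```python
from typing import Dict, List


def pack_tasks(task_cpus: Dict[str, int], cores_per_node: int) -> List[List[str]]:
    """Group tasks onto nodes with first-fit decreasing bin packing.

    Illustrative only: a real scheduler also weighs memory, walltime and I/O.
    """
    nodes: List[List[str]] = []  # task names assigned to each node
    free: List[int] = []         # free cores remaining on each node

    # Place the largest tasks first; ties are broken by name for determinism.
    for name, cpus in sorted(task_cpus.items(), key=lambda kv: (-kv[1], kv[0])):
        if cpus > cores_per_node:
            raise ValueError(f"{name} needs {cpus} cores, node has {cores_per_node}")
        for i, spare in enumerate(free):
            if cpus <= spare:
                # First node with enough spare cores takes the task.
                nodes[i].append(name)
                free[i] -= cpus
                break
        else:
            # No existing node fits, so open a new one.
            nodes.append([name])
            free.append(cores_per_node - cpus)
    return nodes


if __name__ == "__main__":
    # Hypothetical sub-tasks and core requests; 48 cores per node is an assumption.
    demo = {"a": 24, "b": 24, "c": 16, "d": 16, "e": 16, "f": 8}
    for index, tasks in enumerate(pack_tasks(demo, cores_per_node=48)):
        print(index, tasks)
```

Sorting the largest tasks first tends to leave fewer partially filled nodes than placing tasks in arrival order, which is the motivation for packing in the first place.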
Based on a pipeline by Sydney Informatics Hub: ...
Workflow for spliced RNA-seq data. Steps:
- FastQC (Read Quality Control)
- fastp (Read Trimming)
- STAR (Read mapping)
- featureCounts (transcript read counts)
- kallisto (transcript [pseudo]counts)
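
The steps above can be sketched as a sequence of command lines. The following Python snippet is an illustration only, not the workflow's actual definition: it assumes paired-end gzipped reads, prebuilt STAR and kallisto indexes, and a GTF annotation, and every file name, the `build_commands` helper and the thread count are hypothetical. It only builds the commands; nothing is executed.

```python
from typing import List


def build_commands(sample: str, r1: str, r2: str, star_index: str,
                   kallisto_index: str, gtf: str, threads: int = 8) -> List[List[str]]:
    """Return one argument list per step, in the order of the list above."""
    # Hypothetical output names for the trimmed reads.
    t1 = f"{sample}_R1.trimmed.fastq.gz"
    t2 = f"{sample}_R2.trimmed.fastq.gz"
    # STAR appends this suffix to --outFileNamePrefix for coordinate-sorted BAM output.
    bam = f"{sample}.Aligned.sortedByCoord.out.bam"
    n = str(threads)
    return [
        # Read quality control (the output directory must already exist).
        ["fastqc", "--threads", n, "--outdir", "qc", r1, r2],
        # Read trimming.
        ["fastp", "--in1", r1, "--in2", r2, "--out1", t1, "--out2", t2,
         "--thread", n],
        # Spliced read mapping of the trimmed reads.
        ["STAR", "--runThreadN", n, "--genomeDir", star_index,
         "--readFilesIn", t1, t2, "--readFilesCommand", "zcat",
         "--outSAMtype", "BAM", "SortedByCoordinate",
         "--outFileNamePrefix", f"{sample}."],
        # Read counting from the alignments (-p marks the input as paired-end).
        ["featureCounts", "-p", "-T", n, "-a", gtf,
         "-o", f"{sample}.counts.txt", bam],
        # Pseudoalignment-based quantification directly from the trimmed reads.
        ["kallisto", "quant", "-i", kallisto_index, "-o", f"{sample}_kallisto",
         "-t", n, t1, t2],
    ]


if __name__ == "__main__":
    for cmd in build_commands("s1", "s1_R1.fastq.gz", "s1_R2.fastq.gz",
                              "star_index", "transcripts.idx", "genes.gtf"):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Note that kallisto works from the trimmed reads rather than from the STAR alignments, so the last two steps are independent of each other.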