ONTViSc (ONT-based Viral Screening for Biosecurity)

Introduction

eresearchqut/ontvisc is a Nextflow-based bioinformatics pipeline designed to support the diagnosis of virus and viroid pathogens for biosecurity. It takes as input fastq files generated by either amplicon or whole-genome sequencing on Oxford Nanopore Technologies platforms.

The pipeline can: 1) perform a direct search on the sequenced reads, 2) cluster the reads, 3) assemble the reads into longer contigs, or 4) map reads directly to a known reference.

Reads originating from a plant host can optionally be filtered out before downstream analysis.
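
The host-filtering idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the pipeline's implementation (the overview below lists Minimap2 and seqtk for this step). The names `FastqRecord`, `parse_fastq`, `read_id` and `remove_host_reads` are hypothetical, the example reads are made up, and the set of host-aligned read names is assumed to come from a separate alignment step.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str       # header line without the leading "@"
    sequence: str
    quality: str


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield records from simple four-line-per-read FASTQ text."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        header, sequence, _separator, quality = lines[i:i + 4]
        yield FastqRecord(header[1:], sequence, quality)


def read_id(record: FastqRecord) -> str:
    """Return the read name trimmed after the first whitespace."""
    return record.name.split()[0]


def remove_host_reads(
    records: Iterable[FastqRecord], host_read_ids: Iterable[str]
) -> Iterator[FastqRecord]:
    """Keep only the reads whose name is not in the host-aligned set."""
    host_ids = set(host_read_ids)
    for record in records:
        if read_id(record) not in host_ids:
            yield record


# Made-up example data: three reads, one of which aligned to the host.
EXAMPLE_FASTQ = """\
@read1 runid=abc
ACGTACGT
+
IIIIIIII
@read2 runid=abc
TTTTCCCC
+
IIIIIIII
@read3 runid=abc
GGGGAAAA
+
IIIIIIII
"""

kept = list(remove_host_reads(parse_fastq(EXAMPLE_FASTQ), {"read2"}))
print([read_id(record) for record in kept])
```

Matching on the trimmed read name mirrors the preprocessing step that cuts read names at the first whitespace, so that names agree between the fastq file and the alignment output.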

Pipeline overview

Figure: pipeline diagram

Data quality check (QC) and preprocessing
- Merge fastq files (optional)
- Raw fastq file QC (NanoPlot)
- Trim adaptors (Porechop_ABI, optional)
- Filter reads based on length and/or quality (Chopper, optional)
- Reformat fastq files so read names are trimmed after the first whitespace (BBMap)
- Processed fastq file QC, if Porechop_ABI and/or Chopper is run (NanoPlot)

Host read filtering
- Align reads to the provided host reference (Minimap2)
- Extract reads that do not align for downstream analysis (seqtk)

QC report
- Derive read counts recovered before and after data processing and after host filtering

Read classification analysis mode

Clustering mode
- Read clustering (RATTLE)
- Convert fastq to fasta format (seqtk)
- Cluster scaffolding (CAP3)
- MegaBLAST homology search against NCBI or a custom database (BLAST)
- Derive top candidate viral hits

De novo assembly mode
- De novo assembly (Canu or Flye)
- MegaBLAST homology search against NCBI, a custom database or a reference (BLAST)
- Derive top candidate viral hits

Read classification mode
- Option 1: Nucleotide-based taxonomic classification of reads (Kraken2, Bracken)
- Option 2: Protein-based taxonomic classification of reads (Kaiju, Krona)
- Option 3: Convert fastq to fasta format (seqtk) and perform a direct homology search using MegaBLAST (BLAST)

Map to reference mode
- Align reads to a reference fasta file (Minimap2) and derive a bam file and alignment statistics (Samtools)
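
The clustering and de novo assembly modes both end by deriving top candidate hits from a homology search. The sketch below shows one simple way to do this, assuming the search results are in BLAST's default 12-column tabular format (`-outfmt 6`) and that "top" means the highest bit score per query. The criteria the pipeline actually applies are not described here. The names `BlastHit` and `top_hit_per_query` are hypothetical, and the query and subject names in the example are made up.

```python
import csv
import io
from typing import Dict, NamedTuple


class BlastHit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    evalue: float
    bitscore: float


def top_hit_per_query(tabular_text: str) -> Dict[str, BlastHit]:
    """Return the highest-bitscore hit for each query sequence.

    Expects the default tabular columns: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best: Dict[str, BlastHit] = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = BlastHit(
            query=row[0],
            subject=row[1],
            percent_identity=float(row[2]),
            evalue=float(row[10]),
            bitscore=float(row[11]),
        )
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


# Made-up example: two hits for contig_1 and one for contig_2.
EXAMPLE_HITS = (
    "contig_1\tvirus_A\t98.50\t1200\t18\t0\t1\t1200\t1\t1200\t0.0\t2100\n"
    "contig_1\tvirus_B\t91.20\t800\t70\t2\t1\t800\t5\t804\t1e-120\t950\n"
    "contig_2\tvirus_C\t88.00\t500\t60\t1\t1\t500\t10\t509\t3e-90\t600\n"
)

top_hits = top_hit_per_query(EXAMPLE_HITS)
for query, hit in top_hits.items():
    print(query, hit.subject, hit.bitscore)
```

Keeping a single best hit per query is the simplest possible rule. A real screening workflow would likely also consider identity and coverage thresholds before reporting a candidate.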

Detailed instructions can be found in the wiki.

To do

Derive consensus sequence in blast2ref mode
Finalise output section wiki documentation

Authors

Marie-Emilie Gauthier
Craig Windell
Magdalena Antczak
Roberto Barrero

Version History

main @ 2274c83 (latest) Created 18th Dec 2024 at 04:26 by Magdalena Antczak

update pipeline figure

Frozen main 2274c83

v1.3 Created 18th Dec 2024 at 04:23 by Magdalena Antczak

update test command

Frozen v1.3 049bd72

main @ d333445 (earliest) Created 4th Dec 2023 at 01:42 by Magdalena Antczak

update conditions in preprocessing steps

Frozen main d333445
