COVID-19 sequence analysis on Illumina Amplicon PE data
This workflow implements an iVar-based analysis similar to the one in ncov2019-artic-nf, covid-19-signal and the Theiagen Titan workflow. These workflows (written in Nextflow, Snakemake and WDL, respectively) are widely used in COG UK, CanCOGeN and some US state public health laboratories.
This workflow is also the subject of a Galaxy Training Network tutorial (currently a work in progress).
This workflow differs from a related workflow in that it does not use lofreq; it is aimed at rapid analysis of majority variants and lineage/clade assignment with pangolin and nextclade.
TODO:
- Add support for QC using negative and positive controls
- Integrate with phylogeny tools including IQTree and UShER (and possibly more).
Inputs
| ID | Name | Description | Type |
|---|---|---|---|
| Paired read collection for samples | Paired read collection for samples | FASTQ format Illumina Reads (Amplicon Protocol) | n/a |
| Reference FASTA | Reference FASTA | SARS-CoV-2 reference genome (typically MN908947.3) | n/a |
| Primer BED | Primer BED | Primer BED file (from ARTIC project or similar) | n/a |
| Read fraction to call variant | Read fraction to call variant | Specify the proportion of reads that need to agree with each other to call a variant. This is a floating point value between 0 and 1. | n/a |
| Minimum quality score to call base | Minimum quality score to call base | Minimum base quality score to count a base towards the sequence consensus. | n/a |
| input_bam | input_bam | runtime parameter for tool ivar trim | n/a |
| primer | primer | runtime parameter for tool ivar trim | n/a |
| intervals | intervals | runtime parameter for tool SnpEff eff: | n/a |
| transcripts | transcripts | runtime parameter for tool SnpEff eff: | n/a |
| inputs | inputs | runtime parameter for tool Concatenate datasets | n/a |
| input1 | input1 | runtime parameter for tool Pangolin | n/a |
| input_fasta | input_fasta | runtime parameter for tool Nextclade | n/a |
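The two numeric inputs above ("Read fraction to call variant" and "Minimum quality score to call base") can be pictured with a small Python sketch. This is a simplified illustration of threshold-based base calling at a single position, not iVar's actual implementation; the function name `call_base` and the default values are assumptions made for the example.

```python
from collections import Counter


def call_base(observations, min_fraction=0.7, min_quality=20):
    """Call one position from (base, phred_quality) observations.

    Hypothetical helper: bases below min_quality are ignored, and the
    most common remaining base is reported only if it reaches
    min_fraction of the remaining reads; otherwise "N" is returned.
    """
    # Keep only bases whose quality meets the minimum score.
    kept = [base for base, qual in observations if qual >= min_quality]
    if not kept:
        return "N"
    # Find the majority base and check it against the read fraction.
    base, count = Counter(kept).most_common(1)[0]
    return base if count / len(kept) >= min_fraction else "N"


if __name__ == "__main__":
    reads = [("A", 30)] * 8 + [("G", 30)] * 2 + [("G", 5)] * 10
    print(call_base(reads, min_fraction=0.7))
    print(call_base(reads, min_fraction=0.9))
```

In this sketch the ten low-quality `G` observations are discarded before the fraction is computed, so raising the quality threshold can change which base wins, and raising the read fraction makes the call more conservative.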
Steps
| ID | Name | Description |
|---|---|---|
| 0 | Paired read collection for samples | FASTQ format Illumina Reads (Amplicon Protocol) |
| 1 | Reference FASTA | SARS-CoV-2 reference genome (typically MN908947.3) |
| 2 | Primer BED | Primer BED file (from ARTIC project or similar) |
| 3 | Read fraction to call variant | Specify the proportion of reads that need to agree with each other to call a variant. This is a floating point value between 0 and 1. |
| 4 | Minimum quality score to call base | Minimum base quality score to count a base towards the sequence consensus. |
| 5 | fastp: Trimmed Illumina Reads | toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0 |
| 6 | Rename reference to NC_045512.2 | If the reference is named MN908947.3 (Genbank name of SARS-CoV-2 reference genome), rename it to NC_045512.2 (RefSeq name of SARS-CoV-2 reference genome) toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
| 7 | Map with BWA-MEM | toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.1 |
| 8 | Samtools stats | toolshed.g2.bx.psu.edu/repos/devteam/samtools_stats/samtools_stats/2.0.2+galaxy2 |
| 9 | Samtools view | toolshed.g2.bx.psu.edu/repos/iuc/samtools_view/samtools_view/1.9+galaxy3 |
| 10 | QualiMap BamQC | toolshed.g2.bx.psu.edu/repos/iuc/qualimap_bamqc/qualimap_bamqc/2.2.2d+galaxy3 |
| 11 | ivar trim | toolshed.g2.bx.psu.edu/repos/iuc/ivar_trim/ivar_trim/1.3.1+galaxy2 |
| 12 | Flatten Collection | __FLATTEN__ |
| 13 | ivar variants | toolshed.g2.bx.psu.edu/repos/iuc/ivar_variants/ivar_variants/1.3.1+galaxy2 |
| 14 | ivar consensus | toolshed.g2.bx.psu.edu/repos/iuc/ivar_consensus/ivar_consensus/1.3.1+galaxy0 |
| 15 | Quality Control Report | toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.9+galaxy1 |
| 16 | Annotated variants | toolshed.g2.bx.psu.edu/repos/iuc/snpeff_sars_cov_2/snpeff_sars_cov_2/4.5covid19 |
| 17 | Consensus genome (masked for depth) | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
| 18 | Concatenate datasets | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.1 |
| 19 | Pangolin | toolshed.g2.bx.psu.edu/repos/iuc/pangolin/pangolin/3.1.14+galaxy0 |
| 20 | Nextclade | toolshed.g2.bx.psu.edu/repos/iuc/nextclade/nextclade/1.4.1+galaxy0 |
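Step 6 renames the reference from its GenBank name to its RefSeq name using a sed-based text tool. The exact sed expression is not shown on this page, so the following Python sketch only illustrates the intended effect on a FASTA header; the function name `rename_reference` is hypothetical.

```python
def rename_reference(fasta_text, old="MN908947.3", new="NC_045512.2"):
    """Rename the sequence ID in FASTA headers from `old` to `new`.

    Hypothetical helper mirroring the effect described for step 6;
    it is not the command the workflow actually runs.
    """
    lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Split the header into the sequence ID and the rest.
            parts = line[1:].split(None, 1)
            if parts and parts[0] == old:
                parts[0] = new
                line = ">" + " ".join(parts)
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(rename_reference(">MN908947.3 example header\nACGT\n"), end="")
```

Headers with any other sequence ID, and all sequence lines, pass through unchanged, which matches the conditional wording of the step description ("If the reference is named MN908947.3 ...").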
Outputs
| ID | Name | Description | Type |
|---|---|---|---|
| output_paired_coll | output_paired_coll | n/a | input |
| report_html | report_html | n/a | html |
| report_json | report_json | n/a | json |
| output | output | n/a | input |
| bam_output | bam_output | n/a | bam |
| output | output | n/a | tabular |
| outputsam | outputsam | n/a | input |
| raw_data | raw_data | n/a | input |
| output_html | output_html | n/a | html |
| output_bam | output_bam | n/a | bam |
| output | output | n/a | input |
| output_variants_tabular | output_variants_tabular | n/a | tabular |
| output_variants_vcf | output_variants_vcf | n/a | vcf |
| consensus | consensus | n/a | fasta |
| stats | stats | n/a | input |
| html_report | html_report | n/a | html |
| snpeff_output | snpeff_output | n/a | vcf |
| statsFile | statsFile | n/a | html |
| output | output | n/a | input |
| out_file1 | out_file1 | n/a | input |
| output1 | output1 | n/a | tabular |
| report_tsv | report_tsv | n/a | tabular |
Created: 31st Aug 2021 at 03:01
Last updated: 5th Nov 2021 at 03:00
Last used: 9th Dec 2021 at 12:18
Version History
Version 3 (latest) Created 5th Nov 2021 at 03:00 by WorkflowHub Bot
Updated to v0.2.1
Version 2 Created 27th Oct 2021 at 15:45 by WorkflowHub Bot
Updated to v0.2
Version 1 (earliest) Created 31st Aug 2021 at 03:01 by WorkflowHub Bot
No revision comments