Generic variation analysis reporting
Version 1

Workflow Type: Galaxy
Stable


This workflow generates reports from a list of variants produced by a variant calling workflow.

The workflow accepts a single input:

  • A collection of VCF files

The workflow produces two outputs (format description below):

  1. A list of variants grouped by Sample
  2. A list of variants grouped by Variant

Here is an example of the output by sample. In this table, all variants in all samples are explicitly listed:

Sample POS FILTER REF ALT DP AF AFcaller SB DP4 IMPACT FUNCLASS EFFECT GENE CODON AA TRID min(AF) max(AF) countunique(change) countunique(FUNCLASS) change
ERR3485786 11644 PASS A G 97 0.979381 0.907216 0 1,1,49,46 LOW SILENT SYNONYMOUS_CODING D7L tgT/tgC C512 AKG51361.1 0.979381 1 1 1 A>G
ERR3485786 11904 PASS T C 102 0.990196 0.95098 0 0,0,51,50 MODERATE MISSENSE NON_SYNONYMOUS_CODING D7L Act/Gct T426A AKG51361.1 0.990196 1 1 1 T>C

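Note the two alternative allele frequency fields: "AFcaller" and "AF". LoFreq reports the AF values listed in "AFcaller". They are incorrect due to a known LoFreq bug. To correct for this, AF values are recomputed from the DP4 and DP fields as follows: AF = (DP4[2] + DP4[3]) / DP, where the indices are zero-based, so DP4[2] and DP4[3] are the forward and reverse read counts supporting the alternative allele.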
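
The recomputation can be illustrated with a minimal Python sketch. This is not the workflow's own implementation (the workflow uses Galaxy tools); the function name `recompute_af` is hypothetical, and the input values are taken from the two example rows above.

```python
def recompute_af(dp4: str, dp: int) -> float:
    """Recompute allele frequency from a DP4 string and the total depth DP.

    DP4 holds four comma-separated read counts:
    ref-forward, ref-reverse, alt-forward, alt-reverse.
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in dp4.split(","))
    if dp <= 0:
        raise ValueError("DP must be positive")
    # AF = (DP4[2] + DP4[3]) / DP
    return (alt_fwd + alt_rev) / dp


# (Sample, POS, DP4, DP) copied from the by-sample example table
rows = [
    ("ERR3485786", 11644, "1,1,49,46", 97),
    ("ERR3485786", 11904, "0,0,51,50", 102),
]
for sample, pos, dp4, dp in rows:
    print(sample, pos, round(recompute_af(dp4, dp), 6))
```

For the first row this gives (49 + 46) / 97, which rounds to the 0.979381 shown in the AF column rather than the 0.907216 reported in AFcaller.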

Here is an example of the output by variant. In this table, data are aggregated by variant across all samples in which the variant is present:

POS REF ALT IMPACT FUNCLASS EFFECT GENE CODON AA TRID countunique(Sample) min(AF) max(AF) SAMPLES(above-thresholds) SAMPLES(all) AFs(all) change
11644 A G LOW SILENT SYNONYMOUS_CODING D7L tgT/tgC C512 AKG51361.1 11 0.979381 1 ERR3485786,ERR3485787... ERR3485786,ERR3485787,ERR3485789 ... 0.979381,1.0... A>G
11904 T C MODERATE MISSENSE NON_SYNONYMOUS_CODING D7L Act/Gct T426A AKG51361.1 12 0.990196 1 ERR3485786,ERR3485787... ERR3485786,ERR3485787,ERR3485789... 0.990196,1.0,1.0... T>C
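
The aggregation behind the by-variant table can be sketched roughly as follows. This is an illustrative Python approximation, not the workflow's Datamash steps: the function name `aggregate_by_variant`, the dictionary layout, and the choice of (POS, REF, ALT) as the grouping key are assumptions, and only a subset of the report columns is reproduced.

```python
from collections import defaultdict


def aggregate_by_variant(records):
    """Group per-sample variant records by (POS, REF, ALT) and summarise them."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["POS"], rec["REF"], rec["ALT"])].append(rec)

    report = []
    for (pos, ref, alt), recs in sorted(groups.items()):
        recs = sorted(recs, key=lambda r: r["Sample"])  # keep samples and AFs aligned
        afs = [r["AF"] for r in recs]
        samples = sorted({r["Sample"] for r in recs})
        report.append({
            "POS": pos,
            "REF": ref,
            "ALT": alt,
            "countunique(Sample)": len(samples),
            "min(AF)": min(afs),
            "max(AF)": max(afs),
            "SAMPLES(all)": ",".join(samples),
            "AFs(all)": ",".join(str(a) for a in afs),
            "change": f"{ref}>{alt}",
        })
    return report


# Illustrative input: two samples carrying the variant at position 11644
records = [
    {"Sample": "ERR3485786", "POS": 11644, "REF": "A", "ALT": "G", "AF": 0.979381},
    {"Sample": "ERR3485787", "POS": 11644, "REF": "A", "ALT": "G", "AF": 1.0},
]
for row in aggregate_by_variant(records):
    print(row)
```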

The workflow can be accessed at usegalaxy.org

The general idea of the workflow is: (workflow diagram not reproduced here)

Inputs

ID Name Description Type
AF Filter AF Filter Allele Frequency Filter. This is the minimum allele frequency required for variants to be included in the reports.
  • float?
DP Filter DP Filter Depth Filter. This is the minimum depth of all alignments at a variant site.
  • int?
DP_ALT Filter DP_ALT Filter Depth Filter for variant allele. This is the minimum depth of alignments supporting a variant.
  • int?
Variation data to report Variation data to report Variation data in VCF format. Can be the output of any of the workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling
  • File[]
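
How the three thresholds might combine can be sketched in Python. This is a hedged illustration only: in the workflow itself the filtering is done by the SnpSift Filter steps, and the function name `passes_filters`, the inclusive comparison (`>=`), the treatment of an unset optional threshold as "no filter", and the threshold values in the usage lines are all assumptions made for this example.

```python
from typing import Optional


def passes_filters(
    af: float,
    dp: int,
    dp_alt: int,
    min_af: Optional[float] = None,
    min_dp: Optional[int] = None,
    min_dp_alt: Optional[int] = None,
) -> bool:
    """Return True if a variant meets every threshold that has been set.

    af     -- allele frequency of the variant
    dp     -- depth of all alignments at the variant site
    dp_alt -- depth of alignments supporting the variant allele
    A threshold left as None is skipped (assumption for this sketch).
    """
    checks = ((af, min_af), (dp, min_dp), (dp_alt, min_dp_alt))
    return all(limit is None or value >= limit for value, limit in checks)


# Hypothetical thresholds applied to the first example row (DP4 = 1,1,49,46; DP = 97)
print(passes_filters(0.979381, 97, 49 + 46, min_af=0.05, min_dp=1, min_dp_alt=10))
# A low-frequency variant fails once an AF threshold is set
print(passes_filters(0.02, 97, 2, min_af=0.05))
```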

Steps

ID Name Description
4 SnpSift Filter toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1
5 Compose text parameter value toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
6 Compose text parameter value toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
7 SnpSift Filter toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1
8 SnpSift Extract Fields toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0
9 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
10 Datamash toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
11 Replace toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3
12 Replace toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3
13 Replace toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3
14 Collapse Collection toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0
15 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
16 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
17 Replace toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3
18 Datamash toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
19 Filter Filter1
20 Datamash toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
21 Join toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2
22 Datamash toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
23 Datamash toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
24 Datamash toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
25 Join toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2
26 Join toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2
27 Cut Cut1
28 Join toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2
29 Cut Cut1
30 Replace toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3
31 Cut Cut1
32 Split file toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.0
33 Sort toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/1.1.1
34 Sort toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/1.1.1

Outputs

ID Name Description Type
prefiltered_variants prefiltered_variants n/a
  • File
filtered_variants filtered_variants n/a
  • File
filtered_extracted_variants filtered_extracted_variants n/a
  • File
af_recalculated af_recalculated n/a
  • File
collapsed_effects collapsed_effects n/a
  • File
highest_impact_effects highest_impact_effects n/a
  • File
cleaned_header cleaned_header n/a
  • File
processed_variants_collection processed_variants_collection n/a
  • File
all_variants_all_samples all_variants_all_samples n/a
  • File
variants_for_plotting variants_for_plotting n/a
  • File
by_variant_report by_variant_report n/a
  • File
combined_variant_report combined_variant_report n/a
  • File

Version History

Version 1 (earliest) Created 1st Jun 2022 at 16:36 by Anton Nekrutenko

Initial commit


Creators: Not specified
Additional credit: Wolfgang Maier


Last updated: 3rd Jun 2022 at 10:28
