Workflows
Genome Assembly with Hifi reads and Trio Data
Generate a phased assembly from PacBio Hifi reads, using parental Illumina data for phasing. Part of the VGP workflow suite; it must be run after the Trio k-mer Profiling workflow (VGP2).
Inputs
- Hifi long reads [fastq]
- Concatenated Illumina reads : Paternal [fastq]
- Concatenated Illumina reads : Maternal [fastq]
- K-mer database [meryldb] generated by VGP2 workflow.
- Paternal hapmer database [meryldb] generated by VGP2 workflow.
...
This workflow performs the scaffolding of a genome assembly using HiC data with YAHS. Part of the VGP set of workflows.
Purge contigs marked as duplicates by purge_dups (could be haplotypic duplication or overlap duplication). This workflow is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).
Purge duplicates from one haplotype. Prerequisites: run after a k-mer profiling workflow (VGP 1 or 2) and a contigging workflow (VGP 3, 4, or 5).
Contigging Solo:
Generate assembly based on PacBio Hifi Reads.
Inputs
- Hifi long reads [fastq]
- K-mer database [meryldb]
- Genome profile summary generated by Genomescope [txt]
- Homozygous Read Coverage. Optional, use if you think the estimation from Genomescope is inaccurate.
- Genomescope Model Parameters generated by Genomescope [tabular]
- Database for busco lineage (recommended: latest)
- Busco lineage (recommended: vertebrata)
- Name of first assembly
- Name of second ...
Generate Nx and Size plots for multiple assemblies
Inputs
Collection of fasta files. The name of each item in the collection will be used as the label for the Nx and Size plots.
Outputs
- Nx plot
- Size plot
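For readers unfamiliar with the Nx statistic behind the Nx plot, the following is a minimal sketch of how the values could be computed from one assembly's contig lengths. It is not the workflow's own implementation; the function names `fasta_lengths` and `nx_values` are hypothetical and exist only for this illustration. Nx is taken here in its usual sense: the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the assembly.

```python
def fasta_lengths(lines):
    """Return the length of each sequence in FASTA-formatted lines, in file order."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx_values(contig_lengths, thresholds=range(0, 101, 10)):
    """Map each threshold x (in percent) to the Nx value of the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    result = {}
    for x in thresholds:
        running = 0
        for length in lengths:
            running += length
            # Integer comparison avoids floating-point rounding at the boundary.
            if running * 100 >= total * x:
                result[x] = length
                break
    return result


if __name__ == "__main__":
    records = [">contig_a", "ACGTAC", "GT", ">contig_b", "ACG"]
    print(nx_values(fasta_lengths(records), thresholds=[50, 90]))
```

Plotting one such curve per collection item, labelled with the item name, would give a figure of the kind described above.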
Microbiome - Variant calling and Consensus Building
Build a consensus sequence from FILTER PASS variants with intrasample allele-frequency above a configurable consensus threshold. Hard-mask regions with low coverage (but not consensus variants within them) and ambiguous sites.
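The masking rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the workflow's actual tool chain: it handles single-nucleotide variants only, the function `build_consensus` and the record fields (`pos`, `alt`, `af`, `filter`) are hypothetical, the default thresholds are placeholder values, and "ambiguous site" is assumed to mean a PASS variant whose allele frequency falls between a hypothetical lower bound `min_af` and the consensus threshold.

```python
def build_consensus(reference, variants, depth,
                    consensus_af=0.7, min_af=0.25, min_depth=20):
    """Apply consensus SNVs to a reference and hard-mask unreliable positions.

    reference: reference sequence as a string
    variants:  dicts with 0-based 'pos', 'alt' base, 'af', and 'filter' status
    depth:     per-position read depth, same length as the reference
    All threshold defaults are illustrative assumptions.
    """
    seq = list(reference)
    consensus_positions = set()
    ambiguous_positions = set()

    for variant in variants:
        if variant["filter"] != "PASS":
            continue  # only FILTER PASS variants are considered
        if variant["af"] > consensus_af:
            seq[variant["pos"]] = variant["alt"]
            consensus_positions.add(variant["pos"])
        elif variant["af"] >= min_af:
            # Assumed definition of an ambiguous site (see lead-in).
            ambiguous_positions.add(variant["pos"])

    for position, coverage in enumerate(depth):
        low_coverage = coverage < min_depth
        # Consensus variants inside low-coverage regions are kept unmasked.
        if low_coverage and position not in consensus_positions:
            seq[position] = "N"
    for position in ambiguous_positions - consensus_positions:
        seq[position] = "N"

    return "".join(seq)


if __name__ == "__main__":
    calls = [
        {"pos": 1, "alt": "T", "af": 0.9, "filter": "PASS"},
        {"pos": 4, "alt": "G", "af": 0.5, "filter": "PASS"},
    ]
    print(build_consensus("ACGTACGT", calls, [5, 5, 5, 30, 30, 30, 30, 30]))
```

In the example, the variant at position 1 survives even though its surrounding region is masked for low coverage, while the intermediate-frequency call at position 4 is masked.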