Assembly with Hifi reads and Trio Data
Generates a phased assembly from PacBio HiFi reads, using parental Illumina data for phasing.
Inputs
- Hifi long reads [fastq]
- Concatenated Illumina reads : Paternal [fastq]
- Concatenated Illumina reads : Maternal [fastq]
- K-mer database [meryldb]
- Paternal hapmer database [meryldb]
- Maternal hapmer database [meryldb]
- Genome profile summary generated by Genomescope [txt]
- Bloom Filter
- Name of first haplotype
- Name of second haplotype ...
Contiging Solo w/HiC:
Generates a phased assembly from PacBio HiFi reads, using Hi-C data from the same individual for phasing.
Inputs
- Hifi long reads [fastq]
- HiC forward reads (if multiple input files, concatenated in same order as reverse reads) [fastq]
- HiC reverse reads (if multiple input files, concatenated in same order as forward reads) [fastq]
- K-mer database [meryldb]
- Genome profile summary generated by Genomescope [txt]
- Name of first assembly
- Name of second assembly ...
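The requirement that multiple Hi-C files be concatenated in the same order for forward and reverse reads can be sketched as follows. This is a minimal illustration, not part of the workflow itself: the function names `concatenate_in_order` and `concatenate_pairs` and all file paths are hypothetical. It assumes each sequencing run contributes one forward and one reverse file.

```python
import shutil


def concatenate_in_order(parts, destination):
    """Append each file in `parts` to `destination`, preserving list order."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def concatenate_pairs(pairs, forward_out, reverse_out):
    """Concatenate (forward, reverse) file pairs so both outputs share one order.

    `pairs` is a list of (forward_path, reverse_path) tuples. Keeping the two
    files of each run in one tuple means one side cannot be reordered alone.
    """
    concatenate_in_order([fwd for fwd, _ in pairs], forward_out)
    concatenate_in_order([rev for _, rev in pairs], reverse_out)
```

Because the files are copied byte for byte, the same approach should also work for gzip-compressed FASTQ files, since concatenated gzip members form a valid gzip stream.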
VGP Workflow #1
This workflow produces a Meryl database and GenomeScope outputs that are used to determine parameters for subsequent workflows and to assess the quality of genome assemblies. Specifically, it provides information about genomic complexity, such as genome size, heterozygosity, and repeat content, as well as about data quality.
Inputs
- A collection of Hifi long reads in FASTQ format
- k-mer length
- Ploidy
Outputs
- Meryl database of k-mer counts
...
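To make the idea of a k-mer count database concrete, here is a toy sketch of counting k-mers in reads and summarizing how many distinct k-mers occur at each multiplicity, which is the kind of histogram a genome profiling tool works from. It is an illustration only and does not reflect how Meryl or GenomeScope are implemented. The function names are hypothetical, and counting canonical k-mers (a k-mer and its reverse complement treated as one) is an assumption of this sketch.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across reads, skipping windows with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def multiplicity_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())
```

For example, `multiplicity_histogram(count_kmers(reads, 21))` gives, for each count value, how many distinct 21-mers were observed that many times in `reads`.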
Creates a Meryl database used to estimate assembly parameters and for quality control with Merqury. Part of the VGP pipeline.
Performs long-read assembly using PacBio data and Hifiasm. Part of the VGP assembly pipeline. This workflow generates a phased assembly.
Performs scaffolding using Hi-C data. Part of the VGP assembly pipeline. Scaffolding can be performed on long-read assembly contigs or on existing scaffolds (e.g., Bionano scaffolds).
Performs scaffolding using Bionano data. Part of the VGP assembly pipeline.
Purges duplications and overlaps from a phased assembly. Includes purge steps for the primary and alternate assemblies.
The Vertebrate Genomes Pipelines in Galaxy are intended to let a user generate high-quality, near error-free assemblies of species from the user's own data or from the GenomeArk database.