VGP-meryldb-creation-trio/main
Version 1

Workflow Type: Galaxy

VGP Workflow #1

This workflow collects metrics on the properties of the genome under consideration by analyzing k-mer frequencies. It provides information about genomic complexity, such as genome size, level of heterozygosity, and repeat content, as well as about data quality. It uses reads from the two parental genomes to partition long reads from the offspring into haplotype-specific k-mer databases.
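To make the idea of a k-mer frequency analysis concrete, here is a minimal Python sketch of what a k-mer counter computes conceptually. It is not Meryl's implementation; the function names (`canonical`, `count_kmers`, `spectrum`) are hypothetical and chosen only for illustration, and real tools use compact binary encodings rather than Python strings.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())
```

The `spectrum` result corresponds to the kind of histogram (multiplicity versus number of distinct k-mers) that a profiling tool such as GenomeScope takes as input.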

Inputs

  • Collection of PacBio HiFi long reads in FASTQ format
  • Paternal short-read Illumina sequencing reads in FASTQ format
  • Maternal short-read Illumina sequencing reads in FASTQ format

Outputs

  • Meryl databases of k-mer counts
    • Child
    • Paternal haplotype
    • Maternal haplotype
  • GenomeScope metrics of child and parental genomes
    • Linear plot
    • Log plot
    • Transformed linear plot
    • Transformed log plot
    • Summary
    • Model
    • Model parameters
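As a rough illustration of how a k-mer histogram relates to the genome-size metric listed above, the following Python sketch parses a two-column histogram (multiplicity, number of distinct k-mers) and derives a naive size estimate. This is not GenomeScope's method: GenomeScope fits a mixture model, whereas this sketch simply divides the total k-mer count by the position of the tallest peak. The two-column layout, the function names, and the `min_multiplicity` cutoff for discarding likely sequencing errors are assumptions made for illustration.

```python
def parse_histogram(text):
    """Parse lines of 'multiplicity count' into a dict; malformed lines are skipped."""
    hist = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            hist[int(fields[0])] = int(fields[1])
    return hist


def naive_genome_size(hist, min_multiplicity=5):
    """Return (peak multiplicity, total k-mers // peak) above an error cutoff.

    Assumption: k-mers below min_multiplicity are mostly sequencing errors.
    In a diploid genome the tallest peak may be the heterozygous one, in which
    case this estimate can be off by about a factor of two.
    """
    usable = {m: n for m, n in hist.items() if m >= min_multiplicity}
    if not usable:
        raise ValueError("no histogram bins at or above the cutoff")
    peak = max(usable, key=usable.get)
    total_kmers = sum(m * n for m, n in usable.items())
    return peak, total_kmers // peak
```

For real data, the summary and model outputs produced by GenomeScope should be preferred over this back-of-the-envelope figure.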

Inputs

ID | Name | Description | Type
K-mer length | K-mer length | K-mer length used to calculate k-mer spectra. For a human genome, the best k-mer size is k=21 for both haploid (3.1G) and diploid (6.2G). | n/a
Pacbio Hifi reads | Pacbio Hifi reads | n/a | n/a
Collection of Paired Reads - Paternal | Collection of Paired Reads - Paternal | Collection of paired Illumina data in FASTQ format for Parent 1. | n/a
Collection of Paired Reads - Maternal | Collection of Paired Reads - Maternal | Collection of paired Illumina data in FASTQ format for Parent 2. | n/a
Ploidy | Ploidy | Ploidy for model to use. Default=2 | n/a
operation_type | operation_type | Runtime parameter for tool Meryl | n/a
operation_type | operation_type | Runtime parameter for tool Meryl | n/a
operation_type | operation_type | Runtime parameter for tool Meryl | n/a
input | input | Runtime parameter for tool GenomeScope | n/a
input | input | Runtime parameter for tool GenomeScope | n/a
input | input | Runtime parameter for tool GenomeScope | n/a
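The k=21 recommendation in the K-mer length description can be reproduced with a short calculation. The sketch below assumes the commonly cited rule of choosing k so that the chance of a random k-mer collision in a genome of size G stays below a tolerable rate p, that is k = log4(G(1 - p) / p). Both the formula's applicability to this workflow and the default p = 0.001 are assumptions here, not something the workflow page states; the function name `best_k` is hypothetical.

```python
import math


def best_k(genome_size, collision_rate=0.001):
    """Return the (fractional) k-mer length for a genome size and collision rate.

    Assumed formula: k = log4(G * (1 - p) / p).
    """
    n = genome_size * (1 - collision_rate) / collision_rate
    return math.log(n) / math.log(4)


if __name__ == "__main__":
    for label, size in (("haploid", 3.1e9), ("diploid", 6.2e9)):
        print(label, round(best_k(size), 2), "-> k =", round(best_k(size)))
```

Under these assumptions both the 3.1G and 6.2G genome sizes give a value that rounds to 21 (roughly 20.7 and 21.2 respectively), which is consistent with the description above.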

Steps

ID | Name | Description
0 | K-mer length | K-mer length used to calculate k-mer spectra. For a human genome, the best k-mer size is k=21 for both haploid (3.1G) and diploid (6.2G).
1 | Pacbio Hifi reads |
2 | Collection of Paired Reads - Paternal | Collection of paired Illumina data in FASTQ format for Parent 1.
3 | Collection of Paired Reads - Maternal | Collection of paired Illumina data in FASTQ format for Parent 2.
4 | Ploidy | Ploidy for model to use. Default=2
5 | FASTQ interlacer | toolshed.g2.bx.psu.edu/repos/devteam/fastq_paired_end_interlacer/fastq_paired_end_interlacer/1.2.0.1+galaxy0
6 | FASTQ interlacer | toolshed.g2.bx.psu.edu/repos/devteam/fastq_paired_end_interlacer/fastq_paired_end_interlacer/1.2.0.1+galaxy0
7 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
8 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
9 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
10 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
11 | GenomeScope | toolshed.g2.bx.psu.edu/repos/iuc/genomescope/genomescope/2.0+galaxy1
12 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
13 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
14 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
15 | Genomescope on paternal haplotype | toolshed.g2.bx.psu.edu/repos/iuc/genomescope/genomescope/2.0+galaxy1
16 | Genomescope on maternal haplotype | toolshed.g2.bx.psu.edu/repos/iuc/genomescope/genomescope/2.0+galaxy1
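The step list names several Meryl steps but does not say which operation each one performs. As a conceptual aid only, the Python sketch below shows the general set logic behind haplotype-specific k-mer databases in a trio: keep the k-mers found in one parent but not the other, then keep those that also occur in the child. The exact operations, thresholds, and ordering used by this workflow are not given on this page, so the `min_count` filter and the function names are assumptions.

```python
def parent_specific(own, other, min_count=2):
    """K-mers seen at least min_count times in one parent and absent from the other.

    own and other map k-mer -> count. min_count is an assumed error filter.
    """
    return {kmer for kmer, count in own.items()
            if count >= min_count and kmer not in other}


def inherited(specific, child):
    """Parent-specific k-mers that also occur in the child's k-mer counts."""
    return {kmer for kmer in specific if kmer in child}


if __name__ == "__main__":
    # Toy counts, not real data.
    paternal = {"AAA": 5, "CCC": 3, "GGG": 1}
    maternal = {"CCC": 4, "TTT": 6}
    child = {"AAA": 2, "CCC": 5, "TTT": 3}
    print(inherited(parent_specific(paternal, maternal), child))
    print(inherited(parent_specific(maternal, paternal), child))
```

In this toy example the shared k-mer `CCC` is excluded from both haplotype sets, and the low-count k-mer `GGG` is dropped by the assumed error filter.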

Outputs

ID Name Description Type
outfile_pairs_from_coll outfile_pairs_from_coll n/a
outfile_singles_from_coll outfile_singles_from_coll n/a
outfile_pairs_from_coll outfile_pairs_from_coll n/a
outfile_singles_from_coll outfile_singles_from_coll n/a
read_db read_db n/a
read_db read_db n/a
read_db_hist read_db_hist n/a
pat_db pat_db n/a
pat_db_hist pat_db_hist n/a
mat_db mat_db n/a
mat_db_hist mat_db_hist n/a
read_db read_db n/a
read_db read_db n/a
linear_plot linear_plot n/a
log_plot log_plot n/a
transformed_linear_plot transformed_linear_plot n/a
transformed_log_plot transformed_log_plot n/a
model model n/a
summary summary n/a
read_db read_db n/a
read_db_hist read_db_hist n/a
read_db_hist read_db_hist n/a
linear_plot linear_plot n/a
log_plot log_plot n/a
transformed_linear_plot transformed_linear_plot n/a
transformed_log_plot transformed_log_plot n/a
model model n/a
summary summary n/a
linear_plot linear_plot n/a
log_plot log_plot n/a
transformed_linear_plot transformed_linear_plot n/a
transformed_log_plot transformed_log_plot n/a
model model n/a
summary summary n/a

Version History

v0.1 (earliest) Created 14th Jun 2022 at 03:01 by WorkflowHub Bot

Updated to v0.1


Frozen v0.1 620df6d
