Workflow Type: Galaxy

Purge Duplicate Contigs

Purges contigs that purge_dups marks as duplicates within a single haplotype (either haplotypic duplication or overlap duplication). This is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).

Inputs

  1. Genomescope model parameters [txt] (Generated by the k-mer profiling workflow)
  2. HiFi long reads - trimmed [fastq] (Generated by Cutadapt in the contigging workflow)
  3. Assembly to purge (e.g. hap1) [fasta] (Generated by the contigging workflow)
  4. K-mer database [meryldb] (Generated by the k-mer profiling workflow)
  5. Estimated Genome Size [txt]
  6. Assembly to leave alone (used for merqury statistics) (e.g. hap2) [fasta] (Generated by the contigging workflow)
  7. Name of un-altered assembly
  8. Name of purged assembly

Outputs

  1. Haplotype 1 purged assembly (Fasta and gfa)
  2. Haplotype 2 purged assembly (Fasta and gfa)
  3. QC: BUSCO report for both assemblies
  4. QC: Merqury report for both assemblies
  5. QC: Assembly statistics for both assemblies
  6. QC: Nx plot for both assemblies
  7. QC: Size plot for both assemblies
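
As a reading aid for the Nx plot listed above, here is a minimal sketch of the standard Nx statistic (N50 when x = 50): the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the assembly. The function name `nx` and the contig lengths are made up for illustration; this is not the code the workflow runs (the workflow relies on gfastats and its own plotting steps).

```python
def nx(lengths, x=50):
    """Return Nx for a list of contig lengths.

    Nx is the length L such that contigs of length >= L together
    cover at least x percent of the total assembly size.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")

    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= total * x:
            return length
    return ordered[-1]


# Hypothetical contig lengths (total = 300), for illustration only.
contigs = [100, 80, 50, 30, 20, 10, 10]
print("N50:", nx(contigs, 50))
print("N90:", nx(contigs, 90))
```

An Nx plot simply draws this value for every x from 1 to 100, so a purged assembly whose curve stays close to the unpurged one has lost little contiguity.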

Inputs

ID | Name | Description | Type
Assembly to leave alone (need this for merqury) | Assembly to leave alone (need this for merqury) | n/a | File
Assembly to purge | Assembly to purge | n/a | File
Estimated genome size - Parameter File | Estimated genome size - Parameter File | n/a | File
Genomescope model parameters | Genomescope model parameters | n/a | File
Meryl Database | Meryl Database | n/a | File
Name of purged assembly | Name of purged assembly | n/a | string?
Name of un-altered assembly | Name of un-altered assembly | n/a | string?
Pacbio Reads Collection - Trimmed | Pacbio Reads Collection - Trimmed | n/a | File[]
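
The Type values above (`File`, `string?`, `File[]`) look like CWL-style shorthand, where a trailing `?` marks an optional value and a trailing `[]` marks a list (here, presumably a Galaxy collection of read files). Assuming that reading is correct, the sketch below splits such a type string into its parts. `InputType` and `parse_type` are hypothetical helper names, not part of Galaxy or WorkflowHub.

```python
from typing import NamedTuple


class InputType(NamedTuple):
    base: str        # e.g. "File" or "string"
    is_array: bool   # True when the type ends in "[]"
    optional: bool   # True when the type ends in "?"


def parse_type(text: str) -> InputType:
    """Split a shorthand type such as 'File[]' or 'string?' into parts."""
    text = text.strip()
    optional = text.endswith("?")
    if optional:
        text = text[:-1]
    is_array = text.endswith("[]")
    if is_array:
        text = text[:-2]
    if not text:
        raise ValueError("empty type name")
    return InputType(base=text, is_array=is_array, optional=optional)


for raw in ("File", "string?", "File[]"):
    print(raw, "->", parse_type(raw))
```

Under this interpretation, the two assembly names are the only optional inputs, and the trimmed PacBio reads are the only input supplied as a collection.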

Steps

ID Name Description
8 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0
9 Map with minimap2 toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0
10 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
11 Estimated genome size param_value_from_file
12 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
13 Cut Cut1
14 Cut Cut1
15 Map with minimap2 toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0
16 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
17 gfastats_data_prep n/a
18 Parse parameter value param_value_from_file
19 Parse parameter value param_value_from_file
20 Text reformatting toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1
21 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
22 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
23 Remove REPEATs from BED toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/9.3+galaxy1
24 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
25 Merqury toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy4
26 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
27 Busco toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.5.0+galaxy0
28 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
29 Convert purged fasta to gfa toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
30 gfastats_data_prep n/a
31 Text reformatting toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1
32 gfastats_plot n/a
33 Join toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/9.3+galaxy1
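
Most steps above reference a Tool Shed tool by an identifier of the form `host/repos/owner/repository/tool/version`, while a few (`Cut1`, `param_value_from_file`) use short built-in IDs. The following sketch, which assumes only that six-part layout, splits an identifier so that tool versions can be compared across steps. `ToolRef` and `parse_tool_id` are illustrative names, not a Galaxy API.

```python
from typing import NamedTuple, Optional


class ToolRef(NamedTuple):
    host: Optional[str]
    owner: Optional[str]
    repo: Optional[str]
    tool: str
    version: Optional[str]


def parse_tool_id(tool_id: str) -> ToolRef:
    """Split a Tool Shed identifier; treat anything else as a built-in tool."""
    tool_id = tool_id.strip()
    parts = tool_id.split("/")
    if len(parts) == 6 and parts[1] == "repos":
        host, _, owner, repo, tool, version = parts
        return ToolRef(host, owner, repo, tool, version)
    # Built-in tools such as "Cut1" carry no owner or version here.
    return ToolRef(None, None, None, tool_id, None)


# A few identifiers copied from the Steps table above.
step_tools = [
    "toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0",
    "toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0",
    "toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0",
    "Cut1",
]

for ref in map(parse_tool_id, step_tools):
    print(ref.tool, ref.version or "(built-in)")
```

Collecting the distinct (tool, version) pairs this way is a quick check that repeated steps, such as the four purge_dups calls, all pin the same tool version.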

Outputs

ID | Name | Description | Type
Read Coverage and cutoffs calculation: Histogram plot | Read Coverage and cutoffs calculation: Histogram plot | n/a | File
Cutoffs | Cutoffs | n/a | File
Purged assembly | Purged assembly | n/a | File
Removed haplotigs | Removed haplotigs | n/a | File
Merqury on Phased assemblies: stats | Merqury on Phased assemblies: stats | n/a | File
Merqury on Phased assemblies: Images | Merqury on Phased assemblies: Images | n/a | File
Busco on Purged Primary assembly: short summary | Busco on Purged Primary assembly: short summary | n/a | File
Busco on Purged Primary assembly: summary image | Busco on Purged Primary assembly: summary image | n/a | File
Purged assembly statistics | Purged assembly statistics | n/a | File
Purged assembly (GFA) | Purged assembly (GFA) | n/a | File
Nx Plot | Nx Plot | n/a | File
Size Plot | Size Plot | n/a | File
Assembly statistics | Assembly statistics | n/a | File

Version History

v0.5 (latest) Created 23rd Apr 2024 at 03:01 by WorkflowHub Bot
Updated to v0.5
Frozen v0.5 47753b0

v0.4 Created 27th Mar 2024 at 03:02 by WorkflowHub Bot
Updated to v0.4
Frozen v0.4 32c3b9b

v0.3 Created 7th Mar 2024 at 03:02 by WorkflowHub Bot
Updated to v0.3
Frozen v0.3 31f46a9

v0.1 (earliest) Created 15th Feb 2024 at 03:01 by WorkflowHub Bot
Updated to v0.1
Frozen v0.1 49773bd

Creators and Submitter

Creators: Not specified
Additional credit: Galaxy, VGP


Created: 15th Feb 2024 at 03:01

Last updated: 23rd Apr 2024 at 03:01
