Workflow Type: Galaxy
Frozen
This workflow scaffolds a genome assembly using Hi-C data with YAHS. It can be used on any assembly that has Hi-C data, provided the assembly is in GFA format. You can generate a GFA from a FASTA file using the gfastats tool. Part of the VGP set of workflows, it is meant to be run after contigging (Workflows 3, 4, or 5), the optional purging step (Workflow 6 or 6b), and an optional scaffolding with Bionano data (Workflow 7). The workflow includes QC with assembly statistics, Busco, and Hi-C maps.
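The Input GFA description in the Inputs table asks for 'P' lines and embedded sequences. The following is a minimal sketch of how such a pre-check could look, assuming tab-separated GFA records where a `*` in the sequence column of an `S` line means the sequence is absent. The function name `check_gfa` and the sample records are hypothetical, and this is not part of the workflow itself.

```python
def check_gfa(text: str) -> dict:
    """Count segment and path records in GFA text (illustrative pre-check only)."""
    segments = 0
    segments_with_sequence = 0
    paths = 0
    for line in text.splitlines():
        fields = line.split("\t")
        record_type = fields[0]
        if record_type == "S" and len(fields) >= 3:
            segments += 1
            # "*" in the sequence column means the sequence is not stored.
            if fields[2] != "*":
                segments_with_sequence += 1
        elif record_type == "P":
            paths += 1
    return {
        "segments": segments,
        "segments_with_sequence": segments_with_sequence,
        "paths": paths,
        # Usable here means: at least one path, and every segment carries a sequence.
        "looks_usable": paths > 0 and segments > 0
        and segments == segments_with_sequence,
    }


# Hypothetical two-contig example with one path line.
example_gfa = (
    "H\tVN:Z:1.2\n"
    "S\tcontig1\tACGT\n"
    "S\tcontig2\tGGCC\n"
    "P\tpath1\tcontig1+\t*\n"
)
report = check_gfa(example_gfa)
```

If `looks_usable` is false, the GFA-GFA conversion with gfastats mentioned in the Inputs table is the route the workflow description suggests.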
Inputs
| ID | Name | Description | Type |
|---|---|---|---|
| Assembly Name | Assembly Name | For Workflow report. | |
| Database for Busco Lineage | Database for Busco Lineage | Select the database to use for Busco lineage. | |
| Estimated genome size - Parameter File | Estimated genome size - Parameter File | Estimated genome size from the contigging workflow. | |
| Haplotype | Haplotype | Select the haplotype being scaffolded. | |
| Hi-C reads | Hi-C reads | n/a | |
| Input GFA | Input GFA | The input GFA must conform to the GFA 1.2 standard, i.e. it should have 'P' lines defined and contain sequences. Output GFAs from assemblers can be run through a GFA-GFA conversion using gfastats to ensure this. | |
| Lineage | Lineage | Taxonomic lineage of the organism being assembled, for Busco analysis. | |
| Restriction enzymes | Restriction enzymes | Restriction enzymes used in preparation of Hi-C libraries. | |
| Species Name | Species Name | For Workflow report. | |
| Trim Hi-C Data? | Trim Hi-C Data? | Trim 5 bases at the beginning of each read. Use with Arima Hi-C data if the Hi-C map looks "noisy". Select No if you are using trimmed data generated by a previous workflow. | |
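To make the "Trim Hi-C Data?" option concrete, here is a small sketch of what removing 5 bases from the start of each read means for a FASTQ record. It assumes plain four-line FASTQ records; the function name `trim_fastq_5prime` is hypothetical. The tool the workflow actually uses for trimming is not stated here, so this only illustrates the effect.

```python
def trim_fastq_5prime(fastq_text: str, n_bases: int = 5) -> str:
    """Drop the first n_bases bases (and their quality scores) from every read.

    Assumes each record is exactly four lines: header, sequence, '+', qualities.
    """
    lines = fastq_text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    trimmed = []
    for i in range(0, len(lines), 4):
        header, sequence, separator, qualities = lines[i:i + 4]
        # Sequence and quality strings must stay the same length after trimming.
        trimmed.extend([header, sequence[n_bases:], separator, qualities[n_bases:]])
    return "\n".join(trimmed) + "\n"


# Hypothetical 10-base read.
record = "@read1\nACGTACGTAC\n+\nIIIIIHHHHH\n"
trimmed_record = trim_fastq_5prime(record)
```

For paired Hi-C data the same trimming would be applied to both mates, which is why already-trimmed reads from a previous workflow should not be trimmed again.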
Steps
| ID | Name | Description |
|---|---|---|
| 10 | Species Name: | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
| 11 | Assembly Name: | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
| 12 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.11+galaxy0 |
| 13 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
| 14 | Haplotype: | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
| 15 | Lineage used for Busco | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
| 16 | Parse parameter value | param_value_from_file |
| 17 | Trim and Align Hi-C paired collection | n/a |
| 18 | YAHS | toolshed.g2.bx.psu.edu/repos/iuc/yahs/yahs/1.2a.2+galaxy2 |
| 19 | PretextMap | toolshed.g2.bx.psu.edu/repos/iuc/pretext_map/pretext_map/0.1.9+galaxy1 |
| 20 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/9.5+galaxy0 |
| 21 | Pretext Snapshot | toolshed.g2.bx.psu.edu/repos/iuc/pretext_snapshot/pretext_snapshot/0.0.4+galaxy0 |
| 22 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.11+galaxy0 |
| 23 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.11+galaxy0 |
| 24 | Extract dataset | __EXTRACT_DATASET__ |
| 25 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.11+galaxy0 |
| 26 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.11+galaxy0 |
| 27 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.11+galaxy0 |
| 28 | BWA-MEM2 | toolshed.g2.bx.psu.edu/repos/iuc/bwa_mem2/bwa_mem2/2.2.1+galaxy4 |
| 29 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.8.0+galaxy1 |
| 30 | gfastats_data_prep | n/a |
| 31 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/9.5+galaxy0 |
| 32 | Samtools merge | toolshed.g2.bx.psu.edu/repos/iuc/samtools_merge/samtools_merge/1.20+galaxy2 |
| 33 | Cut | Cut1 |
| 34 | Cut | Cut1 |
| 35 | bedtools BAM to BED | toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_bamtobed/2.31.1+galaxy0 |
| 36 | PretextMap | toolshed.g2.bx.psu.edu/repos/iuc/pretext_map/pretext_map/0.1.9+galaxy1 |
| 37 | Nx Plot | toolshed.g2.bx.psu.edu/repos/iuc/ggplot2_point/ggplot2_point/3.4.0+galaxy1 |
| 38 | Size Plot | toolshed.g2.bx.psu.edu/repos/iuc/ggplot2_point/ggplot2_point/3.4.0+galaxy1 |
| 39 | Sort | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/9.5+galaxy0 |
| 40 | Pretext Snapshot | toolshed.g2.bx.psu.edu/repos/iuc/pretext_snapshot/pretext_snapshot/0.0.4+galaxy0 |
| 41 | Extract dataset | __EXTRACT_DATASET__ |
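The Nx Plot step summarizes assembly contiguity. As a rough illustration of the statistic behind such a plot, the sketch below computes Nx from a list of scaffold lengths: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly. The function name `nx` and the lengths are hypothetical; in the workflow, the underlying sizes come from gfastats.

```python
def nx(lengths, x):
    """Return the Nx value for a collection of sequence lengths (0 < x <= 100)."""
    if not lengths:
        raise ValueError("no lengths given")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running_total = 0
    for length in ordered:
        running_total += length
        # The first length at which the cumulative sum reaches the threshold is Nx.
        if running_total >= threshold:
            return length


# Hypothetical scaffold sizes (total 300).
scaffold_lengths = [100, 80, 60, 40, 20]
n50 = nx(scaffold_lengths, 50)
n90 = nx(scaffold_lengths, 90)
```

Evaluating `nx` over a range of x values gives the points of an Nx curve; comparing curves before and after scaffolding shows how much contiguity improved.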
Outputs
| ID | Name | Description | Type |
|---|---|---|---|
| Species Name for report | Species Name for report | n/a | |
| Assembly for report | Assembly for report | n/a | |
| Haplotype for report | Haplotype for report | n/a | |
| Lineage for report | Lineage for report | n/a | |
| s1 Merged Hi-C Alignments | s1 Merged Hi-C Alignments | n/a | |
| s1 Trimmed Hi-C data | s1 Trimmed Hi-C data | n/a | |
| s1 Hi-C Alignments | s1 Hi-C Alignments | n/a | |
| YAHS on input dataset(s): Final scaffolds agp output | YAHS on input dataset(s): Final scaffolds agp output | n/a | |
| Suffixed AGP | Suffixed AGP | n/a | |
| Reconciliated Scaffolds: gfa no sequence | Reconciliated Scaffolds: gfa no sequence | n/a | |
| Reconciliated Scaffolds: gfa | Reconciliated Scaffolds: gfa | n/a | |
| Pretext Map Before HiC scaffolding | Pretext Map Before HiC scaffolding | n/a | |
| Reconciliated Scaffolds: fasta | Reconciliated Scaffolds: fasta | n/a | |
| Scaffold sizes for s2 | Scaffold sizes for s2 | n/a | |
| Assembly Statistics for s2 | Assembly Statistics for s2 | n/a | |
| Hi-C Alignments | Hi-C Alignments | n/a | |
| Busco Summary | Busco Summary | n/a | |
| Busco Summary image | Busco Summary image | n/a | |
| clean_stats | clean_stats | n/a | |
| s2 Merged Hi-C Alignments | s2 Merged Hi-C Alignments | n/a | |
| Nx Plot | Nx Plot | n/a | |
| Size Plot | Size Plot | n/a | |
| Pretext Map After HiC scaffolding | Pretext Map After HiC scaffolding | n/a | |
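Among the outputs, the AGP files describe how contigs were placed into scaffolds. The sketch below shows one way to summarize such a file, assuming the standard tab-separated AGP layout in which column 1 is the scaffold name, column 3 is the end coordinate on the scaffold, and column 5 is the component type (`N` or `U` for gaps). The function name `summarise_agp` and the example records are hypothetical, not taken from a real run.

```python
from collections import defaultdict


def summarise_agp(agp_text: str) -> dict:
    """Count placed components and gaps per scaffold and record scaffold length."""
    summary = defaultdict(lambda: {"components": 0, "gaps": 0, "length": 0})
    for line in agp_text.splitlines():
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        scaffold, object_end, component_type = fields[0], int(fields[2]), fields[4]
        entry = summary[scaffold]
        if component_type in ("N", "U"):
            entry["gaps"] += 1
        else:
            entry["components"] += 1
        # The largest end coordinate seen is the scaffold length.
        entry["length"] = max(entry["length"], object_end)
    return dict(summary)


# Hypothetical AGP content: two contigs joined by a gap, plus a singleton.
example_agp = (
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+\n"
    "scaffold_1\t1001\t1200\t2\tN\t200\tscaffold\tyes\tproximity_ligation\n"
    "scaffold_1\t1201\t1700\t3\tW\tcontig_2\t1\t500\t-\n"
    "scaffold_2\t1\t300\t1\tW\tcontig_3\t1\t300\t+\n"
)
agp_summary = summarise_agp(example_agp)
```

A summary like this gives a quick view of how many contigs were joined into each scaffold, which complements the Hi-C maps when judging the scaffolding result.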
Version History
v0.1 (earliest) Created 27th Oct 2023 at 03:01 by WorkflowHub Bot
Updated to v0.1
Frozen: v0.1 (95a269d)
Creators and Submitter
Creators: Not specified
Additional credit: VGP, Galaxy
Created: 27th Oct 2023 at 03:01
Last updated: 12th Mar 2026 at 03:01
Annotated Properties
Scientific disciplines
Computer Science