Quality control workflow for assessing a genome after assembly, using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates as of November 2023:
- Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. (To change format: click on the pencil icon next to the file in the Galaxy history, then "Datatypes", then set "New type" as fastqsanger.gz).
- New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large.
- Reports assembly stats into a table called metrics.tsv, including selected metrics from Fasta Stats, and read coverage; reports BUSCO versions and dependencies; and displays these tables in the workflow report.
- Note: a known bug sometimes resets the workflow report text to the default text. To restore it:
  - Open the workflow in Galaxy for editing.
  - Click on the "Edit Report" icon.
  - Copy and paste the following text into the workflow report, then exit and save.
# Workflow Execution Report
Workflow name: Genome assessment post assembly
## Genome assembly metrics
Selected statistics from the workflow outputs. Additional metrics are available in other outputs in the history.
```galaxy
history_dataset_display(output="Genome assembly metrics")
```
## Software
Busco version and dependencies:
```galaxy
history_dataset_display(output="Busco and dependencies version")
```
## Galaxy Australia
Thanks for using Galaxy! When you use Galaxy Australia to support your publication or project, please acknowledge its use with the following statement: "This work is supported by Galaxy Australia, a service provided by the Australian Biocommons and its partners. The service receives NCRIS funding through Bioplatforms Australia and the Australian Research Data Commons (https://doi.org/10.47486/PL105), as well as The University of Melbourne and Queensland Government RICF funding."
Inputs
ID | Name | Description | Type |
---|---|---|---|
FASTA contigs - Primary Assembly | #main/FASTA contigs - Primary Assembly | n/a | |
Raw reads | #main/Raw reads | n/a | |
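The raw reads input is expected as fastqsanger.gz, that is, gzip-compressed FASTQ with Sanger (Phred+33) quality encoding. A quick local check before upload can be sketched as below. This is a minimal sketch, not part of the workflow: the function name `looks_like_fastqsanger_gz` and the `max_records` parameter are hypothetical, four-line FASTQ records are assumed, and the quality check only confirms that characters fall in the Sanger ASCII range, so it cannot prove the encoding is not an older Phred+64 variant.

```python
import gzip


def looks_like_fastqsanger_gz(path, max_records=1000):
    """Return True if the first records parse as gzip-compressed FASTQ
    with quality characters inside the Sanger (Phred+33) ASCII range.

    Hypothetical helper for illustration; assumes four-line records.
    """
    records = 0
    try:
        with gzip.open(path, "rt", encoding="ascii") as handle:
            for _ in range(max_records):
                header = handle.readline().rstrip("\n")
                if not header:
                    break  # end of file
                seq = handle.readline().rstrip("\n")
                plus = handle.readline().rstrip("\n")
                qual = handle.readline().rstrip("\n")
                # Record structure: '@' header, sequence, '+' separator, qualities.
                if not header.startswith("@") or not plus.startswith("+"):
                    return False
                if len(seq) != len(qual):
                    return False
                # Sanger quality characters span ASCII 33 ('!') to 126 ('~').
                if any(not 33 <= ord(char) <= 126 for char in qual):
                    return False
                records += 1
    except (OSError, EOFError, UnicodeDecodeError):
        # Not gzip, truncated, or not plain ASCII text.
        return False
    return records > 0
```

A file that passes this check may still need its datatype set to fastqsanger.gz in the Galaxy history, as described in the input notes above.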
Steps
ID | Name | Description |
---|---|---|
2 | FASTQ to FASTA | toolshed.g2.bx.psu.edu/repos/devteam/fastqtofasta/fastq_to_fasta_python/1.1.5 |
3 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6 |
4 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0 |
5 | Quast | toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy1 |
6 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.4.6+galaxy0 |
7 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0 |
8 | Merqury | toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3 |
9 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
10 | Relabel some items in Fasta stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
11 | Get required Busco stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
12 | Get Busco version | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
13 | Get Busco dependencies | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
14 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
15 | Cut | Cut1 |
16 | Filter out unneeded lines from fasta stats | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
17 | Rename some items and add in delimiters for later | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
18 | Reformat some text | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
19 | Cut | Cut1 |
20 | Extract assembly size | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
21 | Extract number of contigs | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
22 | Extract Contig N and L 50s and 90s | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
23 | Extract longest contig | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
24 | Extract GC content | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
25 | Convert commas to tabs | Convert characters1 |
26 | Collate Busco info | cat1 |
27 | Paste | Paste1 |
28 | Add blank header | toolshed.g2.bx.psu.edu/repos/bgruening/add_line_to_file/add_line_to_file/0.1.0 |
29 | Transpose cols to rows | toolshed.g2.bx.psu.edu/repos/iuc/datamash_transpose/datamash_transpose/1.8+galaxy0 |
30 | Convert to table | Convert characters1 |
31 | Compute coverage, total reads length divided by assembly length | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
32 | Convert underscores to tabs | Convert characters1 |
33 | Keep two columns | Cut1 |
34 | Round the percentage to 2 decimal places | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
35 | Label the column | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_column/1.1.3 |
36 | Join info into one table | cat1 |
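Two of the metrics collected in the steps above have simple definitions that can be sketched in a few lines: the contig N50/L50 and N90/L90 values (step 22) and the read coverage, described in step 31 as total reads length divided by assembly length. The sketch below uses the conventional definitions and is not the code run by Fasta Statistics or the column tools; the function names `n_and_l` and `coverage` are hypothetical, and rounding coverage to two decimal places is an assumption based on the rounding described in step 34.

```python
def n_and_l(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the cumulative sum of
    contigs, sorted longest first, reaches `fraction` of the assembly
    size; Lx is how many contigs were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("no contigs given")


def coverage(total_read_bases, assembly_length, digits=2):
    """Total read length divided by assembly length, rounded."""
    return round(total_read_bases / assembly_length, digits)


if __name__ == "__main__":
    contigs = [100, 80, 60, 40, 20]  # toy assembly, 300 bases in total
    print("N50, L50:", n_and_l(contigs, 0.5))
    print("N90, L90:", n_and_l(contigs, 0.9))
    print("coverage:", coverage(3000, sum(contigs)))
```

For the toy assembly, the two longest contigs already hold at least half of the 300 bases, so N50 is 80 and L50 is 2.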
Outputs
ID | Name | Description | Type |
---|---|---|---|
Busco and dependencies version | #main/Busco and dependencies version | n/a | |
Busco on input dataset(s): full table | #main/Busco on input dataset(s): full table | n/a | |
Fasta Statistics on input dataset(s): summary stats | #main/Fasta Statistics on input dataset(s): summary stats | n/a | |
Genome assembly metrics | #main/Genome assembly metrics | n/a | |
Genome coverage | #main/Genome coverage | n/a | |
Merqury on input dataset(s): bed | #main/Merqury on input dataset(s): bed | n/a | |
Merqury on input dataset(s): png | #main/Merqury on input dataset(s): png | n/a | |
Merqury on input dataset(s): qv | #main/Merqury on input dataset(s): qv | n/a | |
Merqury on input dataset(s): size files | #main/Merqury on input dataset(s): size files | n/a | |
Merqury on input dataset(s): stats | #main/Merqury on input dataset(s): stats | n/a | |
Merqury on input dataset(s): wig | #main/Merqury on input dataset(s): wig | n/a | |
Meryl on input dataset(s): read-db.meryldb | #main/Meryl on input dataset(s): read-db.meryldb | n/a | |
Quast on input dataset(s): HTML report | #main/Quast on input dataset(s): HTML report | n/a | |
Quast on input dataset(s): PDF report | #main/Quast on input dataset(s): PDF report | n/a | |
Quast on input dataset(s): Log | #main/Quast on input dataset(s): Log | n/a | |
Quast on input dataset(s): tabular report | #main/Quast on input dataset(s): tabular report | n/a | |
_anonymous_output_1 | #main/_anonymous_output_1 | n/a | |
_anonymous_output_10 | #main/_anonymous_output_10 | n/a | |
_anonymous_output_11 | #main/_anonymous_output_11 | n/a | |
_anonymous_output_12 | #main/_anonymous_output_12 | n/a | |
_anonymous_output_13 | #main/_anonymous_output_13 | n/a | |
_anonymous_output_14 | #main/_anonymous_output_14 | n/a | |
_anonymous_output_15 | #main/_anonymous_output_15 | n/a | |
_anonymous_output_16 | #main/_anonymous_output_16 | n/a | |
_anonymous_output_17 | #main/_anonymous_output_17 | n/a | |
_anonymous_output_18 | #main/_anonymous_output_18 | n/a | |
_anonymous_output_19 | #main/_anonymous_output_19 | n/a | |
_anonymous_output_2 | #main/_anonymous_output_2 | n/a | |
_anonymous_output_20 | #main/_anonymous_output_20 | n/a | |
_anonymous_output_21 | #main/_anonymous_output_21 | n/a | |
_anonymous_output_22 | #main/_anonymous_output_22 | n/a | |
_anonymous_output_23 | #main/_anonymous_output_23 | n/a | |
_anonymous_output_24 | #main/_anonymous_output_24 | n/a | |
_anonymous_output_25 | #main/_anonymous_output_25 | n/a | |
_anonymous_output_26 | #main/_anonymous_output_26 | n/a | |
_anonymous_output_3 | #main/_anonymous_output_3 | n/a | |
_anonymous_output_4 | #main/_anonymous_output_4 | n/a | |
_anonymous_output_5 | #main/_anonymous_output_5 | n/a | |
_anonymous_output_6 | #main/_anonymous_output_6 | n/a | |
_anonymous_output_7 | #main/_anonymous_output_7 | n/a | |
_anonymous_output_8 | #main/_anonymous_output_8 | n/a | |
_anonymous_output_9 | #main/_anonymous_output_9 | n/a | |
out_file1 | #main/out_file1 | n/a | |
outfile | #main/outfile | n/a | |
Version History
- v2.0.5 (latest), created 6th Aug 2024 at 11:04 by Anna Syme. Frozen, commit fe2213b. Merge pull request #7 from AustralianBioCommons/supernord-workflow-name-fix: Update workflow name in ro-crate-metadata.json
- v2.0.2, created 16th Apr 2024 at 08:19 by Johan Gustafsson. Frozen, commit 4ad99a2. Update .lifemonitor.yaml
- v1.1.0, created 9th May 2023 at 01:59 by Johan Gustafsson. Frozen, commit 46d8253. Add missing raw data input
- v1.0.0 (earliest), created 7th Nov 2022 at 07:10 by Johan Gustafsson. Frozen, commit efaf002. Update links
Views: 5369 Downloads: 648 Runs: 6
Created: 7th Nov 2022 at 07:10
Last updated: 6th Aug 2024 at 11:04