Post-assembly quality control workflow for genome assemblies, using Quast, BUSCO, Meryl, Merqury and Fasta Statistics.
Updates, November 2023. Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. New default settings: for BUSCO, lineage = eukaryota; for Quast, lineage = eukaryotes and genome = large. The workflow reports assembly statistics in a table called metrics.tsv, including selected metrics from Fasta Statistics and read coverage; reports the BUSCO version and dependencies; and displays these tables in the workflow report.
Note: a known bug is that the workflow report text sometimes resets to the default text. To restore it, find an earlier workflow version with the correct report text, then copy and paste that text into the current version.
Inputs
ID | Name | Description | Type |
---|---|---|---|
FASTA contigs - Primary Assembly | #main/FASTA contigs - Primary Assembly | n/a | |
Raw reads | #main/Raw reads | n/a | |
Steps
ID | Name | Description |
---|---|---|
2 | FASTQ to FASTA | toolshed.g2.bx.psu.edu/repos/devteam/fastqtofasta/fastq_to_fasta_python/1.1.5 |
3 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6 |
4 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0 |
5 | Quast | toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy1 |
6 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.4.6+galaxy0 |
7 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0 |
8 | Merqury | toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3 |
9 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
10 | Relabel some items in Fasta stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
11 | Get required Busco stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
12 | Get Busco version | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
13 | Get Busco dependencies | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
14 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
15 | Cut | Cut1 |
16 | Filter out unneeded lines from fasta stats | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
17 | Rename some items and add in delimiters for later | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
18 | Reformat some text | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
19 | Cut | Cut1 |
20 | Extract assembly size | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
21 | Extract number of contigs | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
22 | Extract Contig N and L 50s and 90s | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
23 | Extract longest contig | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
24 | Extract GC content | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
25 | Convert commas to tabs | Convert characters1 |
26 | Collate Busco info | cat1 |
27 | Paste | Paste1 |
28 | Add blank header | toolshed.g2.bx.psu.edu/repos/bgruening/add_line_to_file/add_line_to_file/0.1.0 |
29 | Transpose cols to rows | toolshed.g2.bx.psu.edu/repos/iuc/datamash_transpose/datamash_transpose/1.8+galaxy0 |
30 | Convert to table | Convert characters1 |
31 | Compute coverage, total reads length divided by assembly length | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
32 | Convert underscores to tabs | Convert characters1 |
33 | Keep two columns | Cut1 |
34 | Round the percentage to 2 decimal places | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
35 | Label the column | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_column/1.1.3 |
36 | Join info into one table | cat1 |
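Two of the metrics extracted in the steps above can be sketched in a few lines of Python. This is only an illustration, not the workflow's own implementation: in the workflow these values come from the Fasta Statistics tool and a column-arithmetic step. The function names `nx_lx` and `read_coverage` are hypothetical, and the sketch assumes the usual definitions: N50 is the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the assembly length; L50 is the number of contigs needed to reach that point; and coverage is total read length divided by assembly length, as the name of step 31 states.

```python
from typing import Iterable, Tuple


def nx_lx(contig_lengths: Iterable[int], fraction: float = 0.5) -> Tuple[int, int]:
    """Return (Nx, Lx) for the given fraction (0.5 gives N50/L50, 0.9 gives N90/L90).

    Hypothetical helper, not part of the workflow.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    lengths = sorted(contig_lengths, reverse=True)  # longest contig first
    if not lengths:
        raise ValueError("no contigs given")
    threshold = sum(lengths) * fraction
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= threshold:
            # `length` is Nx, `count` is Lx
            return length, count
    # Floating-point guard: fall back to the shortest contig
    return lengths[-1], len(lengths)


def read_coverage(total_read_bases: int, assembly_length: int, ndigits: int = 2) -> float:
    """Total read length divided by assembly length, rounded to `ndigits` places."""
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    return round(total_read_bases / assembly_length, ndigits)


if __name__ == "__main__":
    toy_contigs = [100, 80, 60, 40, 20]  # made-up lengths, for illustration only
    print("N50, L50:", nx_lx(toy_contigs, 0.5))
    print("N90, L90:", nx_lx(toy_contigs, 0.9))
    print("coverage:", read_coverage(3_000_000, 100_000))
```

The contig lengths and read totals in the usage section are made-up toy values; real inputs would be the contig lengths of assembly.fasta and the summed length of the raw reads.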
Outputs
ID | Name | Description | Type |
---|---|---|---|
Busco and dependencies version | #main/Busco and dependencies version | n/a | |
Busco on input dataset(s): full table | #main/Busco on input dataset(s): full table | n/a | |
Fasta Statistics on input dataset(s): summary stats | #main/Fasta Statistics on input dataset(s): summary stats | n/a | |
Genome assembly metrics | #main/Genome assembly metrics | n/a | |
Genome coverage | #main/Genome coverage | n/a | |
Merqury on input dataset(s): bed | #main/Merqury on input dataset(s): bed | n/a | |
Merqury on input dataset(s): png | #main/Merqury on input dataset(s): png | n/a | |
Merqury on input dataset(s): qv | #main/Merqury on input dataset(s): qv | n/a | |
Merqury on input dataset(s): size files | #main/Merqury on input dataset(s): size files | n/a | |
Merqury on input dataset(s): stats | #main/Merqury on input dataset(s): stats | n/a | |
Merqury on input dataset(s): wig | #main/Merqury on input dataset(s): wig | n/a | |
Meryl on input dataset(s): read-db.meryldb | #main/Meryl on input dataset(s): read-db.meryldb | n/a | |
Quast on input dataset(s): HTML report | #main/Quast on input dataset(s): HTML report | n/a | |
Quast on input dataset(s): PDF report | #main/Quast on input dataset(s): PDF report | n/a | |
Quast on input dataset(s): Log | #main/Quast on input dataset(s): Log | n/a | |
Quast on input dataset(s): tabular report | #main/Quast on input dataset(s): tabular report | n/a | |
_anonymous_output_1 | #main/_anonymous_output_1 | n/a | |
_anonymous_output_10 | #main/_anonymous_output_10 | n/a | |
_anonymous_output_11 | #main/_anonymous_output_11 | n/a | |
_anonymous_output_12 | #main/_anonymous_output_12 | n/a | |
_anonymous_output_13 | #main/_anonymous_output_13 | n/a | |
_anonymous_output_14 | #main/_anonymous_output_14 | n/a | |
_anonymous_output_15 | #main/_anonymous_output_15 | n/a | |
_anonymous_output_16 | #main/_anonymous_output_16 | n/a | |
_anonymous_output_17 | #main/_anonymous_output_17 | n/a | |
_anonymous_output_18 | #main/_anonymous_output_18 | n/a | |
_anonymous_output_19 | #main/_anonymous_output_19 | n/a | |
_anonymous_output_2 | #main/_anonymous_output_2 | n/a | |
_anonymous_output_20 | #main/_anonymous_output_20 | n/a | |
_anonymous_output_21 | #main/_anonymous_output_21 | n/a | |
_anonymous_output_22 | #main/_anonymous_output_22 | n/a | |
_anonymous_output_23 | #main/_anonymous_output_23 | n/a | |
_anonymous_output_24 | #main/_anonymous_output_24 | n/a | |
_anonymous_output_25 | #main/_anonymous_output_25 | n/a | |
_anonymous_output_26 | #main/_anonymous_output_26 | n/a | |
_anonymous_output_3 | #main/_anonymous_output_3 | n/a | |
_anonymous_output_4 | #main/_anonymous_output_4 | n/a | |
_anonymous_output_5 | #main/_anonymous_output_5 | n/a | |
_anonymous_output_6 | #main/_anonymous_output_6 | n/a | |
_anonymous_output_7 | #main/_anonymous_output_7 | n/a | |
_anonymous_output_8 | #main/_anonymous_output_8 | n/a | |
_anonymous_output_9 | #main/_anonymous_output_9 | n/a | |
out_file1 | #main/out_file1 | n/a | |
outfile | #main/outfile | n/a | |
Version History
Version 2 (latest): created 6th Aug 2024 at 10:57 by Johan Gustafsson. Added/updated 9 files. Open (master, 7ca9943).
Version 1 (earliest): created 13th Mar 2024 at 23:32 by Johan Gustafsson. Added/updated 9 files. Frozen (Version-1, 178e9ce).
Created: 13th Mar 2024 at 23:32