Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023. Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large. Reports assembly stats into a table called metrics.tsv, including selected metrics from Fasta Stats, and read coverage; reports BUSCO versions and dependencies; and displays these tables in the workflow ...
The workflow takes a trimmed HiFi reads collection, Hap1/Hap2 contigs, and the values for the transition parameter and max coverage depth (calculated from WF1) to run Purge_Dups. It produces purged Hap1 and Hap2 contig assemblies and runs the full QC analysis (gfastats, BUSCO, and Merqury).
The workflow takes a trimmed HiFi reads collection, Forward/Reverse HiC reads, and the max coverage depth (calculated from WF1) to run Hifiasm in HiC phasing mode. It produces both Pri/Alt and Hap1/Hap2 assemblies and runs the full QC analysis (gfastats, BUSCO, and Merqury). The default Hifiasm purge level is Light (l1).
The workflow takes a HiFi reads collection, runs FastQC and SeqKit, filters with Cutadapt, and creates a MultiQC report. The main outputs are a collection of filtered reads, a report with raw and filtered reads stats, and a table with raw reads stats.
The workflow takes a trimmed HiFi reads collection, runs Meryl to create a K-mer database, Genomescope2 to estimate genome properties, and Smudgeplot to estimate ploidy. The main results are a K-mer database and genome profiling plots, tables, and values useful for downstream analysis. Default K-mer length and ploidy for Genomescope are 21 and 2, respectively.
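To make the idea of a K-mer database and the histogram derived from it more concrete, here is a minimal pure-Python sketch. It is an illustration only and not how Meryl or Genomescope2 are implemented; the function names `canonical` and `kmer_histogram` are hypothetical, and collapsing each K-mer with its reverse complement is an assumption made for this example.

```python
from collections import Counter

# Translation table used to build the reverse complement of a DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_histogram(reads, k=21):
    """Count canonical k-mers across reads and summarise them as a histogram.

    Returns a dict mapping multiplicity -> number of distinct k-mers seen
    that many times. k defaults to 21, the default K-mer length named above.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" in kmer:  # skip windows containing ambiguous bases
                continue
            counts[canonical(kmer)] += 1
    # Histogram: how many distinct k-mers occur once, twice, and so on.
    return dict(sorted(Counter(counts.values()).items()))


if __name__ == "__main__":
    # Toy input with k=4 so the result is easy to check by hand.
    print(kmer_histogram(["ACGTACGT"], k=4))
```

A profile of this shape (multiplicity against number of distinct K-mers) is the kind of input that genome profiling tools fit a model to; real read sets need a dedicated counter because a Python dictionary does not scale to whole-genome data.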
Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023.
- Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. (To change format: click on the pencil icon next to the file in the Galaxy history, then "Datatypes", then set "New type" as fastqsanger.gz).
- New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large.
- Reports assembly stats into a table called metrics.tsv, ...
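The datatype change described in the inputs bullet can also be scripted instead of done through the pencil icon. The sketch below assumes a BioBlend `GalaxyInstance`-style client whose `histories.update_dataset` call accepts a `datatype` keyword; the helper name `set_reads_datatype`, the server URL, and the IDs are placeholders for illustration, not values from this workflow.

```python
def set_reads_datatype(gi, history_id, dataset_id, datatype="fastqsanger.gz"):
    """Ask Galaxy to relabel one history dataset with the given datatype.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance (or any object
    exposing the same histories.update_dataset method). Returns whatever
    the client returns for the update request.
    """
    return gi.histories.update_dataset(history_id, dataset_id, datatype=datatype)


# Example usage (placeholders; requires the bioblend package and an API key):
# from bioblend.galaxy import GalaxyInstance
# gi = GalaxyInstance(url="https://galaxy.example.org", key="YOUR_API_KEY")
# set_reads_datatype(gi, "HISTORY_ID", "DATASET_ID")
```

Galaxy may reject the change if the dataset is still in use by a running job, so it is safest to relabel the reads before launching the workflow.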
General recommendations for using Purge-duplicates-from-hifiasm-assembly
Please see the Genome assembly with hifiasm on Galaxy Australia guide.
The workflow & the doc_guidelines template used are supported by the Australian BioCommons via Bioplatforms Australia funding, the Australian ...
PacBio HiFi genome assembly using hifiasm v2.1
General usage recommendations
Please see the Genome assembly with hifiasm on Galaxy Australia guide.
See change log
The workflow & the doc_guidelines template used are supported by the Australian BioCommons via Bioplatforms Australia funding, the Australian ...
Collection of de-novo genome assembly workflows written for implementation in Galaxy
Input data should be PacBio HiFi reads and Illumina 3-dimensional Chromatin Conformation Capture (HiC) reads
Executing all workflows will output two scaffolded haplotype assemblies and the complete QC analyses
Please run the workflows in order: WF0 (there are two, one for HiFi and one for Illumina HiC), WF1, WF2, WF3, WF4
Maintainers: Tom Brown, Diego De Panis
Number of items: 6
Tags: Assembly, Bioinformatics, Galaxy, Genomics, Genome assembly, HiFi, Hi-C