Version 1

Workflow Type: Galaxy

This workflow is based on the official Covid19-Galaxy assembly workflow as available from . It has been adapted to suit the analysis of metagenomics sequencing data. Before being submitted to INSDC databases, these data need to be cleaned of contaminant reads, including reads of possible human origin.

The SARS-CoV-2 genome is assembled using both the Unicycler and the SPAdes assemblers, as in the original workflow.

To facilitate the deposition of raw sequencing reads in INSDC databases, different fastq files are saved at different steps of the workflow. These files reflect different levels of stringency/filtration:

(1) Initially, the fastq files are filtered to remove human reads.
(2) Subsequently, a similarity search is performed against the reference assembly of the SARS-CoV-2 genome, to retain only SARS-CoV-2-like reads.
(3) Finally, the SARS-CoV-2 reads are assembled, and the bowtie2 program is used to identify (and save in the corresponding fastq files) only the reads that are completely identical to the final assembly of the genome.
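The selection in step (3) can be sketched on bowtie2's SAM output. This is a minimal illustration, not the filter configured in the workflow itself: it assumes that "completely identical" means an end-to-end alignment with no clipping or indels (a CIGAR consisting of a single M run) and an edit distance of zero (the NM:i:0 optional field). The function names are hypothetical.

```python
import re

# A CIGAR made of a single match run, e.g. "150M", has no indels or clipping.
# "M" can still hide mismatches, so the NM tag is checked as well.
_FULL_MATCH_CIGAR = re.compile(r"^\d+M$")


def is_identical_alignment(sam_line: str) -> bool:
    """Return True if a SAM record aligns end to end with zero edits.

    Assumption: identity is defined as an unclipped, gap-free alignment
    whose NM (edit distance) tag is 0.
    """
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:  # SAM records have 11 mandatory fields
        return False
    flag = int(fields[1])
    cigar = fields[5]
    if flag & 0x4:  # 0x4 marks an unmapped read
        return False
    if not _FULL_MATCH_CIGAR.match(cigar):
        return False
    # Optional fields start at column 12; NM:i:<n> is the edit distance.
    return "NM:i:0" in fields[11:]


def identical_read_names(sam_lines):
    """Collect the names of reads that match the assembly exactly."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        if is_identical_alignment(line):
            names.add(line.split("\t", 1)[0])
    return names
```

The resulting set of names could then be used to pull the matching records out of the fastq files. For paired-end data both mates share a read name, so a stricter variant would require both mates to pass.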

Any of the fastq files produced in (1), (2) or (3) is suitable for submission to raw read repositories. The files filtered according to (1) are the richest and contain the most data, including, for example, genomic sequences of different microbes living in the oral cavity. The files filtered according to (3) contain only the reads that are completely identical to the final assembly, which should guarantee that any re-analysis or re-assembly of these reads always produces consistent and identical results. The files obtained at (2) include all the reads in the sequencing reaction that had some degree of similarity with the reference SARS-CoV-2 genome. These may include subgenomic RNAs, but also polymorphic regions or variants in the case of a coinfection by multiple SARS-CoV-2 strains. Consequently, re-analysis of these data is not guaranteed to produce identical and consistent results, since the outcome depends on the parameters used during the assembly. However, these data contain more information than the files from (3).
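Before choosing which set of files to submit, it can be useful to confirm that the three sets relate to each other as expected. The sketch below assumes that each filtering level draws its reads from the previous one, so that the read identifiers in (3) are a subset of those in (2), which are in turn a subset of those in (1). The text describes increasing stringency but does not state this nesting explicitly, and the function names are hypothetical.

```python
def fastq_read_ids(fastq_text: str) -> set:
    """Collect read identifiers from FASTQ text made of 4-line records."""
    lines = fastq_text.splitlines()
    ids = set()
    for i in range(0, len(lines) - 3, 4):
        header = lines[i]
        if not header.startswith("@"):
            raise ValueError(f"line {i + 1} is not a FASTQ header: {header!r}")
        parts = header[1:].split()
        if not parts:
            raise ValueError(f"line {i + 1} has an empty read identifier")
        # The identifier is the text after '@' up to the first whitespace.
        ids.add(parts[0])
    return ids


def tiers_are_nested(tier1_ids: set, tier2_ids: set, tier3_ids: set) -> bool:
    """Check the assumed nesting: (3) within (2) within (1)."""
    return tier3_ids <= tier2_ids <= tier1_ids
```

A failed check would not necessarily indicate an error in the workflow; it would only mean that the nesting assumption does not hold for that run.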

Please feel free to comment, ask questions, or add suggestions.


Inputs

ID             Name           Description  Type
Forward reads  Forward reads  n/a          File
Reverse read   Reverse read   n/a          File


Steps

ID  Name                              Description
2   Filter_human_reads
3   fastp
4   Any_SARS-CoV-2_reads
5   Create assemblies with Unicycler
6   SPAdes
7   Keep_identical_reads_Unicycler
8   Keep_identical_reads_SPAdes


Outputs

ID                    Name                  Description  Type
_anonymous_output_3   _anonymous_output_3   n/a          File
_anonymous_output_4   _anonymous_output_4   n/a          File
_anonymous_output_5   _anonymous_output_5   n/a          File
_anonymous_output_6   _anonymous_output_6   n/a          File
_anonymous_output_7   _anonymous_output_7   n/a          File
_anonymous_output_8   _anonymous_output_8   n/a          File
_anonymous_output_9   _anonymous_output_9   n/a          File
_anonymous_output_10  _anonymous_output_10  n/a          File
_anonymous_output_11  _anonymous_output_11  n/a          File
_anonymous_output_12  _anonymous_output_12  n/a          File
_anonymous_output_13  _anonymous_output_13  n/a          File
_anonymous_output_14  _anonymous_output_14  n/a          File
_anonymous_output_15  _anonymous_output_15  n/a          File
_anonymous_output_16  _anonymous_output_16  n/a          File
_anonymous_output_17  _anonymous_output_17  n/a          File
_anonymous_output_18  _anonymous_output_18  n/a          File
_anonymous_output_19  _anonymous_output_19  n/a          File
_anonymous_output_20  _anonymous_output_20  n/a          File

Version History

Version 1 (earliest) Created 4th Nov 2020 at 18:35 by Matteo Chiara

Added/updated 2 files

Created: 4th Nov 2020 at 18:35

Last updated: 5th Nov 2020 at 07:42
