The SARS-CoV-2 Data Hubs form one of the three components of the European COVID-19 Data Platform. Using technology that builds upon existing EMBL-EBI infrastructure, we provide SARS-CoV-2 Data Hubs to those public health agencies and other scientific groups responsible for generating viral sequence data from the outbreak at national or regional levels. SARS-CoV-2 Data Hubs offer a variety of configurations in terms of upload tools, data processing and analysis workflows and data visualisations.
This workflow maps SARS-CoV-2 whole-genome amplicon nanopore reads and generates a consensus sequence; it is implemented in the Nextflow framework. Cutadapt first trims the amplicon primers by removing a fixed length from both ends of each amplicon, and the trimmed reads are then mapped to a reference genome with Minimap2. The consensus is called with Pysam from the Minimap2 alignment, using a majority read-support threshold at each position; positions with less than 30x coverage are masked with ‘N’.
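The consensus rule described above can be illustrated with a minimal sketch in plain Python. It is not the workflow's own code: the function name `call_consensus`, the input format (one string of observed bases per reference position) and the reading of "majority" as strictly more than half of the reads are assumptions made for illustration. Only the 30x masking threshold is taken from the description. Emitting ‘N’ when no base reaches a majority is likewise an assumption.

```python
from collections import Counter

MIN_COVERAGE = 30        # from the description: positions below 30x are masked
MAJORITY_FRACTION = 0.5  # assumption: "majority" means strictly more than half


def call_consensus(pileup_columns, min_coverage=MIN_COVERAGE,
                   majority_fraction=MAJORITY_FRACTION):
    """Return a consensus string from per-position base observations.

    pileup_columns: iterable of strings, one per reference position,
    each holding the bases observed in the reads covering that position.
    """
    consensus = []
    for bases in pileup_columns:
        counts = Counter(bases)
        depth = sum(counts.values())
        if depth < min_coverage:
            # Too few reads to trust the call: mask the position.
            consensus.append("N")
            continue
        base, support = counts.most_common(1)[0]
        if support / depth > majority_fraction:
            consensus.append(base)
        else:
            # Assumption: no majority base is also reported as 'N'.
            consensus.append("N")
    return "".join(consensus)


if __name__ == "__main__":
    columns = [
        "A" * 40,             # 40x, unanimous
        "A" * 10,             # 10x, below the coverage threshold
        "C" * 25 + "T" * 15,  # 40x, C supported by 62.5% of reads
        "G" * 20 + "T" * 20,  # 40x, tie, so no majority
    ]
    print(call_consensus(columns))
```

In a real run the per-position base counts would come from the Minimap2 alignment rather than from hand-written strings; the sketch only shows how the coverage mask and the majority rule interact.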