SEEK ID: https://workflowhub.eu/people/112
Joined: 12th Mar 2021
Related items
Biodiversity Genomics Europe, funded by Horizon Europe call HORIZON-CL6-2021-BIODIV-01-01, aims at aligning the resources and research agendas of both DNA barcoding and reference genome generation, thus opening the door for a true quantum leap in biodiversity genomics research in Europe.
Despite ground-breaking developments in both DNA barcoding and full genome sequencing, there remains a critical need to develop and strengthen functioning communities of practice ...
Teams: Vertebrate Genomes Pipelines in Galaxy, Biodiversity Genomics Europe (general)
Web page: https://biodiversitygenomics.eu/
The Vertebrate Genomes Pipelines in Galaxy are intended to allow a user to generate high-quality, near error-free assemblies of species from the user's own data or from the GenomeArk database.
Space: Biodiversity Genomics Europe (BGE)
Public web page: https://galaxyproject.org/projects/vgp/workflows/
IWC - Intergalactic Workflow Commission
Space: This Team is not associated with a Space
Public web page: https://github.com/galaxyproject/iwc
This workflow combines the XCMS R package (Smith, C.A. 2006), used for data extraction, with the metaMS R package (Wehrens, R. 2014) for untargeted metabolomics.
MMGBSA simulation and calculation
Scaffolding using HiC data with YAHS.
Purge contigs marked as duplicates by purge_dups (either haplotypic duplication or overlap duplication). This workflow is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).
VGP Workflow #1
This workflow produces a Meryl database and Genomescope outputs that are used to determine parameters for the following workflows and to assess the quality of genome assemblies. Specifically, it provides information about genomic complexity, such as genome size, level of heterozygosity and repeat content, as well as about data quality.
Inputs
- A collection of Hifi long reads in FASTQ format
- k-mer length
- Ploidy
Outputs
- Meryl Database of kmer counts
...
Create Meryl Database used for the estimation of assembly parameters and quality control with Merqury. Part of the VGP pipeline.
Assembly with Hifi reads and Trio Data
Generate phased assembly based on PacBio Hifi Reads using parental Illumina data for phasing
Inputs
- Hifi long reads [fastq]
- Concatenated Illumina reads : Paternal [fastq]
- Concatenated Illumina reads : Maternal [fastq]
- K-mer database [meryldb]
- Paternal hapmer database [meryldb]
- Maternal hapmer database [meryldb]
- Genome profile summary generated by Genomescope [txt]
- Bloom Filter
- Name of first haplotype
- Name of second haplotype ...
This workflow is built on the XCMS R package (Smith, C.A. 2006), which can extract, filter, align and fill gaps, with the option to annotate isotopes, adducts and fragments using the CAMERA R package (Kuhl, C. 2012).
Scaffolding with Bionano
Scaffolding using Bionano optical map data
Inputs
- Bionano data [cmap]
- Estimated genome size [txt]
- Phased assembly generated by Hifiasm [gfa1]
Outputs
- Scaffolds
- Non-scaffolded contigs
- QC: Assembly statistics
- QC: Nx plot
- QC: Size plot
Contigging Solo w/HiC:
Generate a phased assembly from PacBio HiFi reads, using HiC data from the same individual for phasing.
Inputs
- Hifi long reads [fastq]
- HiC forward reads (if multiple input files, concatenated in same order as reverse reads) [fastq]
- HiC reverse reads (if multiple input files, concatenated in same order as forward reads) [fastq]
- K-mer database [meryldb]
- Genome profile summary generated by Genomescope [txt]
- Name of first assembly
- Name of second assembly ...
This workflow takes as input an SRA_manifest from SRA Run Selector and generates one fastq file, or one pair of fastq files, for each experiment (concatenating multiple runs if necessary). Outputs are relabelled to match the column specified by the user.
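The grouping step described above can be sketched in Python. This is only an illustration, not the workflow's implementation: it assumes the manifest is a CSV with `Run` and `Experiment` columns, and the function name, the `Sample Name` column and the accessions in the example are made up.

```python
import csv
import io
from collections import defaultdict


def runs_per_experiment(manifest_text, label_column="Experiment"):
    """Group run accessions by experiment and key each group by a label.

    Assumption (not confirmed by the workflow): the manifest is CSV and
    contains 'Run' and 'Experiment' columns.
    """
    runs = defaultdict(list)
    labels = {}
    for row in csv.DictReader(io.StringIO(manifest_text)):
        experiment = row["Experiment"]
        runs[experiment].append(row["Run"])
        # The label comes from the column chosen by the user.
        labels[experiment] = row[label_column]
    return {labels[exp]: run_list for exp, run_list in runs.items()}


# Made-up manifest: two runs belong to the same experiment.
example = """Run,Experiment,Sample Name
SRR0000001,SRX0000001,liver
SRR0000002,SRX0000001,liver
SRR0000003,SRX0000002,brain
"""

grouped = runs_per_experiment(example, label_column="Sample Name")
```

Each value in `grouped` lists the runs whose reads would be concatenated into one output. Note that this sketch lets two experiments with the same label overwrite each other, so the chosen column should be unique per experiment.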
Run baredSC in 1 dimension in logNorm for 1 to N gaussians and combine models.
Automated inference of stable isotope incorporation rates in proteins for functional metaproteomics
We assume the identifiers of the input list are like: sample_name_replicateID. The identifiers of the output list will be: sample_name
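A minimal Python sketch of this naming convention, assuming the replicate ID is everything after the last underscore (the function names are hypothetical and not part of the workflow):

```python
from collections import defaultdict


def sample_name(identifier):
    """Strip the trailing '_replicateID' part of an identifier."""
    name, separator, _replicate = identifier.rpartition("_")
    # Identifiers without an underscore are returned unchanged.
    return name if separator else identifier


def group_replicates(identifiers):
    """Map each sample name to the input identifiers that share it."""
    groups = defaultdict(list)
    for identifier in identifiers:
        groups[sample_name(identifier)].append(identifier)
    return dict(groups)
```

With this assumption, sample names may themselves contain underscores: `liver_wt_1` and `liver_wt_2` both map to `liver_wt`.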
RepeatMasking Workflow
This workflow uses RepeatModeler and RepeatMasker for genome analysis.
- RepeatModeler is a software package for identifying and modeling de novo families of transposable elements (TEs). At the heart of RepeatModeler are three de novo repeat search programs (RECON, RepeatScout and LtrHarvest/Ltr_retriever) which use complementary computational methods to identify repeat element boundaries and family relationships from sequence data.
- RepeatMasker is a program that analyzes ...
Racon polish with long reads, x4
Downloads fastq files for sequencing run accessions provided in a text file using fasterq-dump. Creates one job per listed run accession.
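The "one job per accession" pattern can be sketched as follows. This only builds the command lines and does not run them; the function name and the default output directory are assumptions, and the workflow itself may pass different options to fasterq-dump.

```python
def fasterq_dump_jobs(accession_text, outdir="fastq"):
    """Build one fasterq-dump command per non-empty line of the text file."""
    jobs = []
    for line in accession_text.splitlines():
        accession = line.strip()
        if accession:  # skip blank lines
            jobs.append(["fasterq-dump", accession, "--outdir", outdir])
    return jobs
```

Each returned list could be passed to `subprocess.run` or submitted to a scheduler as an independent job.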
This workflow takes single-read (SR) ChIP-seq BAM files as input. It calls peaks on each replicate and intersects them. In parallel, each BAM is subset to the smallest number of reads, and peaks are called on both subsets combined. Only peaks from this combined call whose summits fall within the intersection of both replicates are kept.
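The final filtering rule can be illustrated with a small Python sketch. It is not the workflow's code: it assumes peaks are given as half-open `(chrom, start, end)` intervals, that each combined peak carries an absolute summit position, and all names and coordinates are made up.

```python
def intersect_intervals(peaks_a, peaks_b):
    """Return the regions covered by a peak in both replicates."""
    shared = []
    for chrom_a, start_a, end_a in peaks_a:
        for chrom_b, start_b, end_b in peaks_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # non-empty overlap
                shared.append((chrom_a, start, end))
    return shared


def keep_supported_peaks(combined_peaks, shared_regions):
    """Keep combined peaks whose summit lies inside a shared region."""
    kept = []
    for chrom, start, end, summit in combined_peaks:
        if any(c == chrom and s <= summit < e for c, s, e in shared_regions):
            kept.append((chrom, start, end, summit))
    return kept


# Made-up peaks for two replicates and for the combined subsets.
rep1 = [("chr1", 100, 200), ("chr1", 500, 600)]
rep2 = [("chr1", 150, 250), ("chr2", 500, 600)]
combined = [("chr1", 120, 220, 160), ("chr1", 480, 620, 550)]

shared = intersect_intervals(rep1, rep2)
kept = keep_supported_peaks(combined, shared)
```

Here only the first combined peak is kept, because its summit at position 160 lies in the region shared by both replicates, while the second peak is supported by one replicate only. The quadratic loop is chosen for clarity; real peak sets would typically be intersected with a dedicated interval tool.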