Workflows
Contigging Solo w/HiC:
Generate a phased assembly from PacBio HiFi reads, using HiC data from the same individual for phasing.
Inputs
- HiFi long reads [fastq]
- HiC forward reads (if multiple input files, concatenated in same order as reverse reads) [fastq]
- HiC reverse reads (if multiple input files, concatenated in same order as forward reads) [fastq]
- K-mer database [meryldb]
- Genome profile summary generated by GenomeScope [txt]
- Name of first assembly
- Name of second ...
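The requirement that forward and reverse HiC files be concatenated in the same order can be illustrated with a minimal sketch. It is not part of the workflow itself; the function names and the idea of passing explicit (forward, reverse) pairs are assumptions made for illustration.

```python
import shutil


def concatenate_in_order(parts, destination):
    """Append each file in `parts` to `destination`, preserving list order."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def concatenate_hic_pairs(pairs, forward_out, reverse_out):
    """Concatenate paired HiC files so both outputs share one file order.

    `pairs` is a list of (forward_path, reverse_path) tuples (hypothetical
    layout). Driving both outputs from the same list keeps the i-th forward
    file aligned with the i-th reverse file.
    """
    concatenate_in_order([fwd for fwd, _ in pairs], forward_out)
    concatenate_in_order([rev for _, rev in pairs], reverse_out)
```

Building both outputs from a single list of pairs, rather than from two independently sorted lists, removes the chance that the two orders drift apart.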
Name: Matrix multiplication with Files, reproducibility example, without data persistence
Contact Person: support-compss@bsc.es
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Description
Matrix multiplication is a binary operation that takes a pair of matrices and produces another matrix.
If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number ...
Name: Matrix multiplication with Files, reproducibility example
Contact Person: support-compss@bsc.es
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Description
Matrix multiplication is a binary operation that takes a pair of matrices and produces another matrix.
If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number of rows m in B. When multiplying ...
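The dimension rule described above can be sketched in plain Python. This is only an illustration of the definition, not the COMPSs implementation; the function name `matmul` and the list-of-rows representation are assumptions.

```python
def matmul(a, b):
    """Multiply an n×m matrix by an m×p matrix, both given as lists of rows."""
    n, m = len(a), len(a[0])
    if len(b) != m:
        # The product is defined only when columns of A equal rows of B.
        raise ValueError(
            f"cannot multiply: A has {m} columns but B has {len(b)} rows"
        )
    p = len(b[0])
    # Entry (i, j) is the dot product of row i of A and column j of B.
    return [
        [sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
        for i in range(n)
    ]


A = [[1, 2, 3], [4, 5, 6]]        # 2×3
B = [[7, 8], [9, 10], [11, 12]]   # 3×2
C = matmul(A, B)                  # 2×2 result: [[58, 64], [139, 154]]
```

Multiplying in the other order, `matmul(B, A)`, is also defined here (3×2 times 2×3) but yields a 3×3 matrix, while `matmul(A, A)` raises `ValueError` because A has 3 columns and only 2 rows.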
Purge contigs marked as duplicates by purge_dups (either haplotypic duplication or overlap duplication). This workflow is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).
ProGFASTAGen
The ProGFASTAGen (Protein-Graph-FASTA-Generator or ProtGraph-FASTA-Generator) repository contains workflows to generate so-called precursor-specific-FASTAs (using the precursors from MGF-files) including feature-peptides, like VARIANTs or CONFLICTs if desired, or global-FASTAs (as described in ProtGraph). The single workflow scripts have been implemented with Nextflow-DSL-2 ...
Parabricks-Genomics-nf is a GPU-enabled pipeline for alignment and germline short variant calling for short read sequencing data. The pipeline utilises NVIDIA's Clara Parabricks toolkit to dramatically speed up the execution of best practice bioinformatics tools. Currently, this pipeline is configured specifically for NCI's Gadi HPC.
NVIDIA's Clara Parabricks can deliver a significant ...
Library curation BOLD
This repository contains scripts and synonymy data for pipelining the automated curation of BOLD data dumps in BCDM TSV ...
Type: Snakemake
Creators: Rutger Vos, Fabian Deister, Ben Price, Special thanks to Sujeevan Ratnasingham and the team at CBG for the creation of the BCDM data exchange format that this pipeline operates on
Submitter: Rutger Vos
Scaffolding using HiC data with YAHS.
The input to this workflow is a gene expression data matrix collected from a pediatric tumor patient from the KidsFirst Common Fund program [1]. The RNA-seq samples are the columns of the matrix, and the rows are the raw gene expression counts for all human coding genes (Table 1). This data matrix is fed into TargetRanger [2] to screen for targets that are highly expressed in the tumor but lowly expressed across most healthy human tissues, based on gene expression data collected ...