Workflows
The workflow starts with a gene set created from Example gene set. CTD is applied, which diffuses through all nodes in STRING[1] to identify nodes that are "guilty by association" and highly connected to the initial gene set of interest[2][3]. Lists of Highly Connected Genes and Guilty By Association Genes were obtained from the CTD output.
- Szklarczyk, D. et al. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic ...
Type: Playbook Workflow Builder Workflow
Creator: Playbook Partnership NIH CFDE
Submitter: Daniel Clarke
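The "guilty by association" idea in the description above can be illustrated with a generic network diffusion. The sketch below is a plain random walk with restart on a toy graph; it is not the CTD algorithm itself, and the node names, the `restart` value, and the function name are assumptions made only for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, iterations=100):
    """Diffuse probability mass from seed nodes over an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds: nodes of the initial gene set (hypothetical names in the demo).
    Returns a dict of node -> steady-state visiting probability.
    """
    nodes = sorted(adjacency)
    seed_set = set(seeds)
    start = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed node.
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Isolated node: return its mass to the seeds.
                for s in seed_set:
                    updated[s] += (1 - restart) * scores[n] / len(seed_set)
                continue
            share = (1 - restart) * scores[n] / len(neighbours)
            for m in neighbours:
                updated[m] += share
        scores = updated
    return scores


# Toy network: A, B, C form a triangle; D and E hang off C as a chain.
TOY_GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
toy_scores = random_walk_with_restart(TOY_GRAPH, seeds=["A", "B"])
ranked_non_seeds = sorted(
    (n for n in TOY_GRAPH if n not in {"A", "B"}),
    key=toy_scores.get,
    reverse=True,
)
print(ranked_non_seeds)
```

Nodes that are not in the seed set but sit close to it accumulate more mass than distant ones, which is the intuition behind ranking candidates by their connectivity to the gene set of interest.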
The workflow starts with selecting chr2:g.39417578C>G as the search term. The closest gene to the variant was found using MyVariant.info[1]. Gene expression in tumors for CDKL4 was queried from the Open Pediatric Cancer Atlas API[3]. Median expression of CDKL4 was obtained from the GTEx Portal[4] using the portal's API. To visualize expression levels across tumors, a bar plot was created (Fig.).
- Lelong, S. et al. BioThings SDK: a toolkit for building high-performance ...
Type: Playbook Workflow Builder Workflow
Creator: Playbook Partnership NIH CFDE
Submitter: Daniel Clarke
The workflow starts with selecting KLF6 as the search term. RNA-seq-like LINCS L1000 Signatures[1] that mimic or reverse the expression of KLF6 were visualized. Median expression of KLF6 was obtained from the GTEx Portal[6] using the portal's API. To visualize the scored tissues, a vertical bar plot was created (Fig.).
- Evangelista, J. E. et al. SigCom LINCS: data and metadata search engine for a million gene expression signatures. Nucleic Acids Research vol. 50 W697–W709 (2022). ...
Type: Playbook Workflow Builder Workflow
Creator: Playbook Partnership NIH CFDE
Submitter: Daniel Clarke
The workflow starts with selecting atrial fibrillation and Ibrutinib as the search terms. Gene sets with set labels containing atrial fibrillation were queried from Enrichr[1]. Identified matching terms from the MGI Mammalian Phenotype Level 4 2021[2] library were assembled into a collection of gene sets. A GMT was extracted from the Enrichr results for MGI_Mammalian_Phenotype_Level_4_2021. A consensus gene set was created by only retaining genes ...
Type: Playbook Workflow Builder Workflow
Creator: Playbook Partnership NIH CFDE
Submitter: Daniel Clarke
The workflow starts with selecting Inflammation, Penicillin, and Cortisol as the search terms. Gene sets with set labels containing Inflammation were queried from Enrichr[1]. Identified matching terms from the GWAS Catalog 2019[2] library were assembled into a collection of gene sets. A GMT was extracted from the Enrichr results for GWAS_Catalog_2019. All the identified gene sets were combined ...
Type: Playbook Workflow Builder Workflow
Creator: Playbook Partnership NIH CFDE
Submitter: Daniel Clarke
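The two Enrichr-based workflows above extract a GMT and then either combine the gene sets or keep a consensus. As a minimal sketch of those steps, the code below parses tab-separated GMT text (term, description, then genes) and builds a union and a count-based consensus. The term and gene names are made up, and the `min_sets` threshold is an assumption: the descriptions above are truncated and do not state the actual consensus rule.

```python
from collections import Counter


def parse_gmt(text):
    """Parse GMT text into a dict of term -> set of genes.

    Each line is: term <TAB> description <TAB> gene1 <TAB> gene2 ...
    The description field may be empty.
    """
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        term = fields[0]
        genes = {g.strip() for g in fields[2:] if g.strip()}
        gene_sets[term] = genes
    return gene_sets


def union_gene_set(gene_sets):
    """Combine all gene sets into one set."""
    return set().union(*gene_sets.values())


def consensus_gene_set(gene_sets, min_sets=2):
    """Keep genes present in at least `min_sets` sets (hypothetical rule)."""
    counts = Counter(g for genes in gene_sets.values() for g in genes)
    return {g for g, c in counts.items() if c >= min_sets}


# Hypothetical GMT content with placeholder term and gene names.
EXAMPLE_GMT = (
    "term_a\t\tGENE1\tGENE2\tGENE3\n"
    "term_b\t\tGENE2\tGENE3\tGENE4\n"
)
example_sets = parse_gmt(EXAMPLE_GMT)
print(sorted(union_gene_set(example_sets)))
print(sorted(consensus_gene_set(example_sets, min_sets=2)))
```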
Installation
Other than cloning this repository, you need to have bash installed (which is most likely the case if you use Linux, *BSD or even macOS). For the Python code, arguably the easiest and cleanest way is to set up a Python virtual environment and install the dependencies there:
$ python3 -m venv ./hcp-suite-venv # Set up the virtual environment
$ source ./hcp-suite-venv/bin/activate # Activate the virtual environment
$ pip install pandas pingouin networkx nilearn nibabel ray
...
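After installing into the virtual environment, one way to confirm the dependencies are visible to the interpreter is a short import check. This is only a sketch: the helper name `missing_packages` is an assumption and not part of the repository, and the package list is copied from the `pip install` line above.

```python
import importlib.util

# Packages named in the pip install command above.
REQUIRED = ["pandas", "pingouin", "networkx", "nilearn", "nibabel", "ray"]


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the virtual environment activated so that the check uses the environment's interpreter rather than the system one.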
Workflow (hybrid) metagenomic assembly and binning
- Workflow Illumina Quality: https://workflowhub.eu/workflows/336?version=1
- FastQC (control)
- fastp (quality trimming)
- kraken2 (taxonomy)
- bbmap contamination filter
- Workflow Longread Quality:
- NanoPlot (control)
- filtlong (quality trimming)
- kraken2 (taxonomy)
- minimap2 contamination filter
- Kraken2 taxonomic classification of FASTQ reads
- SPAdes/Flye (Assembly)
- Pilon/Medaka/PyPolCA (Assembly polishing)
- QUAST (Assembly ...
Type: Common Workflow Language
Creators: Bart Nijsse, Jasper Koehorst, Changlin Ke
Submitter: Bart Nijsse
ONTViSc (ONT-based Viral Screening for Biosecurity)
Introduction
eresearchqut/ontvisc is a Nextflow-based bioinformatics pipeline designed to help diagnose virus and viroid pathogens for biosecurity. It takes FASTQ files generated from either amplicon or whole-genome sequencing using Oxford Nanopore Technologies as input.
The pipeline can either: 1) perform a direct search on the sequenced reads, 2) generate clusters, 3) assemble the reads to generate longer contigs or 4) directly ...
Type: Nextflow
Creators: Marie-Emilie Gauthier, Craig Windell, Magdalena Antczak, Roberto Barrero
Submitter: Magdalena Antczak