Workflows
This workflow takes as input an SRA_manifest from SRA Run Selector and generates one fastq file, or one pair of fastq files, for each experiment (concatenating multiple runs if necessary). Outputs are relabelled to match the column specified by the user.
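The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the workflow's actual implementation: the column names `Run`, `Experiment` and `Sample Name`, the accession values, and the helper `group_runs_by_label` are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def group_runs_by_label(manifest_text, label_column,
                        run_column="Run", experiment_column="Experiment"):
    """Group run accessions by experiment and key each group by a user-chosen label.

    Column names are assumptions for this sketch; adjust them to the manifest in use.
    """
    reader = csv.DictReader(io.StringIO(manifest_text))
    runs_per_experiment = defaultdict(list)
    label_per_experiment = {}
    for row in reader:
        experiment = row[experiment_column]
        runs_per_experiment[experiment].append(row[run_column])
        # Keep the first label seen for each experiment.
        label_per_experiment.setdefault(experiment, row[label_column])
    # Note: experiments sharing the same label would overwrite each other here.
    return {label_per_experiment[exp]: runs
            for exp, runs in runs_per_experiment.items()}


# Hypothetical manifest content, made up for the example.
manifest = """Run,Experiment,Sample Name
SRR0000001,SRX0000001,sampleA
SRR0000002,SRX0000001,sampleA
SRR0000003,SRX0000002,sampleB
"""

plan = group_runs_by_label(manifest, "Sample Name")
for label, runs in plan.items():
    # Each output would be the concatenation of the listed runs.
    print(f"{label}.fastq <- {' + '.join(runs)}")
```

Each key of `plan` stands for one relabelled output, and each value lists the runs that would be concatenated into it.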
A demonstration workflow for Reduced Order Modeling (ROM) within the eFlows4HPC project, implemented using Kratos Multiphysics, EZyRB, COMPSs, and dislib.
Type: COMPSs
Creators: Jose Raul Bravo Martinez, Sebastian Ares de Parga Regalado, Riccardo Rossi Bernecoli, Jorge Ejarque
Submitter: Raül Sirvent
rquest-omop-worker-workflows
Source for workflow definitions for the open source RQuest OMOP Worker tool developed for Hutch/TRE-FX
Note: ARM workflows are currently broken. x86 ones work.
Inputs
Body
Sample input payload:
{
  "task_id": "job-2023-01-13-14: 20: 38-",
  "project": "",
  "owner": "",
  "cohort": {
    "groups": [
      {
        "rules": [
          {
            "varname": "OMOP",
            "varcat": "Person",
            "type": "TEXT",
            "oper": "=",
            "value": "8507"
          }
        ],
        "rules_oper": "AND"
      }
    ],
    "groups_oper": "OR"
  },
  "collection":
...
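To make the nesting of the sample payload easier to read, the following Python sketch walks the `cohort` part shown above and renders it as a single expression. It only covers the fields visible in the truncated sample and says nothing about how the RQuest OMOP Worker itself interprets them; the helper `describe_cohort` is hypothetical.

```python
import json

# Only the "cohort" part of the sample payload; the remaining fields are omitted.
payload_fragment = json.loads("""
{
  "cohort": {
    "groups": [
      {
        "rules": [
          {"varname": "OMOP", "varcat": "Person", "type": "TEXT",
           "oper": "=", "value": "8507"}
        ],
        "rules_oper": "AND"
      }
    ],
    "groups_oper": "OR"
  }
}
""")


def describe_cohort(cohort):
    """Render groups and rules as a readable expression (illustrative helper)."""
    group_texts = []
    for group in cohort["groups"]:
        rule_texts = [
            f'{rule["varcat"]}.{rule["varname"]} {rule["oper"]} {rule["value"]}'
            for rule in group["rules"]
        ]
        # Rules inside a group are joined with the group's own operator.
        group_texts.append("(" + f' {group["rules_oper"]} '.join(rule_texts) + ")")
    # Groups are joined with the cohort-level operator.
    return f' {cohort["groups_oper"]} '.join(group_texts)


summary = describe_cohort(payload_fragment["cohort"])
print(summary)
```

With more than one rule or group, the same function would join rules with `rules_oper` and groups with `groups_oper`.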
Type: Common Workflow Language
Creator: Vasiliki Panagi
Submitters: Stian Soiland-Reyes, Vasiliki Panagi, Jon Couldridge
Summary
The data preparation pipeline contains tasks for two distinct scenarios: leukaemia, with microarray data for 119 patients, and ovarian cancer, with next generation sequencing data for 380 patients.
The disease outcome prediction pipeline offers two strategies for this task:
Graph kernel method: It starts by generating personalized networks for ...
Summary
This pipeline contains the following functions: (1) data processing to handle the transformations needed to obtain the original pathway scores of the samples according to single-sample GSEA analysis; (2) model training based on the disease and healthy sample pathway scores, to classify them; (3) scoring matrix weights optimization according to a gold standard list of drugs (those that went on clinical trials or are approved for the disease). It tests the weights in a range of 0 to 30 (you ...
Summary
The PPI information aggregation pipeline starts by retrieving all the datasets in the GEO database whose material was generated using expression profiling by high throughput sequencing. From each dataset identifier, it extracts the supplementary files that contain the counts table. Once the download step finishes, it identifies the files that were already normalized and those that hold raw counts still to be normalized. It also identifies and maps the gene ids to UniProt (the ids found usually ...
Summary
The main goal of this pipeline is to provide a tool for formalizing and standardizing protein-protein interaction (PPI) prediction data using the OntoPPI ontology. The pipeline is split into two parts: (i) a part to prepare data from three main sources of PPI data (HINT, STRING and PredPrin) and create the standard files to be processed ...
Summary
The proposed validation process has two pipelines for filtering PPIs predicted by an in silico detection method. Both pipelines can be executed separately. The first pipeline (i) filters according to association rules of cellular locations extracted from the HINT database. The second pipeline (ii) filters according to scientific papers in which both proteins of a PPI appear in an interaction context within the same sentences.
Pipeline (i) starts by extracting cellular component annotations from ...
Summary
PredPrIn is a scientific workflow to predict Protein-Protein Interactions (PPIs) using machine learning to combine multiple PPI detection methods according to three categories: structural, based on primary amino acid sequence, and based on functional annotations.
PredPrIn contains three main steps: (i) acquisition and treatment of protein information, (ii) feature generation, and (iii) classification and analysis.
(i) The first step builds a knowledge base with the available annotations ...
Run baredSC in 1 dimension in logNorm for 1 to N gaussians and combine models.