Use Case 13: Novel Cell Surface Targets for Individual Cancer Patients Analyzed with Common Fund Datasets
Version 1

Workflow Type: Common Workflow Language
Work-in-progress

The input to this workflow is a gene expression data matrix collected from the tumor of a pediatric patient in the KidsFirst Common Fund program [1]. The columns of the matrix are the RNA-seq samples, and the rows hold the raw gene expression counts for all human coding genes (Table 1). This data matrix is fed into TargetRanger [2] to screen for targets that are highly expressed in the tumor but lowly expressed across most healthy human tissues, based on RNA-seq gene expression data collected from postmortem donors by the GTEx Common Fund program [3]. Based on this analysis, the gene IMP U3 small nucleolar ribonucleoprotein 3 (IMP3) was selected because it was the top candidate returned by TargetRanger (Tables 2-3). IMP3 is also commonly called insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3).

Next, we leverage knowledge from other Common Fund programs to examine functions and knowledge related to IMP3. First, we queried the LINCS L1000 data [4] from the LINCS program [5], converted into RNA-seq-like LINCS L1000 Signatures [6], using the SigCom LINCS API [7] to identify mimicker or reverser small molecules that maximally impact the expression of IMP3 in human cell lines (Fig. 1, Table 4). We also queried the LINCS L1000 data to identify single-gene CRISPR knockouts that down-regulate the expression of IMP3 (Fig. 1, Table 5). These potential drug targets were filtered using the Common Fund IDG program's list of understudied proteins [8] to produce a set of additional targets (Table 6).

Next, IMP3 was searched with the Metabolomics Workbench MetGENE tool [9]. MetGENE aggregates knowledge about pathways, reactions, metabolites, and studies from the Metabolomics Workbench, a Common Fund supported resource [10]. The Metabolomics Workbench was searched to find metabolites linked to IMP3 [10]. Furthermore, we leveraged the Linked Data Hub API [11] to list knowledge about regulatory elements associated with IMP3 (Table 6). Finally, the GlyGen database [12] was queried to identify relevant sets of proteins that are products of the IMP3 gene, as well as known post-translational modifications discovered on IMP3.
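The screening idea described above (high expression in the tumor, low expression across healthy tissues) can be sketched as a per-gene z-score against a normal-tissue reference. This is a minimal illustration only and not TargetRanger's actual method, which the description does not specify. The function name `screen_targets`, the `min_z` cutoff, and the placeholder gene names and values are all assumptions made for the example.

```python
from statistics import mean, stdev


def screen_targets(tumor, normal, min_z=2.0):
    """Rank genes by how far tumor expression sits above normal tissue.

    tumor:  {gene: expression value in the tumor sample}
    normal: {gene: [expression value in each healthy tissue]}
    Returns (gene, z) pairs with z >= min_z, highest z first.
    """
    scored = []
    for gene, value in tumor.items():
        reference = normal.get(gene)
        # A z-score needs at least two reference values and some spread.
        if not reference or len(reference) < 2:
            continue
        spread = stdev(reference)
        if spread == 0:
            continue
        z = (value - mean(reference)) / spread
        if z >= min_z:
            scored.append((gene, z))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical values; real input would be normalized RNA-seq expression.
tumor = {"GENE_A": 120.0, "GENE_B": 15.0, "GENE_C": 40.0}
normal = {
    "GENE_A": [4.0, 6.0, 5.0, 5.0],      # uniformly low in healthy tissue
    "GENE_B": [14.0, 16.0, 15.0, 15.0],  # same level as the tumor
    "GENE_C": [10.0, 50.0, 30.0, 70.0],  # variable, tumor sits at the mean
}
ranked = screen_targets(tumor, normal)
for gene, z in ranked:
    print(f"{gene}\t{z:.1f}")
```

Only the gene that is far above its healthy-tissue baseline passes the cutoff here. A real screen would also need count normalization and multiple-testing control, which this sketch leaves out.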

  3. Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics vol. 45 580–585 (2013). doi:10.1038/ng.2653
  7. Evangelista, J. E. et al. SigCom LINCS: data and metadata search engine for a million gene expression signatures. Nucleic Acids Research vol. 50 W697–W709 (2022). doi:10.1093/nar/gkac328
  8. IDG Understudied Proteins, https://druggablegenome.net/AboutIDGProteinList
  9. MetGENE, https://sc-cfdewebdev.sdsc.edu/MetGENE/metGene.php
  10. The Metabolomics Workbench, https://www.metabolomicsworkbench.org/
  11. Linked Data Hub, https://ldh.genome.network/cfde/ldh/
  12. York, W. S. et al. GlyGen: Computational and Informatics Resources for Glycoscience. Glycobiology vol. 30 72–73 (2019). doi:10.1093/glycob/cwz080


Inputs

ID | Name | Description | Type
step-1-data | Input File | Upload a Data File | File
step-4-data | Select One Gene | Select one Gene | File
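As an illustration of the kind of file `step-1-data` is expected to carry, the sketch below reads a tab-separated gene count matrix with genes as rows and samples as columns, as the description states. The exact file format accepted by the workflow is not stated in this record, so the TSV layout, the function name `load_count_matrix`, and the sample and gene labels are assumptions.

```python
import csv
import io


def load_count_matrix(handle):
    """Read a TSV count matrix: first column is the gene, the rest are samples.

    Returns (samples, counts) where counts maps gene -> list of raw counts,
    one per sample, in header order.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    counts = {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        gene, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"row for {gene!r} has {len(values)} values, "
                f"expected {len(samples)}"
            )
        counts[gene] = [int(v) for v in values]  # raw counts are integers
    return samples, counts


# Hypothetical two-sample matrix held in memory instead of on disk.
example = io.StringIO(
    "gene\tsample_1\tsample_2\n"
    "GENE_A\t120\t98\n"
    "GENE_B\t0\t3\n"
)
samples, counts = load_count_matrix(example)
print(samples, counts)
```

Rejecting ragged rows early keeps a malformed upload from silently shifting counts between samples.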

Steps

ID | Name | Description
step-1 | Input File | Upload a Data File
step-2 | Resolve a Gene Count Matrix from a File | Ensure a file contains a gene count matrix, load it into a standard format
step-3 | Screen for Targets against GTEx | Identify significantly overexpressed genes when compared to normal tissue in GTEx
step-4 | Select One Gene | Select one Gene
step-5 | LINCS L1000 Reverse Search | Identify RNA-seq-like LINCS L1000 Signatures which reverse the expression of the gene.
step-6 | Extract Down Regulating Perturbagens | Identify RNA-seq-like LINCS L1000 Chemical Perturbagen Signatures which reverse the expression of the gene.
step-7 | Extract Down Regulating CRISPR KOs | Identify RNA-seq-like LINCS L1000 CRISPR KO Signatures which reverse the expression of the gene.
step-8 | Filter genes by Understudied Proteins | Based on IDG understudied proteins list
step-9 | MetGENE Search | Identify gene-centric information from Metabolomics.
step-10 | MetGENE Metabolites | Extract Metabolomics metabolites for the gene from MetGENE
step-11 | MetGENE Reactions | Extract Metabolomics reactions for the gene from MetGENE
step-12 | Resolve Regulatory Elements from LDH | Resolve regulatory elements from gene with Linked Data Hub
step-13 | Search GlyGen for Protein Products | Find protein product records in GlyGen for the gene
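Steps 7 and 8 can be read as two small list operations: keep the CRISPR knockouts whose signatures lower the expression of the selected gene, then intersect them with the IDG understudied proteins list. The sketch below shows that logic on in-memory data. It does not call the SigCom LINCS API or load the real IDG list. The function names, the convention that a negative z-score means down-regulation, and all gene names are assumptions for illustration.

```python
def down_regulators(scored, limit=10):
    """Keep perturbations with a negative z-score, most negative first."""
    negatives = [(name, z) for name, z in scored if z < 0]
    return sorted(negatives, key=lambda pair: pair[1])[:limit]


def filter_understudied(scored_genes, understudied):
    """Keep only scored genes that appear in the understudied list."""
    keep = {gene.upper() for gene in understudied}
    return [(gene, z) for gene, z in scored_genes if gene.upper() in keep]


# Hypothetical CRISPR KO scores for the selected gene.
crispr_scores = [
    ("KO_GENE_1", -3.2),
    ("KO_GENE_2", 1.5),
    ("KO_GENE_3", -0.8),
    ("KO_GENE_4", -2.1),
]
# Hypothetical stand-in for the IDG understudied proteins list.
understudied = ["ko_gene_4", "KO_GENE_2", "KO_GENE_9"]

candidates = filter_understudied(down_regulators(crispr_scores), understudied)
print(candidates)
```

Matching on upper-cased symbols guards against case differences between sources, though real gene lists may also need alias resolution.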

Outputs

ID | Name | Description | Type
step-1-output | File URL | URL to a File | File
step-2-output | Gene Count Matrix | A gene count matrix file | File
step-3-output | Scored Genes | ZScores of Genes | File
step-4-output | Gene | Gene Term | File
step-5-output | LINCS L1000 Reverse Search Dashboard | A dashboard for performing L1000 Reverse Search queries for a given gene | File
step-6-output | Scored Drugs | ZScores of Drugs | File
step-7-output | Scored Genes | ZScores of Genes | File
step-8-output | Scored Genes | ZScores of Genes | File
step-9-output | MetGENE Summary | A dashboard for reviewing gene-centric information for a given gene from metabolomics | File
step-10-output | MetGENE metabolite table | MetGENE metabolite table | File
step-11-output | MetGENE Reaction Table | MetGENE Reaction Table | File
step-12-output | Regulatory Element Set | Set of Regulatory Elements | File
step-13-output | GlyGen Protein Products | Protein product records in GlyGen | File

Version History

Version 1 (earliest) Created 16th Apr 2024 at 22:42 by Daniel Clarke

Initial commit


Last updated: 23rd Apr 2024 at 16:54
