Search | VHL Regional Portal

1.

Machine learning multi-omics analysis reveals cancer driver dysregulation in pan-cancer cell lines compared to primary tumors.

Sanders, Lauren M; Chandra, Rahul; Zebarjadi, Navid; Beale, Holly C; Lyle, A Geoffrey; Rodriguez, Analiz; Kephart, Ellen Towle; Pfeil, Jacob; Cheney, Allison; Learned, Katrina; Currie, Rob; Gitlin, Leonid; Vengerov, David; Haussler, David; Salama, Sofie R; Vaske, Olena M.

Commun Biol ; 5(1): 1367, 2022 12 13.

Article in English | MEDLINE | ID: mdl-36513728

ABSTRACT

Cancer cell lines have been widely used for decades to study biological processes driving cancer development, and to identify biomarkers of response to therapeutic agents. Advances in genomic sequencing have made possible large-scale genomic characterizations of collections of cancer cell lines and primary tumors, such as the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA). These studies allow for the first time a comprehensive evaluation of the comparability of cancer cell lines and primary tumors on the genomic and proteomic level. Here we employ bulk mRNA and micro-RNA sequencing data from thousands of samples in CCLE and TCGA, and proteomic data from partner studies in the MD Anderson Cell Line Project (MCLP) and The Cancer Proteome Atlas (TCPA), to characterize the extent to which cancer cell lines recapitulate tumors. We identify dysregulation of a long non-coding RNA and microRNA regulatory network in cancer cell lines, associated with differential expression between cell lines and primary tumors in four key cancer driver pathways: KRAS signaling, NFKB signaling, IL2/STAT5 signaling and TP53 signaling. Our results emphasize the necessity for careful interpretation of cancer cell line experiments, particularly with respect to therapeutic treatments targeting these important cancer pathways.

Subject(s)

Neoplasms , Proteomics , Humans , Multiomics , Neoplasms/genetics , Neoplasms/metabolism , Machine Learning , Cell Line

2.

A Functional Precision Medicine Pipeline Combines Comparative Transcriptomics and Tumor Organoid Modeling to Identify Bespoke Treatment Strategies for Glioblastoma.

Reed, Megan R; Lyle, A Geoffrey; De Loose, Annick; Maddukuri, Leena; Learned, Katrina; Beale, Holly C; Kephart, Ellen T; Cheney, Allison; van den Bout, Anouk; Lee, Madison P; Hundley, Kelsey N; Smith, Ashley M; DesRochers, Teresa M; Vibat, Cecile Rose T; Gokden, Murat; Salama, Sofie; Wardell, Christopher P; Eoff, Robert L; Vaske, Olena M; Rodriguez, Analiz.

Cells ; 10(12), 2021 12 02.

Article in English | MEDLINE | ID: mdl-34943910

ABSTRACT

Li-Fraumeni syndrome (LFS) is a hereditary cancer predisposition syndrome caused by germline mutations in TP53. TP53 is the most commonly mutated gene in human cancer, with mutations occurring in 30-50% of glioblastomas (GBM). Here, we highlight a precision medicine platform to identify potential targets for a GBM patient with LFS. We used a comparative transcriptomics approach to identify genes that are uniquely overexpressed in the LFS GBM patient relative to a cancer compendium of 12,747 tumor RNA sequencing data sets, including 200 GBMs. STAT1 and STAT2 were identified as being significantly overexpressed in the LFS patient, indicating ruxolitinib, a Janus kinase 1 and 2 inhibitor, as a potential therapy. The LFS patient had the highest level of STAT1 and STAT2 expression in an institutional high-grade glioma cohort of 45 patients, further supporting the cancer compendium results. To empirically validate the comparative transcriptomics pipeline, we used a combination of adherent and organoid cell culture techniques, including ex vivo patient-derived organoids (PDOs) from four patient-derived cell lines, including the LFS patient. STAT1 and STAT2 expression levels in the four patient-derived cells correlated with levels identified in the respective parent tumors. In both adherent and organoid cultures, cells from the LFS patient were among the most sensitive to ruxolitinib compared to patient-derived cells with lower STAT1 and STAT2 expression levels. A spheroid-based drug screening assay (3D-PREDICT) was performed and used to identify further therapeutic targets. Two targeted therapies were selected for the patient of interest and resulted in radiographic disease stability. This manuscript supports the use of comparative transcriptomics to identify personalized therapeutic targets in a functional precision medicine platform for malignant brain tumors.

Subject(s)

Glioblastoma/genetics , Li-Fraumeni Syndrome/genetics , STAT1 Transcription Factor/genetics , STAT2 Transcription Factor/genetics , Adolescent , Adult , Child , Female , Gene Expression Regulation, Neoplastic , Germ-Line Mutation/genetics , Glioblastoma/complications , Glioblastoma/pathology , Humans , Janus Kinase 1/antagonists & inhibitors , Janus Kinase 1/genetics , Janus Kinase 2/antagonists & inhibitors , Janus Kinase 2/genetics , Li-Fraumeni Syndrome/complications , Li-Fraumeni Syndrome/pathology , Male , Nitriles/pharmacology , Organoids/metabolism , Precision Medicine , Pyrazoles/pharmacology , Pyrimidines/pharmacology , RNA-Seq , Transcriptome/genetics , Young Adult

3.

A method for campus-wide SARS-CoV-2 surveillance at a large public university.

Chang, Terren; Draper, Jolene M; Van den Bout, Anouk; Kephart, Ellen; Maul-Newby, Hannah; Vasquez, Yvonne; Woodbury, Jason; Randi, Savanna; Pedersen, Martina; Nave, Maeve; La, Scott; Gallagher, Natalie; McCabe, Molly M; Dhillon, Namrita; Bjork, Isabel; Luttrell, Michael; Dang, Frank; MacMillan, John B; Green, Ralph; Miller, Elizabeth; Kilpatrick, Auston M; Vaske, Olena; Stone, Michael D; Sanford, Jeremy R.

PLoS One ; 16(12): e0261230, 2021.

Article in English | MEDLINE | ID: mdl-34919584

ABSTRACT

The systematic screening of asymptomatic and pre-symptomatic individuals is a powerful tool for controlling community transmission of infectious disease on college campuses. Faced with a paucity of testing in the beginning of the COVID-19 pandemic, many universities developed molecular diagnostic laboratories focused on SARS-CoV-2 diagnostic testing on campus and in their broader communities. We established the UC Santa Cruz Molecular Diagnostic Lab in early April 2020 and began testing clinical samples just five weeks later. Using a clinically-validated laboratory developed test (LDT) that avoided supply chain constraints, an automated sample pooling and processing workflow, and a custom laboratory information management system (LIMS), we expanded testing from a handful of clinical samples per day to thousands per day with the testing capacity to screen our entire campus population twice per week. In this report we describe the technical, logistical, and regulatory processes that enabled our pop-up lab to scale testing and reporting capacity to thousands of tests per day.

Subject(s)

COVID-19 Nucleic Acid Testing/methods , COVID-19/diagnosis , Clinical Laboratory Techniques/methods , Diagnostic Tests, Routine/methods , Mass Screening/methods , Pandemics/prevention & control , Diagnostic Screening Programs , Humans , Universities

4.

The case for using mapped exonic non-duplicate reads when reporting RNA-sequencing depth: examples from pediatric cancer datasets.

Beale, Holly C; Roger, Jacquelyn M; Cattle, Matthew A; McKay, Liam T; Thompson, Drew K A; Learned, Katrina; Lyle, A Geoffrey; Kephart, Ellen T; Currie, Rob; Lam, Du Linh; Sanders, Lauren; Pfeil, Jacob; Vivian, John; Bjork, Isabel; Salama, Sofie R; Haussler, David; Vaske, Olena M.

Gigascience ; 10(3), 2021 03 13.

Article in English | MEDLINE | ID: mdl-33712853

ABSTRACT

BACKGROUND: The reproducibility of gene expression measured by RNA sequencing (RNA-Seq) is dependent on the sequencing depth. While unmapped or non-exonic reads do not contribute to gene expression quantification, duplicate reads contribute to the quantification but are not informative for reproducibility. We show that mapped, exonic, non-duplicate (MEND) reads are a useful measure of reproducibility of RNA-Seq datasets used for gene expression analysis. FINDINGS: In bulk RNA-Seq datasets from 2,179 tumors in 48 cohorts, the fraction of reads that contribute to the reproducibility of gene expression analysis varies greatly. Unmapped reads constitute 1-77% of all reads (median [IQR], 3% [3-6%]); duplicate reads constitute 3-100% of mapped reads (median [IQR], 27% [13-43%]); and non-exonic reads constitute 4-97% of mapped, non-duplicate reads (median [IQR], 25% [16-37%]). MEND reads constitute 0-79% of total reads (median [IQR], 50% [30-61%]). CONCLUSIONS: Because not all reads in an RNA-Seq dataset are informative for reproducibility of gene expression measurements and the fraction of reads that are informative varies, we propose reporting a dataset's sequencing depth in MEND reads, which definitively inform the reproducibility of gene expression, rather than total, mapped, or exonic reads. We provide a Docker image containing (i) the existing required tools (RSeQC, sambamba, and samblaster) and (ii) a custom script to calculate MEND reads from RNA-Seq data files. We recommend that all RNA-Seq gene expression experiments, sensitivity studies, and depth recommendations use MEND units for sequencing depth.

Subject(s)

Neoplasms , RNA , Child , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics , Reproducibility of Results , Sequence Analysis, RNA , Exome Sequencing
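
The MEND definition in entry 4 (mapped, exonic, non-duplicate reads) chains three nested fractions, which can be sketched as below. This is only an illustration of the arithmetic implied by the abstract, not the authors' script or Docker image; the function and parameter names are hypothetical. The sample inputs are the reported medians, and because medians of separate distributions do not compose, the result (about 0.53) is not expected to equal the reported median MEND fraction of 50%.

```python
def mend_fraction(unmapped, duplicate_of_mapped, non_exonic_of_mapped_nondup):
    """Fraction of total reads that are mapped, exonic and non-duplicate.

    Each argument is a fraction in [0, 1], nested as in the abstract:
      unmapped                     -- share of all reads
      duplicate_of_mapped          -- share of mapped reads
      non_exonic_of_mapped_nondup  -- share of mapped, non-duplicate reads
    (Hypothetical helper; names are assumptions for illustration.)
    """
    for value in (unmapped, duplicate_of_mapped, non_exonic_of_mapped_nondup):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    mapped = 1.0 - unmapped                          # mapped share of all reads
    mapped_nondup = mapped * (1.0 - duplicate_of_mapped)
    return mapped_nondup * (1.0 - non_exonic_of_mapped_nondup)


def mend_reads(total_reads, unmapped, duplicate_of_mapped, non_exonic_of_mapped_nondup):
    """Sequencing depth expressed in MEND reads for a given total read count."""
    fraction = mend_fraction(unmapped, duplicate_of_mapped, non_exonic_of_mapped_nondup)
    return round(total_reads * fraction)


if __name__ == "__main__":
    # Reported medians: 3% unmapped, 27% duplicates, 25% non-exonic.
    print(mend_fraction(0.03, 0.27, 0.25))
    # A hypothetical 100-million-read sample with those same rates.
    print(mend_reads(100_000_000, 0.03, 0.27, 0.25))
```

Under these assumptions, two samples with the same total read count can differ widely in MEND depth, which is the abstract's case for reporting depth in MEND units.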

5.

Identification of a differentiation stall in epithelial mesenchymal transition in histone H3-mutant diffuse midline glioma.

Sanders, Lauren M; Cheney, Allison; Seninge, Lucas; van den Bout, Anouk; Chen, Marissa; Beale, Holly C; Kephart, Ellen Towle; Pfeil, Jacob; Learned, Katrina; Lyle, A Geoffrey; Bjork, Isabel; Haussler, David; Salama, Sofie R; Vaske, Olena M.

Gigascience ; 9(12), 2020 12 15.

Article in English | MEDLINE | ID: mdl-33319914

ABSTRACT

BACKGROUND: Diffuse midline gliomas with histone H3 K27M (H3K27M) mutations occur in early childhood and are marked by an invasive phenotype and global decrease in H3K27me3, an epigenetic mark that regulates differentiation and development. H3K27M mutation timing and effect on early embryonic brain development are not fully characterized. RESULTS: We analyzed multiple publicly available RNA sequencing datasets to identify differentially expressed genes between H3K27M and non-K27M pediatric gliomas. We found that genes involved in the epithelial-mesenchymal transition (EMT) were significantly overrepresented among differentially expressed genes. Overall, the expression of pre-EMT genes was increased in the H3K27M tumors as compared to non-K27M tumors, while the expression of post-EMT genes was decreased. We hypothesized that H3K27M may contribute to gliomagenesis by stalling an EMT required for early brain development, and evaluated this hypothesis by using another publicly available dataset of single-cell and bulk RNA sequencing data from developing cerebral organoids. This analysis revealed similarities between H3K27M tumors and pre-EMT normal brain cells. Finally, a previously published single-cell RNA sequencing dataset of H3K27M and non-K27M gliomas revealed subgroups of cells at different stages of EMT. In particular, H3.1K27M tumors resemble a later EMT stage compared to H3.3K27M tumors. CONCLUSIONS: Our data analyses indicate that this mutation may be associated with a differentiation stall evident from the failure to proceed through the EMT-like developmental processes, and that H3K27M cells preferentially exist in a pre-EMT cell phenotype. This study demonstrates how novel biological insights could be derived from combined analysis of several previously published datasets, highlighting the importance of making genomic data available to the community in a timely manner.

Subject(s)

Glioma , Histones , Cell Differentiation/genetics , Child , Child, Preschool , Epithelial-Mesenchymal Transition/genetics , Glioma/genetics , Histones/genetics , Humans , Mutation

6.

Comparative RNA-seq analysis aids in diagnosis of a rare pediatric tumor.

Sanders, Lauren M; Rangaswami, Arun; Bjork, Isabel; Lam, Du Linh; Beale, Holly C; Kephart, Ellen Towle; Durbin, Ann; Learned, Katrina; Currie, Rob; Lyle, A Geoffrey; Pfeil, Jacob; Shah, Avanthi Tayi; Lee, Alex G; Leung, Stanley G; Behroozfard, Inge H; Breese, Marcus R; Peralez, Jennifer; Hazard, Florette K; Lacayo, Norman; Spunt, Sheri L; Haussler, David; Salama, Sofie R; Sweet-Cordero, E Alejandro; Vaske, Olena M.

Cold Spring Harb Mol Case Stud ; 5(5), 2019 10.

Article in English | MEDLINE | ID: mdl-31645344

ABSTRACT

Gliomatosis peritonei is a rare pathologic finding that is associated with ovarian teratomas and malignant mixed germ cell tumors. The occurrence of gliomatosis as a mature glial implant can impart an improved prognosis to patients with immature ovarian teratoma, making prompt and accurate diagnosis important. We describe a case of recurrent immature teratoma in a 10-yr-old female patient, in which comparative analysis of the RNA sequencing gene expression data from the patient's tumor was used effectively to aid in the diagnosis of gliomatosis peritonei.

Subject(s)

Peritoneal Neoplasms/diagnosis , Peritoneal Neoplasms/genetics , Teratoma/diagnosis , Base Sequence/genetics , Child , Female , Glioma/diagnosis , Glioma/genetics , Humans , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/genetics , Prognosis , RNA-Seq/methods , Rare Diseases/diagnosis , Rare Diseases/genetics , Sequence Analysis, RNA/methods , Teratoma/genetics , Exome Sequencing

7.

Comparative Tumor RNA Sequencing Analysis for Difficult-to-Treat Pediatric and Young Adult Patients With Cancer.

Vaske, Olena M; Bjork, Isabel; Salama, Sofie R; Beale, Holly; Tayi Shah, Avanthi; Sanders, Lauren; Pfeil, Jacob; Lam, Du L; Learned, Katrina; Durbin, Ann; Kephart, Ellen T; Currie, Rob; Newton, Yulia; Swatloski, Teresa; McColl, Duncan; Vivian, John; Zhu, Jingchun; Lee, Alex G; Leung, Stanley G; Spillinger, Aviv; Liu, Heng-Yi; Liang, Winnie S; Byron, Sara A; Berens, Michael E; Resnick, Adam C; Lacayo, Norman; Spunt, Sheri L; Rangaswami, Arun; Huynh, Van; Torno, Lilibeth; Plant, Ashley; Kirov, Ivan; Zabokrtsky, Keri B; Rassekh, S Rod; Deyell, Rebecca J; Laskin, Janessa; Marra, Marco A; Sender, Leonard S; Mueller, Sabine; Sweet-Cordero, E Alejandro; Goldstein, Theodore C; Haussler, David.

JAMA Netw Open ; 2(10): e1913968, 2019 10 02.

Article in English | MEDLINE | ID: mdl-31651965

ABSTRACT

Importance: Pediatric cancers are epigenetic diseases; therefore, considering tumor gene expression information is necessary for a complete understanding of the tumorigenic processes. Objective: To evaluate the feasibility and utility of incorporating comparative gene expression information into the precision medicine framework for difficult-to-treat pediatric and young adult patients with cancer. Design, Setting, and Participants: This cohort study was conducted as a consortium between the University of California, Santa Cruz (UCSC) Treehouse Childhood Cancer Initiative and clinical genomic trials. RNA sequencing (RNA-Seq) data were obtained from the following 4 clinical sites and analyzed at UCSC: British Columbia Children's Hospital (n = 31), Lucile Packard Children's Hospital at Stanford University (n = 80), CHOC Children's Hospital and Hyundai Cancer Institute (n = 46), and the Pacific Pediatric Neuro-Oncology Consortium (n = 24). The study dates were January 1, 2016, to March 22, 2017. Exposures: Participants underwent tumor RNA-Seq profiling as part of 4 separate clinical trials at partner hospitals. The UCSC either downloaded RNA-Seq data from a partner institution for analysis in the cloud or provided a Docker pipeline that performed the same analysis at a partner institution. The UCSC then compared each participant's tumor RNA-Seq profile with more than 11,000 uniformly analyzed tumor profiles from pediatric and young adult patients with cancer, downloaded from public data repositories. These comparisons were used to identify genes and pathways that are significantly overexpressed in each patient's tumor. Results of the UCSC analysis were presented to clinical partners.
Main Outcomes and Measures: Feasibility of a third-party institution (UCSC Treehouse Childhood Cancer Initiative) to obtain tumor RNA-Seq data from patients, conduct comparative analysis, and present analysis results to clinicians; and proportion of patients for whom comparative tumor gene expression analysis provided useful clinical and biological information. Results: Among 144 samples from children and young adults (median age at diagnosis, 9 years; range, 0-26 years; 72 of 118 [61.0%] male [26 patients sex unknown]) with a relapsed, refractory, or rare cancer treated on precision medicine protocols, RNA-Seq-derived gene expression was potentially useful for 99 of 144 samples (68.8%) compared with DNA mutation information that was potentially useful for only 34 of 74 samples (45.9%). Conclusions and Relevance: This study's findings suggest that tumor RNA-Seq comparisons may be feasible and highlight the potential clinical utility of incorporating such comparisons into the clinical genomic interpretation framework for difficult-to-treat pediatric and young adult patients with cancer. The study also highlights for the first time to date the potential clinical utility of harmonized publicly available genomic data sets.

Subject(s)

Neoplasms/genetics , RNA, Neoplasm/analysis , Sequence Analysis, RNA , Canada , Child , Child, Preschool , Female , Gene Expression , Humans , Infant , Infant, Newborn , Male , Precision Medicine , United States , Young Adult

8.

Barriers to accessing public cancer genomic data.

Learned, Katrina; Durbin, Ann; Currie, Robert; Kephart, Ellen Towle; Beale, Holly C; Sanders, Lauren M; Pfeil, Jacob; Goldstein, Theodore C; Salama, Sofie R; Haussler, David; Vaske, Olena Morozova; Bjork, Isabel M.

Sci Data ; 6(1): 98, 2019 06 20.

Article in English | MEDLINE | ID: mdl-31222016

Subject(s)

Information Dissemination , Neoplasms/genetics , Datasets as Topic , Genomics , Humans

9.

Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE.

Trinh, Quang M; Jen, Fei-Yang Arthur; Zhou, Ziru; Chu, Kar Ming; Perry, Marc D; Kephart, Ellen T; Contrino, Sergio; Ruzanov, Peter; Stein, Lincoln D.

BMC Genomics ; 14: 494, 2013 Jul 22.

Article in English | MEDLINE | ID: mdl-23875683

ABSTRACT

BACKGROUND: Funded by the National Institutes of Health (NIH), the aim of the Model Organism ENCyclopedia of DNA Elements (modENCODE) project is to provide the biological research community with a comprehensive encyclopedia of functional genomic elements for both model organisms C. elegans (worm) and D. melanogaster (fly). With a total size of just under 10 terabytes of data collected and released to the public, one of the challenges faced by researchers is to extract biologically meaningful knowledge from this large data set. While the basic quality control, pre-processing, and analysis of the data has already been performed by members of the modENCODE consortium, many researchers will wish to reinterpret the data set using modifications and enhancements of the original protocols, or combine modENCODE data with other data sets. Unfortunately this can be a time consuming and logistically challenging proposition. RESULTS: In recognition of this challenge, the modENCODE DCC has released uniform computing resources for analyzing modENCODE data on Galaxy (https://github.com/modENCODE-DCC/Galaxy), on the public Amazon Cloud (http://aws.amazon.com), and on the private Bionimbus Cloud for genomic research (http://www.bionimbus.org). In particular, we have released Galaxy workflows for interpreting ChIP-seq data which use the same quality control (QC) and peak calling standards adopted by the modENCODE and ENCODE communities. For convenience of use, we have created Amazon and Bionimbus Cloud machine images containing Galaxy along with all the modENCODE data, software and other dependencies. CONCLUSIONS: Using these resources provides a framework for running consistent and reproducible analyses on modENCODE data, ultimately allowing researchers to use more of their time using modENCODE data, and less time moving it around.

Subject(s)

Chromatin Immunoprecipitation , Software

10.

modMine: flexible access to modENCODE data.

Contrino, Sergio; Smith, Richard N; Butano, Daniela; Carr, Adrian; Hu, Fengyuan; Lyne, Rachel; Rutherford, Kim; Kalderimis, Alex; Sullivan, Julie; Carbon, Seth; Kephart, Ellen T; Lloyd, Paul; Stinson, E O; Washington, Nicole L; Perry, Marc D; Ruzanov, Peter; Zha, Zheng; Lewis, Suzanna E; Stein, Lincoln D; Micklem, Gos.

Nucleic Acids Res ; 40(Database issue): D1082-8, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22080565

ABSTRACT

In an effort to comprehensively characterize the functional elements within the genomes of the important model organisms Drosophila melanogaster and Caenorhabditis elegans, the NHGRI model organism Encyclopaedia of DNA Elements (modENCODE) consortium has generated an enormous library of genomic data along with detailed, structured information on all aspects of the experiments. The modMine database (http://intermine.modencode.org) described here has been built by the modENCODE Data Coordination Center to allow the broader research community to (i) search for and download data sets of interest among the thousands generated by modENCODE; (ii) access the data in an integrated form together with non-modENCODE data sets; and (iii) facilitate fine-grained analysis of the above data. The sophisticated search features are possible because of the collection of extensive experimental metadata by the consortium. Interfaces are provided to allow both biologists and bioinformaticians to exploit these rich modENCODE data sets now available via modMine.

Subject(s)

Caenorhabditis elegans/genetics , Databases, Genetic , Drosophila melanogaster/genetics , Animals , Gene Expression , Genome, Helminth , Genome, Insect , Genomics , Internet , User-Computer Interface

11.

The modENCODE Data Coordination Center: lessons in harvesting comprehensive experimental details.

Washington, Nicole L; Stinson, E O; Perry, Marc D; Ruzanov, Peter; Contrino, Sergio; Smith, Richard; Zha, Zheng; Lyne, Rachel; Carr, Adrian; Lloyd, Paul; Kephart, Ellen; McKay, Sheldon J; Micklem, Gos; Stein, Lincoln D; Lewis, Suzanna E.

Database (Oxford) ; 2011: bar023, 2011.

Article in English | MEDLINE | ID: mdl-21856757

ABSTRACT

The model organism Encyclopedia of DNA Elements (modENCODE) project is a National Human Genome Research Institute (NHGRI) initiative designed to characterize the genomes of Drosophila melanogaster and Caenorhabditis elegans. A Data Coordination Center (DCC) was created to collect, store and catalog modENCODE data. An effective DCC must gather, organize and provide all primary, interpreted and analyzed data, and ensure the community is supplied with the knowledge of the experimental conditions, protocols and verification checks used to generate each primary data set. We present here the design principles of the modENCODE DCC, and describe the ramifications of collecting thorough and deep metadata for describing experiments, including the use of a wiki for capturing protocol and reagent information, and the BIR-TAB specification for linking biological samples to experimental results. modENCODE data can be found at http://www.modencode.org.

Subject(s)

Databases, Genetic , Genome , Genomics/methods , Internet , Software , Animals , Caenorhabditis elegans/genetics , DNA/genetics , Drosophila melanogaster/genetics , Humans
