Results 1 - 19 of 19
1.
JCO Clin Cancer Inform ; 8: e2300207, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38427922

ABSTRACT

PURPOSE: Although immune checkpoint inhibitors (ICIs) have improved outcomes in certain patients with cancer, they can also cause life-threatening immunotoxicities. Predicting immunotoxicity risks alongside response could provide a personalized risk-benefit profile, inform therapeutic decision making, and improve clinical trial cohort selection. We aimed to build a machine learning (ML) framework using routine electronic health record (EHR) data to predict hepatitis, colitis, pneumonitis, and 1-year overall survival. METHODS: Real-world EHR data of more than 2,200 patients treated with ICI through December 31, 2018, were used to develop predictive models. Using a prediction time point of ICI initiation, a 1-year prediction time window was applied to create binary labels for the four outcomes for each patient. Feature engineering involved aggregating laboratory measurements over appropriate time windows (60-365 days). Patients were randomly partitioned into training (80%) and test (20%) sets. Random forest classifiers were developed using a rigorous model development framework. RESULTS: The patient cohort had a median age of 63 years and was 61.8% male. Patients predominantly had melanoma (37.8%), lung cancer (27.3%), or genitourinary cancer (16.4%). They were treated with PD-1 (60.4%), PD-L1 (9.0%), and CTLA-4 (19.7%) ICIs. Our models demonstrate reasonably strong performance, with AUCs of 0.739, 0.729, 0.755, and 0.752 for the pneumonitis, hepatitis, colitis, and 1-year overall survival models, respectively. Each model relies on an outcome-specific feature set, though some features are shared among models. CONCLUSION: To our knowledge, this is the first ML solution that assesses individual ICI risk-benefit profiles based predominantly on routine structured EHR data. As such, use of our ML solution will not require additional data collection or documentation in the clinic.


Subject(s)
Colitis , Hepatitis , Pneumonia , Humans , Male , Middle Aged , Female , Immune Checkpoint Inhibitors , Ambulatory Care Facilities , Pneumonia/chemically induced , Pneumonia/diagnosis
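
The random 80%/20% partition and the AUC metric reported in the abstract above can be illustrated with a minimal, standard-library-only sketch. The function names `train_test_split` and `roc_auc` are illustrative assumptions, not the authors' code; the abstract does not describe the random forest implementation, so model fitting is not shown.

```python
import random


def train_test_split(ids, test_fraction=0.2, seed=0):
    """Randomly partition ids into (train, test) lists.

    Hypothetical helper: the abstract only states an 80%/20% random split.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under this definition, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from non-cases.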
2.
Cancer Res Commun ; 4(2): 475-486, 2024 02 20.
Article in English | MEDLINE | ID: mdl-38329392

ABSTRACT

Peritoneal metastases (PM) are common in metastatic colorectal cancer (mCRC). We aimed to characterize patients with mCRC and PM from a clinical and molecular perspective using the American Association for Cancer Research Genomics Evidence Neoplasia Information Exchange (GENIE) Biopharma Collaborative (BPC) registry. Patients' tumor samples underwent targeted next-generation sequencing. Clinical characteristics and treatment outcomes were collected retrospectively. Overall survival (OS) from advanced disease and progression-free survival (PFS) from start of cancer-directed drug regimen were estimated and adjusted for the left truncation bias. A total of 1,281 patients were analyzed; 244 (19%) had PM at time of advanced disease. PM were associated with female sex [OR: 1.67; 95% confidence interval (CI): 1.11-2.54; P = 0.014] and higher histologic grade (OR: 1.72; 95% CI: 1.08-2.71; P = 0.022), while rectal primary tumors were less frequent in patients with PM (OR: 0.51; 95% CI: 0.29-0.88; P < 0.001). APC alterations occurred less frequently in patients with PM (N = 151, 64% vs. N = 788, 79%) while MED12 alterations occurred more frequently in patients with PM (N = 20, 10% vs. N = 32, 4%); differences in MED12 were not significant when restricting to oncogenic and likely oncogenic variants according to OncoKB. Patients with PM had worse OS (HR: 1.45; 95% CI: 1.16-1.81) after adjustment for independently significant clinical and genomic predictors. PFS from initiation of first-line treatment did not differ by presence of PM. In conclusion, PM were more frequent in females and right-sided primary tumors. Differences in frequencies of MED12 and APC alterations were identified between patients with and without PM. PM were associated with shorter OS but not with PFS from first-line treatment.
SIGNIFICANCE: Utilizing the GENIE BPC registry, this study found that PM in patients with colorectal cancer occur more frequently in females and right-sided primary tumors and are associated with worse OS. In addition, we found a lower frequency of APC alterations and a higher frequency of MED12 alterations in patients with PM.


Subject(s)
Antineoplastic Agents , Colonic Neoplasms , Colorectal Neoplasms , Peritoneal Neoplasms , Rectal Neoplasms , Humans , Female , Colorectal Neoplasms/genetics , Peritoneal Neoplasms/genetics , Retrospective Studies , Antineoplastic Agents/therapeutic use , Colonic Neoplasms/drug therapy , Rectal Neoplasms/drug therapy , Genomics , Registries
3.
Clin Cancer Res ; 29(17): 3418-3428, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37223888

ABSTRACT

PURPOSE: We describe the clinical and genomic landscape of the non-small cell lung cancer (NSCLC) cohort of the American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE) Biopharma Collaborative (BPC). EXPERIMENTAL DESIGN: A total of 1,846 patients with NSCLC whose tumors were sequenced from 2014 to 2018 at four institutions participating in AACR GENIE were randomly chosen for curation using the PRISSMM data model. Progression-free survival (PFS) and overall survival (OS) were estimated for patients treated with standard therapies. RESULTS: In this cohort, 44% of tumors harbored a targetable oncogenic alteration, with EGFR (20%), KRAS G12C (13%), and oncogenic fusions (ALK, RET, and ROS1; 5%) as the most frequent. Median OS (mOS) on first-line platinum-based therapy without immunotherapy was 17.4 months [95% confidence interval (CI), 14.9-19.5 months]. For second-line therapies, mOS was 9.2 months (95% CI, 7.5-11.3 months) for immune checkpoint inhibitors (ICI) and 6.4 months (95% CI, 5.1-8.1 months) for docetaxel ± ramucirumab. In a subset of patients treated with ICI in the second-line or later setting, median RECIST PFS (2.5 months; 95% CI, 2.2-2.8) and median real-world PFS based on imaging reports (2.2 months; 95% CI, 1.7-2.6) were similar. In exploratory analysis of the impact of tumor mutational burden (TMB) on survival on ICI treatment in the second-line or higher setting, TMB z-score harmonized across gene panels was associated with improved OS (univariable HR, 0.85; P = 0.03; n = 247 patients). CONCLUSIONS: The GENIE BPC cohort provides comprehensive clinicogenomic data for patients with NSCLC, which can improve understanding of real-world patient outcomes.


Subject(s)
Antineoplastic Agents, Immunological , Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Protein-Tyrosine Kinases , Antineoplastic Agents, Immunological/therapeutic use , Proto-Oncogene Proteins , Genomics
4.
JAMIA Open ; 6(1): ooad017, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37012912

ABSTRACT

Objective: Automatically identifying patients at risk of immune checkpoint inhibitor (ICI)-induced colitis allows physicians to improve patient care. However, predictive models require training data curated from electronic health records (EHR). Our objective is to automatically identify notes documenting ICI-colitis cases to accelerate data curation. Materials and Methods: We present a data pipeline to automatically identify ICI-colitis from EHR notes, accelerating chart review. The pipeline relies on BERT, a state-of-the-art natural language processing (NLP) model. The first stage of the pipeline segments long notes using keywords identified through a logistic classifier and applies BERT to identify ICI-colitis notes. The next stage uses a second BERT model tuned to identify false positive notes and remove notes that were likely positive for mentioning colitis as a side-effect. The final stage further accelerates curation by highlighting the colitis-relevant portions of notes. Specifically, we use BERT's attention scores to find high-density regions describing colitis. Results: The overall pipeline identified colitis notes with 84% precision and reduced the curator note review load by 75%. The segment BERT classifier had a high recall of 0.98, which is crucial to identify the low incidence (<10%) of colitis. Discussion: Curation from EHR notes is a burdensome task, especially when the curation topic is complicated. Methods described in this work are not only useful for ICI-colitis but can also be adapted for other domains. Conclusion: Our extraction pipeline reduces manual note review load and makes EHR data more accessible for research.
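
The first pipeline stage, segmenting long notes around keywords, might look roughly like the sketch below. This is an assumption-laden illustration: the keyword set is supplied by hand rather than learned by a logistic classifier, the window size and the function name `keyword_segments` are hypothetical, and the BERT classification stages are not shown.

```python
def keyword_segments(note, keywords, window=5):
    """Return word windows around each keyword hit, merging overlaps.

    `keywords` is assumed to be a set of lowercase terms; in the abstract
    these come from a logistic classifier, which is not reproduced here.
    """
    words = note.split()
    # Normalize tokens for matching only; the output keeps the original words.
    lowered = [w.lower().strip(".,;:()") for w in words]
    spans = []
    for i, token in enumerate(lowered):
        if token in keywords:
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            if spans and start <= spans[-1][1]:
                # Window overlaps or touches the previous one: extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return [" ".join(words[s:e]) for s, e in spans]
```

Each returned segment would then be short enough to pass to a downstream text classifier, which is the role BERT plays in the described pipeline.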

5.
Sci Rep ; 12(1): 19055, 2022 11 09.
Article in English | MEDLINE | ID: mdl-36351964

ABSTRACT

Patients with non-small cell lung cancer (NSCLC) who have distant metastases have a poor prognosis. To determine which genomic factors of the primary tumor are associated with metastasis, we analyzed data from 759 patients originally diagnosed with stage I-III NSCLC as part of the AACR Project GENIE Biopharma Collaborative consortium. We found that TP53 mutations were significantly associated with the development of new distant metastases. TP53 mutations were also more prevalent in patients with a history of smoking, suggesting that these patients may be at increased risk for distant metastasis. Our results suggest that additional investigation of the optimal management of patients with early-stage NSCLC harboring TP53 mutations at diagnosis is warranted in light of their higher likelihood of developing new distant metastases.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Carcinoma, Non-Small-Cell Lung/pathology , Genomics , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Mutation , Prognosis , Tumor Suppressor Protein p53/genetics , Neoplasm Metastasis
6.
Cancer Res ; 82(21): 4058-4078, 2022 11 02.
Article in English | MEDLINE | ID: mdl-36074020

ABSTRACT

The RAS family of small GTPases represents the most commonly activated oncogenes in human cancers. To better understand the prevalence of somatic RAS mutations and the compendium of genes that are coaltered in RAS-mutant tumors, we analyzed targeted next-generation sequencing data of 607,863 mutations from 66,372 tumors in 51 cancer types in the AACR Project GENIE Registry. Bayesian hierarchical models were implemented to estimate the cancer-specific prevalence of RAS and non-RAS somatic mutations, to evaluate co-occurrence and mutual exclusivity, and to model the effects of tumor mutation burden and mutational signatures on comutation patterns. These analyses revealed differential RAS prevalence and comutations with non-RAS genes in a cancer lineage-dependent and context-dependent manner, with differences across age, sex, and ethnic groups. Allele-specific RAS co-mutational patterns included an enrichment in NTRK3 and chromatin-regulating gene mutations in KRAS G12C-mutant non-small cell lung cancer. Integrated multiomic analyses of 10,217 tumors from The Cancer Genome Atlas (TCGA) revealed distinct genotype-driven gene expression programs pointing to differential recruitment of cancer hallmarks as well as phenotypic differences and immune surveillance states in the tumor microenvironment of RAS-mutant tumors. The distinct genomic tracks discovered in RAS-mutant tumors reflected differential clinical outcomes in TCGA cohort and in an independent cohort of patients with KRAS G12C-mutant non-small cell lung cancer that received immunotherapy-containing regimens. The RAS genetic architecture points to cancer lineage-specific therapeutic vulnerabilities that can be leveraged for rationally combining RAS-mutant allele-directed therapies with targeted therapies and immunotherapy. 
SIGNIFICANCE: The complex genomic landscape of RAS-mutant tumors is reflective of selection processes in a cancer lineage-specific and context-dependent manner, highlighting differential therapeutic vulnerabilities that can be clinically translated.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Carcinoma, Non-Small-Cell Lung/pathology , Lung Neoplasms/pathology , Bayes Theorem , Proto-Oncogene Proteins p21(ras)/genetics , Mutation , Genomics , Tumor Microenvironment
7.
Proc Natl Acad Sci U S A ; 119(30): e2206588119, 2022 07 26.
Article in English | MEDLINE | ID: mdl-35867821

ABSTRACT

Oncogenic mutations within the epidermal growth factor receptor (EGFR) are found in 15 to 30% of all non-small-cell lung carcinomas. The term exon 19 deletion (ex19del) is collectively used to refer to more than 20 distinct genomic alterations within exon 19 that comprise the most common EGFR mutation subtype in lung cancer. Despite this heterogeneity, clinical treatment decisions are made irrespective of which EGFR ex19del variant is present within the tumor, and there is a paucity of information regarding how individual ex19del variants influence protein structure and function. Herein, we identified allele-specific functional differences among ex19del variants attributable to recurring sequence and structure motifs. We built all-atom structural models of 60 ex19del variants identified in patients and combined molecular dynamics simulations with biochemical and biophysical experiments to analyze three ex19del mutations (E746_A750, E746_S752 > V, and L747_A750 > P). We demonstrate that sequence variation in ex19del alters oncogenic cell growth, dimerization propensity, enzyme kinetics, and tyrosine kinase inhibitor (TKI) sensitivity. We show that in contrast to E746_A750 and E746_S752 > V, the L747_A750 > P variant forms highly active ligand-independent dimers. Enzyme kinetic analysis and TKI inhibition experiments suggest that E746_S752 > V and L747_A750 > P display reduced TKI sensitivity due to decreased adenosine 5'-triphosphate Km. Through these analyses, we propose an expanded framework for interpreting ex19del variants and considerations for therapeutic intervention.


Subject(s)
Carcinoma, Non-Small-Cell Lung , ErbB Receptors , Exons , Lung Neoplasms , Alleles , Amino Acid Motifs , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Enzyme Activation/genetics , ErbB Receptors/antagonists & inhibitors , ErbB Receptors/chemistry , ErbB Receptors/genetics , Exons/genetics , Humans , Kinetics , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Neoplasm Recurrence, Local/genetics , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Sequence Deletion
8.
Cancer Discov ; 12(9): 2044-2057, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35819403

ABSTRACT

The American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE) is an international pan-cancer registry with the goal to inform cancer research and clinical care worldwide. Founded in late 2015, the milestone GENIE 9.1-public release contains data from >110,000 tumors from >100,000 people treated at 19 cancer centers from the United States, Canada, the United Kingdom, France, the Netherlands, and Spain. Here, we demonstrate the use of these real-world data, harmonized through a centralized data resource, to accurately predict enrollment on genome-guided trials, discover driver alterations in rare tumors, and identify cancer types without actionable mutations that could benefit from comprehensive genomic analysis. The extensible data infrastructure and governance framework support additional deep patient phenotyping through biopharmaceutical collaborations and expansion to include new data types such as cell-free DNA sequencing. AACR Project GENIE continues to serve a global precision medicine knowledge base of increasing impact to inform clinical decision-making and bring together cancer researchers internationally. SIGNIFICANCE: AACR Project GENIE has now accrued data from >110,000 tumors, placing it among the largest repositories of publicly available, clinically annotated genomic data in the world. GENIE has emerged as a powerful resource to evaluate genome-guided clinical trial design, uncover drivers of cancer subtypes, and inform real-world use of genomic data. This article is highlighted in the In This Issue feature, p. 2007.


Subject(s)
Cell-Free Nucleic Acids , Neoplasms , Genomics , Humans , Mutation , Neoplasms/genetics , Neoplasms/pathology , Neoplasms/therapy , Precision Medicine , United States
9.
Clin Cancer Res ; 28(10): 2118-2130, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35190802

ABSTRACT

PURPOSE: We wanted to determine the prognosis and the phenotypic characteristics of hormone receptor-positive advanced breast cancer tumors harboring an ERBB2 mutation in the absence of a HER2 amplification. EXPERIMENTAL DESIGN: We retrospectively collected information from the American Association for Cancer Research Genomics Evidence Neoplasia Information Exchange registry database from patients with hormone receptor-positive, HER2-negative, ERBB2-mutated advanced breast cancer. Phenotypic and co-mutational features, as well as response to treatment and outcome, were compared with matched ERBB2 wild-type control cases. RESULTS: A total of 45 ERBB2-mutant cases were identified, with 90 matched controls. The presence of an ERBB2 mutation was not associated with worse outcome determined by overall survival (OS) from first metastatic relapse. No significant differences were observed in phenotypic characteristics apart from a higher proportion of the lobular infiltrating subtype in the ERBB2-mutated group. ERBB2 mutation did not seem to have an impact on response to treatment or time-to-progression (TTP) on endocrine therapy compared with ERBB2 wild type. In the co-mutational analyses, CDH1 mutation was more frequent in the ERBB2-mutated group (FDR < 1). Although not significant, fewer co-occurring ESR1 mutations and more KRAS mutations were identified in the ERBB2-mutated group. CONCLUSIONS: ERBB2-activating mutation was not associated with a worse OS from time of first metastatic relapse, or differences in TTP on treatment as compared with a series of matched controls. Although not significant, differences in coexisting mutations (CDH1, ESR1, and KRAS) were noted between the ERBB2-mutated and the control group.


Subject(s)
Breast Neoplasms , Carcinoma, Lobular , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Carcinoma, Lobular/pathology , Case-Control Studies , Female , Humans , Mutation , Proto-Oncogene Proteins p21(ras)/genetics , Receptor, ErbB-2/genetics , Receptor, ErbB-2/metabolism , Recurrence , Retrospective Studies
10.
JCO Clin Cancer Inform ; 6: e2100105, 2022 02.
Article in English | MEDLINE | ID: mdl-35192403

ABSTRACT

PURPOSE: The American Association for Cancer Research Project Genomics Evidence Neoplasia Information Exchange Biopharma Collaborative is a multi-institution effort to build a pan-cancer repository of genomic and clinical data curated from the electronic health record. For the research community to be confident that data extracted from electronic health record text are reliable, transparency of the approach used to ensure data quality is essential. MATERIALS AND METHODS: Four institutions participating in AACR's Project GENIE created an observational cohort of patients with cancer for whom tumor molecular profiling data, therapeutic exposures, and treatment outcomes are available and will be shared publicly with the research community. A comprehensive approach to quality assurance included assessments of (1) feasibility of the curation model through pressure test cases; (2) accuracy through programmatic queries and comparison with source data; and (3) reproducibility via double curation and code review. RESULTS: Assessments of feasibility resulted in critical modifications to the curation directives. Queries and comparison with source data identified errors that were rectified via data correction and curator retraining. Assessment of intercurator reliability indicated a reliable curation model. CONCLUSION: The transparent quality assurance processes for the GENIE BPC data ensure that the data can be used for analyses that support clinical decision making and advances in precision oncology.


Subject(s)
Neoplasms , Electronic Health Records , Humans , Medical Oncology , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/therapy , Precision Medicine , Reproducibility of Results , United States
11.
JCO Clin Cancer Inform ; 5: 995-1004, 2021 09.
Article in English | MEDLINE | ID: mdl-34554823

ABSTRACT

PURPOSE: The My Cancer Genome (MCG) knowledgebase and resulting website were launched in 2011 with the purpose of guiding clinicians in the application of genomic testing results for treatment of patients with cancer. Both knowledgebase and website were originally developed using a wiki-style approach that relied on manual evidence curation and synthesis of that evidence into cancer-related biomarker, disease, and pathway pages on the website that summarized the literature for a clinical audience. This approach required significant time investment for each page, which limited website scalability as the field advanced. To address this challenge, we designed and used an assertion-based data model that allows the knowledgebase and website to expand with the field of precision oncology. METHODS: Assertions, or computationally accessible cause and effect statements, are both manually curated from primary sources and imported from external databases and stored in a knowledge management system. To generate pages for the MCG website, reusable templates transform assertions into reconfigurable text and visualizations that form the building blocks for automatically updating disease, biomarker, drug, and clinical trial pages. RESULTS: Combining text and graph templates with assertions in our knowledgebase allows generation of web pages that automatically update with our knowledgebase. Automated page generation empowers rapid scaling of the website as assertions with new biomarkers and drugs are added to the knowledgebase. This process has generated more than 9,100 clinical trial pages, 18,100 gene and alteration pages, 900 disease pages, and 2,700 drug pages to date. CONCLUSION: Leveraging both computational and manual curation processes in combination with reusable templates empowers automation and scalability for both the MCG knowledgebase and MCG website.


Subject(s)
Neoplasms , Biomarkers, Tumor/genetics , Humans , Knowledge Bases , Medical Oncology , Neoplasms/genetics , Neoplasms/therapy , Precision Medicine
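
The idea of rendering pages from assertions through reusable templates can be sketched in a few lines. This is only an illustration under stated assumptions: the field names, the sentence template, and the placeholder values are hypothetical and are not taken from the My Cancer Genome data model.

```python
from string import Template

# Hypothetical sentence template; the real MCG templates are not described
# in the abstract.
SENTENCE = Template("$biomarker is associated with $effect to $drug in $disease.")


def render_page(title, assertions):
    """Turn a list of assertion dicts into page text, one sentence each."""
    lines = [title]
    lines.extend(SENTENCE.substitute(a) for a in assertions)
    return "\n".join(lines)


# Placeholder values, not real clinical assertions.
example_assertions = [
    {"biomarker": "BIOMARKER_X", "effect": "sensitivity",
     "drug": "DRUG_Y", "disease": "DISEASE_Z"},
]
page = render_page("BIOMARKER_X", example_assertions)
```

Because the page text is derived entirely from the stored assertions, regenerating the page after an assertion is added or changed updates it without manual editing, which is the scaling property the abstract emphasizes.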
12.
JAMA Netw Open ; 4(7): e2117547, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34309669

ABSTRACT

Importance: Contemporary observational cancer research requires associating genomic biomarkers with reproducible end points; overall survival (OS) is a key end point, but interpretation can be challenging when multiple lines of therapy and prolonged survival are common. Progression-free survival (PFS), time to treatment discontinuation (TTD), and time to next treatment (TTNT) are alternative end points, but their utility as surrogates for OS in real-world clinicogenomic data sets has not been well characterized. Objective: To measure correlations between candidate surrogate end points and OS in a multi-institutional clinicogenomic data set. Design, Setting, and Participants: A retrospective cohort study was conducted of patients with non-small cell lung cancer (NSCLC) or colorectal cancer (CRC) whose tumors were genotyped at 4 academic centers from January 1, 2014, to December 31, 2017, and who initiated systemic therapy for advanced disease. Patients were followed up through August 31, 2020 (NSCLC), and October 31, 2020 (CRC). Statistical analyses were conducted on January 5, 2021. Exposures: Candidate surrogate end points included TTD; TTNT; PFS based on imaging reports only; PFS based on medical oncologist ascertainment only; PFS based on either imaging or medical oncologist ascertainment, whichever came first; and PFS defined by a requirement that both imaging and medical oncologist ascertainment have indicated progression. Main Outcomes and Measures: The primary outcome was the correlation between candidate surrogate end points and OS. Results: There were 1161 patients with NSCLC (648 women [55.8%]; mean [SD] age, 63 [11] years) and 1150 with CRC (647 men [56.3%]; mean [SD] age, 54 [12] years) identified for analysis. Progression-free survival based on both imaging and medical oncologist documentation was most correlated with OS (NSCLC: ρ = 0.76; 95% CI, 0.73-0.79; CRC: ρ = 0.73; 95% CI, 0.69-0.75). 
Time to treatment discontinuation was least associated with OS (NSCLC: ρ = 0.45; 95% CI, 0.40-0.50; CRC: ρ = 0.13; 95% CI, 0.06-0.19). Time to next treatment was modestly associated with OS (NSCLC: ρ = 0.60; 95% CI, 0.55-0.64; CRC: ρ = 0.39; 95% CI, 0.32-0.46). Conclusions and Relevance: This cohort study suggests that PFS based on both a radiologist and a treating oncologist determining that a progression event has occurred was the surrogate end point most highly correlated with OS for analysis of observational clinicogenomic data.


Subject(s)
Carcinoma, Non-Small-Cell Lung/mortality , Colorectal Neoplasms/mortality , Genomics/methods , Lung Neoplasms/mortality , Medical Oncology/statistics & numerical data , Aged , Biomarkers, Tumor/analysis , Female , Genotype , Humans , Male , Middle Aged , Progression-Free Survival , Radiology/statistics & numerical data , Retrospective Studies , Time-to-Treatment/statistics & numerical data , Withholding Treatment/statistics & numerical data
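
For readers unfamiliar with the ρ statistic quoted above, the following is a minimal sketch of a plain Spearman rank correlation. It is a simplification: it assumes fully observed times and ignores censoring, and the abstract does not state which estimator the authors used, so this should not be read as their method.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A value near 1 means patients with a longer surrogate end point also tend to have longer OS; real-world time-to-event data would additionally require handling of censored observations.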
13.
Article in English | MEDLINE | ID: mdl-32923903

ABSTRACT

PURPOSE: Our goal was to identify the opportunities and challenges in analyzing data from the American Association for Cancer Research Project Genomics Evidence Neoplasia Information Exchange (GENIE), a multi-institutional database derived from clinically driven genomic testing, at both the inter- and the intra-institutional level. Inter-institutionally, we identified genotypic differences between primary and metastatic tumors across the 3 most represented cancers in GENIE. Intra-institutionally, we analyzed the clinical characteristics of the Vanderbilt-Ingram Cancer Center (VICC) subset of GENIE to inform the interpretation of GENIE as a whole. METHODS: We performed overall cohort matching on the basis of age, ethnicity, and sex of 13,208 patients stratified by cancer type (breast, colon, or lung) and sample site (primary or metastatic). We then determined whether detected variants, at the gene level, were associated with primary or metastatic tumors. We extracted clinical data for the VICC subset from VICC's clinical data warehouse. Treatment exposures were mapped to a 13-class schema derived from the HemOnc ontology. RESULTS: Across 756 genes, there were significant differences in all cancer types. In breast cancer, ESR1 variants were over-represented in metastatic samples (odds ratio, 5.91; q < 10^-6). TP53 mutations were over-represented in metastatic samples across all cancers. VICC had a significantly different cancer type distribution than that of GENIE but patients were well matched with respect to age, sex, and sample type. Treatment data from VICC were used for a bipartite network analysis, demonstrating clusters with a mix of histologies and others being more histology specific. CONCLUSION: This article demonstrates the feasibility of deriving meaningful insights from GENIE at the inter- and intra-institutional level and illuminates the opportunities and challenges of the data GENIE contains.
The results should help guide future development of GENIE, with the goal of fully realizing its potential for accelerating precision medicine.

14.
JCO Clin Cancer Inform ; 4: 691-699, 2020 08.
Article in English | MEDLINE | ID: mdl-32755461

ABSTRACT

PURPOSE: As data-sharing projects become increasingly frequent, so does the need to map data elements between multiple classification systems. A generic, robust, shareable architecture will result in increased efficiency and transparency of the mapping process, while upholding the integrity of the data. MATERIALS AND METHODS: The American Association for Cancer Research's Genomics Evidence Neoplasia Information Exchange (GENIE) collects clinical and genomic data for precision cancer medicine. As part of its commitment to open science, GENIE has partnered with the National Cancer Institute's Genomic Data Commons (GDC) as a secondary repository. After initial efforts to submit data from GENIE to GDC failed, we realized the need for a solution to allow for the iterative mapping of data elements between dynamic classification systems. We developed the Linked Entity Attribute Pair (LEAP) database framework to store and manage the term mappings used to submit data from GENIE to GDC. RESULTS: After creating and populating the LEAP framework, we identified 195 mappings from GENIE to GDC requiring remediation and observed a 28% reduction in effort to resolve these issues, as well as a reduction in inadvertent errors. These results led to a decrease in the time to map between OncoTree, the cancer type ontology used by GENIE, and International Classification of Disease for Oncology, 3rd Edition, used by GDC, from several months to less than 1 week. CONCLUSION: The LEAP framework provides a streamlined mapping process among various classification systems and allows for reusability so that efforts to create or adjust mappings are straightforward. The ability of the framework to track changes over time streamlines the process to map data elements across various dynamic classification systems.


Subject(s)
Genomics , Neoplasms , Databases, Factual , Humans , Information Dissemination , Neoplasms/genetics , Precision Medicine , United States
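
The abstract above does not describe the internals of the LEAP framework, so the following is only a hypothetical sketch of the general idea it names: storing term mappings between two classification systems while tracking changes over time. The class name `MappingStore`, its methods, and the placeholder codes are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MappingStore:
    """Hypothetical store of source-term -> target-term mappings with a log."""
    current: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def set_mapping(self, source_term, target_term, note=""):
        """Create or revise a mapping and record the change."""
        previous = self.current.get(source_term)
        self.current[source_term] = target_term
        self.history.append((source_term, previous, target_term, note))

    def unmapped(self, source_terms):
        """List source terms that still need a mapping (remediation queue)."""
        return [t for t in source_terms if t not in self.current]


# Placeholder codes, not real OncoTree or ICD-O-3 values.
store = MappingStore()
store.set_mapping("SOURCE_CODE_A", "TARGET_CODE_1", note="initial mapping")
store.set_mapping("SOURCE_CODE_A", "TARGET_CODE_2", note="revised after review")
pending = store.unmapped(["SOURCE_CODE_A", "SOURCE_CODE_B"])
```

Keeping the change log alongside the current mapping means a revised mapping can be audited or reused when either classification system releases a new version.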
15.
J Am Med Inform Assoc ; 27(7): 1057-1066, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32483629

ABSTRACT

OBJECTIVE: As clinical trials evolve in complexity, clinical trial data models that can capture relevant trial data in meaningful, structured annotations and computable forms are needed to support accrual. MATERIAL AND METHODS: We have developed a clinical trial information model, curation information system, and a standard operating procedure for consistent and accurate annotation of cancer clinical trials. Clinical trial documents are pulled into the curation system from publicly available sources. Using a web-based interface, a curator creates structured assertions related to disease-biomarker eligibility criteria, therapeutic context, and treatment cohorts by leveraging our data model features. These structured assertions are published on the My Cancer Genome (MCG) website. RESULTS: To date, over 5000 oncology trials have been manually curated. All trial assertion data are available for public view on the MCG website. Querying our structured knowledge base, we performed a landscape analysis to assess the top diseases, biomarker alterations, and drugs featured across all cancer trials. DISCUSSION: Beyond curating commonly captured elements, such as disease and biomarker eligibility criteria, we have expanded our model to support the curation of trial interventions and therapeutic context (ie, neoadjuvant, metastatic, etc.), and the respective biomarker-disease treatment cohorts. To the best of our knowledge, this is the first effort to capture these fields in a structured format. CONCLUSION: This paper makes a significant contribution to the field of biomedical informatics and knowledge dissemination for precision oncology via the MCG website. KEY WORDS: knowledge representation, My Cancer Genome, precision oncology, knowledge curation, cancer informatics, clinical trial data model.


Subject(s)
Clinical Trials as Topic , Data Curation , Data Mining/methods , Neoplasms/genetics , Precision Medicine , Artificial Intelligence , Biomarkers , Eligibility Determination , Genome , Humans , Internet , Natural Language Processing , Workflow
16.
Cancer Discov ; 10(4): 526-535, 2020 04.
Article in English | MEDLINE | ID: mdl-31924700

ABSTRACT

AKT inhibitors have promising activity in AKT1 E17K-mutant estrogen receptor (ER)-positive metastatic breast cancer, but the natural history of this rare genomic subtype remains unknown. Utilizing AACR Project GENIE, an international clinicogenomic data-sharing consortium, we conducted a comparative analysis of clinical outcomes of patients with matched AKT1 E17K-mutant (n = 153) and AKT1-wild-type (n = 302) metastatic breast cancer. AKT1-mutant cases had similar adjusted overall survival (OS) compared with AKT1-wild-type controls (median OS, 24.1 vs. 29.9, respectively; P = 0.98). AKT1-mutant cases enjoyed longer durations on mTOR inhibitor therapy, an observation previously unrecognized in pivotal clinical trials due to the rarity of this alteration. Other baseline clinicopathologic features, as well as durations on other classes of therapy, were broadly similar. In summary, we demonstrate the feasibility of using a novel and publicly accessible clinicogenomic registry to define outcomes in a rare genomically defined cancer subtype, an approach with broad applicability to precision oncology. SIGNIFICANCE: We delineate the natural history of a rare genomically distinct cancer, AKT1 E17K-mutant ER-positive breast cancer, using a publicly accessible registry of real-world patient data, thereby illustrating the potential to inform drug registration through synthetic control data. See related commentary by Castellanos and Baxi, p. 490.


Subject(s)
Breast Neoplasms/genetics , Proto-Oncogene Proteins c-akt/metabolism , Adult , Aged , Aged, 80 and over , Breast Neoplasms/pathology , Female , Humans , Middle Aged , Mutation , Registries , Treatment Outcome
17.
JCO Clin Cancer Inform ; 2: 1-14, 2018 12.
Article in English | MEDLINE | ID: mdl-30652542

ABSTRACT

The American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE) is an international data-sharing consortium focused on enabling advances in precision oncology through the gathering and sharing of tumor genetic sequencing data linked with clinical data. The project's history, operational structure, lessons learned, and institutional perspectives on participation in the data-sharing consortium are reviewed. Individuals involved with the inception and execution of AACR Project GENIE from each member institution described their experiences and lessons learned. The consortium was conceived in January 2014 and publicly released its first data set in January 2017, which consisted of 18,804 samples from 18,324 patients contributed by the eight founding institutions. Commitment and contributions from many individuals at AACR and the member institutions were crucial to the consortium's success. These individuals filled leadership, project management, informatics, data curation, contracts, ethics, and security roles. Many lessons were learned during the first 3 years of the consortium, including on how to gather, harmonize, and share data; how to make decisions and foster collaboration; and how to set the stage for continued participation and expansion of the consortium. We hope that the lessons shared here will assist new GENIE members as well as others who embark on the journey of forming a genomic data-sharing consortium.


Subject(s)
Genomics/methods , Neoplasms/genetics , Data Collection , Humans , Information Dissemination , Intersectoral Collaboration , Precision Medicine , Societies, Medical , United States
18.
Biochem Biophys Res Commun ; 475(1): 64-9, 2016 06 17.
Article in English | MEDLINE | ID: mdl-27169767

ABSTRACT

Alpha4 is a non-canonical regulatory subunit of Type 2A protein phosphatases that interacts directly with the phosphatase catalytic subunits (PP2Ac, PP4c, and PP6c) and is upregulated in a variety of cancers. Alpha4 modulates phosphatase expression levels and activity, but the molecular mechanism of this regulation is unclear, and the extent to which the various Type 2A catalytic subunits associate with Alpha4 is also unknown. To determine the relative fractions of the Type 2A catalytic subunits associated with Alpha4, we conducted Alpha4 immunodepletion experiments in HEK293T cells and found that a significant fraction of total PP6c is associated with Alpha4, whereas a minimal fraction of total PP2Ac is associated with Alpha4. To facilitate studies of phosphatases in the presence of mutant or null Alpha4 alleles, we developed a facile and rapid method to simultaneously knock down and rescue Alpha4 in tissue culture cells. This approach has the advantage that levels of endogenous Alpha4 are dramatically reduced by shRNA expression, thereby simplifying interpretation of mutant phenotypes. We used this system to show that knockdown of Alpha4 preferentially impacts the expression of PP4c and PP6c compared to expression levels of PP2Ac.


Subject(s)
Intracellular Signaling Peptides and Proteins/metabolism , Phosphoprotein Phosphatases/metabolism , Protein Phosphatase 2/metabolism , Adaptor Proteins, Signal Transducing , Catalytic Domain , Gene Knockdown Techniques , HEK293 Cells , HeLa Cells , Humans , Intracellular Signaling Peptides and Proteins/analysis , Intracellular Signaling Peptides and Proteins/genetics , Molecular Chaperones , Phosphoprotein Phosphatases/analysis , Protein Phosphatase 2/analysis
19.
J Biol Chem ; 286(20): 17665-71, 2011 May 20.
Article in English | MEDLINE | ID: mdl-21454489

ABSTRACT

Protein phosphatase 2A (PP2A) is regulated through a variety of mechanisms, including post-translational modifications and association with regulatory proteins. Alpha4 is one such regulatory protein that binds the PP2A catalytic subunit (PP2Ac) and protects it from polyubiquitination and degradation. Alpha4 is a multidomain protein with a C-terminal domain that binds Mid1, a putative E3 ubiquitin ligase, and an N-terminal domain containing the PP2Ac-binding site. In this work, we present the structure of the N-terminal domain of mammalian Alpha4 determined by x-ray crystallography and use double electron-electron resonance spectroscopy to show that it is a flexible tetratricopeptide repeat-like protein. Structurally, Alpha4 differs from its yeast homolog, Tap42, in two important ways: 1) the position of the helix containing the PP2Ac-binding residues is in a more open conformation, showing flexibility in this region; and 2) Alpha4 contains a ubiquitin-interacting motif. The effects of wild-type and mutant Alpha4 on PP2Ac ubiquitination and stability were examined in mammalian cells by performing tandem ubiquitin-binding entity precipitations and cycloheximide chase experiments. Our results reveal that both the C-terminal Mid1-binding domain and the PP2Ac-binding determinants are required for Alpha4-mediated protection of PP2Ac from polyubiquitination and degradation.


Subject(s)
Intracellular Signaling Peptides and Proteins/metabolism , Phosphoproteins/metabolism , Protein Phosphatase 2/metabolism , Ubiquitin-Protein Ligases/metabolism , Ubiquitin/metabolism , Ubiquitination/physiology , Adaptor Proteins, Signal Transducing , Amino Acid Motifs , Animals , Binding Sites , Crystallography, X-Ray , HEK293 Cells , Humans , Intercellular Signaling Peptides and Proteins , Intracellular Signaling Peptides and Proteins/chemistry , Intracellular Signaling Peptides and Proteins/genetics , Mice , Molecular Chaperones , Phosphoproteins/chemistry , Phosphoproteins/genetics , Protein Phosphatase 2/chemistry , Protein Phosphatase 2/genetics , Protein Structure, Tertiary , Ubiquitin/chemistry , Ubiquitin/genetics , Ubiquitin-Protein Ligases/chemistry , Ubiquitin-Protein Ligases/genetics