Results 1 - 12 of 12
1.
Sci Rep ; 14(1): 13989, 2024 06 18.
Article in English | MEDLINE | ID: mdl-38886371

ABSTRACT

In vitro evolution and whole genome analysis has proven to be a powerful method for studying the mechanism of action (MOA) of small molecules in many haploid microbes, but it has generally not been applied to human cell lines, in part because their diploid state complicates the identification of variants that confer drug resistance. To determine whether haploid human cells could be used in MOA studies, we evolved resistance to five different anticancer drugs (doxorubicin, gemcitabine, etoposide, topotecan, and paclitaxel) using a near-haploid cell line (HAP1) and then analyzed the genomes of the drug-resistant clones, developing a bioinformatic pipeline that filtered for high-frequency alleles predicted to change protein sequence, or for alleles that appeared in the same gene in multiple independent selections with the same compound. Applying the filter to sequences from 28 drug-resistant clones identified a set of 21 genes that was strongly enriched for known resistance genes or known drug targets (TOP1, TOP2A, DCK, WDR33, SLCO3A1). In addition, some lines carried structural variants that encompassed additional known resistance genes (ABCB1, WWOX and RRM1). Gene expression knockdown and knockout experiments on 10 validation targets showed a high degree of specificity and accuracy in our calls and demonstrated that the same drug resistance mechanisms found in diverse clinical samples can be evolved, discovered and studied in an isogenic background.
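The two filtering rules named in the abstract above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the `Variant` record, the `filter_candidates` function, the 0.85 allele-fraction threshold and the two-clone recurrence cutoff are all assumptions made for the example, since the abstract gives no thresholds or data format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one variant call in one resistant clone."""
    clone: str              # clone in which the variant was called
    compound: str           # drug used for the selection
    gene: str
    allele_fraction: float  # fraction of reads supporting the variant
    changes_protein: bool   # predicted to alter the protein sequence


def filter_candidates(variants, min_fraction=0.85, min_clones=2):
    """Keep variants that pass either rule (both thresholds are assumed):
    1. high allele fraction AND predicted protein change, or
    2. the gene is hit in >= min_clones independent clones selected
       with the same compound.
    """
    clones_by_hit = defaultdict(set)
    for v in variants:
        clones_by_hit[(v.compound, v.gene)].add(v.clone)
    recurrent = {hit for hit, clones in clones_by_hit.items()
                 if len(clones) >= min_clones}
    return [
        v for v in variants
        if (v.allele_fraction >= min_fraction and v.changes_protein)
        or (v.compound, v.gene) in recurrent
    ]


# Toy input with placeholder names; not data from the study.
toy = [
    Variant("c1", "drugX", "GENE_A", 0.95, True),   # passes rule 1
    Variant("c1", "drugX", "GENE_B", 0.30, True),   # fails both rules
    Variant("c2", "drugX", "GENE_C", 0.40, False),  # passes rule 2
    Variant("c3", "drugX", "GENE_C", 0.50, False),  # passes rule 2
    Variant("c4", "drugY", "GENE_C", 0.40, False),  # other compound, fails
]
kept_genes = [v.gene for v in filter_candidates(toy)]
```

Keying the recurrence check on the (compound, gene) pair keeps hits from selections with different drugs from counting toward each other, which matches the "same compound" wording in the abstract.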


Subject(s)
Antineoplastic Agents , Drug Resistance, Neoplasm , Haploidy , Humans , Drug Resistance, Neoplasm/genetics , Antineoplastic Agents/pharmacology , Genome, Human , Whole Genome Sequencing/methods , Cell Line
2.
Database (Oxford) ; 2021, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33914028

ABSTRACT

High-quality metadata annotations for data hosted in large public repositories are essential for research reproducibility and for conducting fast, powerful and scalable meta-analyses. Currently, a majority of sequencing samples in the National Center for Biotechnology Information's Sequence Read Archive (SRA) are missing metadata across several categories. In an effort to improve the metadata coverage of these samples, we leveraged almost 44 million attribute-value pairs from SRA BioSample to train a scalable, recurrent neural network that predicts missing metadata via named entity recognition (NER). The network was first trained to classify short text phrases according to 11 metadata categories and achieved an overall accuracy and area under the receiver operating characteristic curve of 85.2% and 0.977, respectively. We then applied our classifier to predict 11 metadata categories from the longer TITLE attribute of samples, evaluating performance on a set of samples withheld from model training. Prediction accuracies were high when extracting sample Genus/Species (94.85%), Condition/Disease (95.65%) and Strain (82.03%) from TITLEs, with lower accuracies and lack of predictions for other categories highlighting multiple issues with the current metadata annotations in BioSample. These results indicate the utility of recurrent neural networks for NER-based metadata prediction and the potential for models such as the one presented here to increase metadata coverage in BioSample while minimizing the need for manual curation. Database URL: https://github.com/cartercompbio/PredictMEE.


Subject(s)
Deep Learning , Metadata , High-Throughput Nucleotide Sequencing , Reproducibility of Results , Software
3.
Proc Natl Acad Sci U S A ; 118(8), 2021 02 23.
Article in English | MEDLINE | ID: mdl-33602823

ABSTRACT

Many cancers evade immune rejection by suppressing major histocompatibility class I (MHC-I) antigen processing and presentation (AgPP). Such cancers do not respond to immune checkpoint inhibitor therapies (ICIT) such as PD-1/PD-L1 [PD-(L)1] blockade. Certain chemotherapeutic drugs augment tumor control by PD-(L)1 inhibitors through potentiation of T-cell priming but whether and how chemotherapy enhances MHC-I-dependent cancer cell recognition by cytotoxic T cells (CTLs) is not entirely clear. We now show that the lysine acetyl transferases p300/CREB binding protein (CBP) control MHC-I AgPPM expression and neoantigen amounts in human cancers. Moreover, we found that two distinct DNA damaging drugs, the platinoid oxaliplatin and the topoisomerase inhibitor mitoxantrone, strongly up-regulate MHC-I AgPP in a manner dependent on activation of nuclear factor kappa B (NF-κB), p300/CBP, and other transcription factors, but independently of autocrine IFNγ signaling. Accordingly, NF-κB and p300 ablations prevent chemotherapy-induced MHC-I AgPP and abrogate rejection of low MHC-I-expressing tumors by reinvigorated CD8+ CTLs. Drugs like oxaliplatin and mitoxantrone may be used to overcome resistance to PD-(L)1 inhibitors in tumors that had "epigenetically down-regulated," but had not permanently lost MHC-I AgPP activity.


Subject(s)
Antigen Presentation/immunology , Gene Expression Regulation, Neoplastic/drug effects , Histocompatibility Antigens Class I/immunology , Immune Checkpoint Inhibitors/pharmacology , NF-kappa B/metabolism , Neoplasms/drug therapy , p300-CBP Transcription Factors/metabolism , Animals , Antineoplastic Agents/pharmacology , Apoptosis , B7-H1 Antigen/genetics , B7-H1 Antigen/metabolism , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , CD8-Positive T-Lymphocytes , Cell Proliferation , Drug Therapy, Combination , Humans , Immunotherapy/methods , Mice , NF-kappa B/genetics , Neoplasms/immunology , Neoplasms/metabolism , Neoplasms/pathology , Oxaliplatin/pharmacology , Prognosis , Survival Rate , Tumor Cells, Cultured , Xenograft Model Antitumor Assays , p300-CBP Transcription Factors/genetics
4.
Blood Cancer J ; 10(2): 16, 2020 02 06.
Article in English | MEDLINE | ID: mdl-32029705

ABSTRACT

Large-scale chromosomal translocations are frequent oncogenic drivers in acute myeloid leukemia (AML). These translocations often occur in critical transcriptional/epigenetic regulators and contribute to malignant cell growth through alteration of normal gene expression. Despite this knowledge, the specific gene expression alterations that contribute to the development of leukemia remain incompletely understood. Here, through characterization of transcriptional regulation by the RUNX1-ETO fusion protein, we have identified Ras-association domain family member 2 (RASSF2) as a critical gene that is aberrantly transcriptionally repressed in t(8;21)-associated AML. Re-expression of RASSF2 specifically inhibits t(8;21) AML development in multiple models. Through biochemical and functional studies, we demonstrate RASSF2-mediated functions to be dependent on interaction with Hippo kinases, MST1 and MST2, but independent of canonical Hippo pathway signaling. Using proximity-based biotin labeling we define the RASSF2-proximal proteome in leukemia cells and reveal association with Rac GTPase-related proteins, including an interaction with the guanine nucleotide exchange factor, DOCK2. Importantly, RASSF2 knockdown impairs Rac GTPase activation, and RASSF2 expression is broadly correlated with Rac-mediated signal transduction in AML patients. Together, these data reveal a previously unappreciated mechanistic link between RASSF2, Hippo kinases, and Rac activity with potentially broad functional consequences in leukemia.


Subject(s)
Chromosomes, Human, Pair 21/genetics , Chromosomes, Human, Pair 8/genetics , Gene Expression Regulation, Neoplastic , Leukemia, Myeloid, Acute/prevention & control , Oncogene Proteins, Fusion/metabolism , Translocation, Genetic , Tumor Suppressor Proteins/metabolism , rac GTP-Binding Proteins/metabolism , Animals , Biomarkers, Tumor/genetics , Humans , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/metabolism , Leukemia, Myeloid, Acute/pathology , Mice , Mice, Inbred C57BL , Oncogene Proteins, Fusion/genetics , RNA, Long Noncoding , Tumor Cells, Cultured , Tumor Suppressor Proteins/genetics , Xenograft Model Antitumor Assays , rac GTP-Binding Proteins/genetics
5.
Nutrients ; 11(9), 2019 Sep 13.
Article in English | MEDLINE | ID: mdl-31540208

ABSTRACT

Anorexia nervosa (AN) is a psychiatric disorder affected by psychological, environmental, and biological factors. Individuals with AN avoid high-fat, high-calorie diets and have shown abnormal metabolism of fatty acids (FAs), which are essential for brain and cognitive/neuropsychiatric health. To clarify the relationship between FAs and AN, fasting and postprandial plasma FAs in AN patients and age-matched control women were analyzed via mass spectrometry. Clinical phenotypes were assessed using the Beck Anxiety Inventory and the Beck Depression Inventory. AN patients and controls exhibited different FA signatures at both fasting and postprandial timepoints. Lauric acid, eicosapentaenoic acid (EPA), docosapentaenoic acid (DPA), and alpha-linolenic acid (ALA) were higher in AN than in controls (lauric acid: 15,081.6 ± 14,970.2 vs. 8257.4 ± 4740.2 pmol/mL; ALA at fasting: 2217.7 ± 1587.6 vs. 1087.9 ± 821.2 pmol/mL; ALA at postprandial: 1830.9 ± 1115.6 vs. 1159.4 ± 664.7 pmol/mL; EPA: 33,788.3 ± 17,487.5 vs. 22,860.6 ± 12,642.4 pmol/mL; DPA: 32,664.8 ± 16,215.0 vs. 20,969.0 ± 12,350.0 pmol/mL; FDR-adjusted p-values < 0.05). Food intake and AN status modified the correlations of FAs with body mass index (BMI), depression, and anxiety. Desaturases SCD-18 and D6D showed lower activities in AN compared to controls. Altered FA signatures, specifically correlations between elevated n-3 FAs and worsened symptoms, illustrate metabolic underpinnings in AN. Future studies should investigate the mechanisms by which FA dysregulation, specifically elevated n-3 FAs, affects AN risk and outcome.


Subject(s)
Anorexia Nervosa/blood , Eating/physiology , Fatty Acids/blood , Adult , Anorexia Nervosa/psychology , Anxiety/blood , Depression/blood , Eicosapentaenoic Acid/blood , Fasting , Fatty Acid Desaturases , Fatty Acid Elongases , Fatty Acids, Omega-3/blood , Fatty Acids, Omega-6/blood , Fatty Acids, Unsaturated/blood , Female , Humans , Postprandial Period
6.
Pac Symp Biocomput ; 24: 196-207, 2019.
Article in English | MEDLINE | ID: mdl-30864322

ABSTRACT

The Sequence Read Archive (SRA) contains over one million publicly available sequencing runs from various studies using a variety of sequencing library strategies. These data inherently contain information about underlying genomic sequence variants, which we exploit to extract allelic read counts on an unprecedented scale. We reprocessed over 250,000 human sequencing runs (>1000 TB of raw sequence data) into a single unified dataset of allelic read counts for nearly 300,000 variants of biomedical relevance curated by NCBI dbSNP, where germline variants were detected in a median of 912 sequencing runs, and somatic variants were detected in a median of 4,876 sequencing runs, suggesting that this dataset facilitates identification of sequencing runs that harbor variants of interest. Allelic read counts obtained using a targeted alignment were very similar to read counts obtained from whole-genome alignment. Analyzing allelic read count data for matched DNA and RNA samples from tumors, we find that RNA-seq can also recover variants identified by Whole Exome Sequencing (WXS), suggesting that reprocessed allelic read counts can support variant detection across different library strategies in SRA. This study provides a rich database of known human variants across SRA samples that can support future meta-analyses of human sequence variation.


Subject(s)
Alleles , Databases, Nucleic Acid/statistics & numerical data , Genome, Human , High-Throughput Nucleotide Sequencing/statistics & numerical data , Big Data , Computational Biology , Genetic Variation , Humans , Metadata , Neoplasms/genetics , Polymorphism, Single Nucleotide , Single-Cell Analysis , Exome Sequencing/statistics & numerical data
7.
Proc Natl Acad Sci U S A ; 115(42): E9879-E9888, 2018 10 16.
Article in English | MEDLINE | ID: mdl-30287485

ABSTRACT

Cancer genomics has enabled the exhaustive molecular characterization of tumors and exposed hepatocellular carcinoma (HCC) as among the most complex cancers. This complexity is paralleled by dozens of mouse models that generate histologically similar tumors but have not been systematically validated at the molecular level. Accurate models of the molecular pathogenesis of HCC are essential for biomedical progress; therefore we compared genomic and transcriptomic profiles of four separate mouse models [MUP transgenic, TAK1-knockout, carcinogen-driven diethylnitrosamine (DEN), and Stelic Animal Model (STAM)] with those of 987 HCC patients with distinct etiologies. These four models differed substantially in their mutational load, mutational signatures, affected genes and pathways, and transcriptomes. STAM tumors were most molecularly similar to human HCC, with frequent mutations in Ctnnb1, similar pathway alterations, and high transcriptomic similarity to high-grade, proliferative human tumors with poor prognosis. In contrast, TAK1 tumors better reflected the mutational signature of human HCC and were transcriptionally similar to low-grade human tumors. DEN tumors were least similar to human disease and almost universally carried the Braf V637E mutation, which is rarely found in human HCC. Immune analysis revealed that strain-specific MHC-I genotype can influence the molecular makeup of murine tumors. Thus, different mouse models of HCC recapitulate distinct aspects of HCC biology, and their use should be adapted to specific questions based on the molecular features provided here.


Subject(s)
Biomarkers, Tumor/genetics , Carcinoma, Hepatocellular/genetics , Gene Expression Profiling , Genomics/methods , Liver Neoplasms, Experimental/genetics , Liver Neoplasms/genetics , Animals , Carcinoma, Hepatocellular/pathology , Disease Models, Animal , Humans , Liver Neoplasms/pathology , Liver Neoplasms, Experimental/pathology , Mice , Mice, Inbred C57BL , Transcriptome
8.
J Mol Biol ; 430(18 Pt A): 2875-2899, 2018 09 14.
Article in English | MEDLINE | ID: mdl-29908887

ABSTRACT

Precision cancer medicine promises to tailor clinical decisions to patients using genomic information. Indeed, successes of drugs targeting genetic alterations in tumors, such as imatinib that targets BCR-ABL in chronic myelogenous leukemia, have demonstrated the power of this approach. However, biological systems are complex, and patients may differ not only by the specific genetic alterations in their tumor, but also by more subtle interactions among such alterations. Systems biology and more specifically, network analysis, provides a framework for advancing precision medicine beyond clinical actionability of individual mutations. Here we discuss applications of network analysis to study tumor biology, early methods for N-of-1 tumor genome analysis, and the path for such tools to the clinic.


Subject(s)
Medical Oncology/statistics & numerical data , Neoplasms/epidemiology , Precision Medicine/statistics & numerical data , Algorithms , Disease Susceptibility , Genomics/methods , Humans , Medical Oncology/standards , Neoplasms/etiology , Neoplasms/metabolism , Neoplasms/therapy , Neural Networks, Computer , Precision Medicine/standards , Prognosis , Systems Biology
9.
BMC Bioinformatics ; 18(1): 286, 2017 May 31.
Article in English | MEDLINE | ID: mdl-28569140

ABSTRACT

BACKGROUND: Recently, copy number variation (CNV) has gained considerable interest as a type of genomic/genetic variation that plays an important role in disease susceptibility. Advances in sequencing technology have created an opportunity for detecting CNVs more accurately. Whole exome sequencing (WES) has recently become the primary strategy for sequencing patient samples and studying their genomic aberrations. However, compared to whole genome sequencing, WES introduces more biases and noise that make CNV detection very challenging. Additionally, the complexity of tumors makes the detection of cancer-specific CNVs even more difficult. Although many CNV detection tools have been developed since the introduction of NGS data, there are few tools for somatic CNV detection for WES data in cancer. RESULTS: In this study, we evaluated the performance of the most recent and commonly used CNV detection tools for WES data in cancer to address their limitations and provide guidelines for developing new ones. We focused on the tools that have been designed for, or have the ability to detect, somatic aberrations in cancer. We compared the performance of the tools in terms of sensitivity and false discovery rate (FDR) using real data and simulated data. Comparative analysis of the results showed that there is low consensus among the tools in calling CNVs. Using real data, the tools show moderate sensitivity (~50% - ~80%), fair specificity (~70% - ~94%) and poor FDRs (~27% - ~60%). Also, using simulated data, we observed that increasing the coverage beyond 10× in exonic regions does not significantly improve the detection power of the tools. CONCLUSIONS: The limited performance of the current CNV detection tools for WES data in cancer indicates the need for developing more efficient and precise CNV detection methods. Due to the complexity of tumors and the high level of noise and biases in WES data, it is necessary to employ advanced novel segmentation, normalization and de-noising techniques that are designed specifically for cancer data. Also, the development of CNV detection tools suffers from the lack of a gold standard for performance evaluation. Finally, developing tools with user-friendly interfaces and visualization features can enhance CNV studies for a broader range of users.
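For readers comparing the three figures quoted above, the standard definitions of sensitivity, specificity and FDR can be written out as a short Python sketch. The function name `detection_metrics` and the counts in the example are assumptions for illustration only. The abstract does not say how calls were matched to the truth set (per gene, per segment or per base), so the counts are simply taken as given.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a CNV caller.

    tp/fp/tn/fn are counts of true/false positive/negative calls; how a
    call is matched against the truth set is left to the caller.
    Returns None for a metric whose denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of real CNVs that were called
        "specificity": ratio(tn, tn + fp),  # share of non-CNV regions left uncalled
        "fdr": ratio(fp, tp + fp),          # share of calls that are wrong
    }


# Made-up counts, not taken from the paper.
example = detection_metrics(tp=60, fp=40, tn=160, fn=40)
```

With these made-up counts the caller recovers 60% of true CNVs while 40% of its calls are false, which shows how moderate sensitivity and a poor FDR can coexist with seemingly fair specificity when negatives outnumber positives.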


Subject(s)
DNA Copy Number Variations , Exome , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Software , Algorithms , Female , Genome, Human , Humans , Sequence Analysis, DNA/methods
10.
Ann N Y Acad Sci ; 1387(1): 73-83, 2017 01.
Article in English | MEDLINE | ID: mdl-27681358

ABSTRACT

Accessing and integrating human genomic data with phenotypes are important for biomedical research. Making genomic data accessible for research purposes, however, must be handled carefully to avoid leakage of sensitive individual information to unauthorized parties and improper use of data. In this article, we focus on data sharing within the scope of data accessibility for research. Current common practices to gain biomedical data access are strictly rule based, without a clear and quantitative measurement of the risk of privacy breaches. In addition, several types of studies require privacy-preserving linkage of genotype and phenotype information across different locations (e.g., genotypes stored in a sequencing facility and phenotypes stored in an electronic health record) to accelerate discoveries. The computer science community has developed a spectrum of techniques for data privacy and confidentiality protection, many of which have yet to be tested on real-world problems. In this article, we discuss clinical, technical, and ethical aspects of genome data privacy and confidentiality in the United States, as well as potential solutions for privacy-preserving genotype-phenotype linkage in biomedical research.


Subject(s)
Genetic Privacy , Genomics/methods , Computational Biology/ethics , Computational Biology/standards , Computational Biology/trends , Computer Security , Data Mining/ethics , Data Mining/standards , Data Mining/trends , Genetic Privacy/ethics , Genetic Privacy/legislation & jurisprudence , Genetic Privacy/standards , Genetic Privacy/trends , Genomics/ethics , Genomics/standards , Genomics/trends , Humans , Informed Consent/legislation & jurisprudence , Informed Consent/standards , Medical Record Linkage/standards , Risk Management , United States
11.
Sci Rep ; 6: 30064, 2016 07 25.
Article in English | MEDLINE | ID: mdl-27452728

ABSTRACT

Tumor infiltrating lymphocytes (TILs) have been associated with favorable prognosis in multiple tumor types. The Cancer Genome Atlas (TCGA) represents the largest collection of cancer molecular data, but lacks detailed information about the immune environment. Here, we show that exome reads mapping to the complementarity-determining-region 3 (CDR3) of mature T-cell receptor beta (TCRB) can be used as an immune DNA (iDNA) signature. Specifically, we propose a method to identify CDR3 reads in a breast tumor exome and validate it using deep TCRB sequencing. In 1,078 TCGA breast cancer exomes, the fraction of CDR3 reads was associated with TILs fraction, tumor purity, adaptive immunity gene expression signatures and improved survival in Her2+ patients. Only 2/839 TCRB clonotypes were shared between patients and none associated with a specific HLA allele or somatic driver mutations. The iDNA biomarker enriches the comprehensive dataset collected through TCGA, revealing associations with other molecular features and clinical outcomes.


Subject(s)
Breast Neoplasms/genetics , Exome/genetics , Genes, T-Cell Receptor beta/genetics , Lymphocytes, Tumor-Infiltrating/immunology , Receptors, Antigen, T-Cell, alpha-beta/genetics , T-Lymphocytes/immunology , Adaptive Immunity/genetics , Complementarity Determining Regions/genetics , Female , Gene Expression Profiling , Humans , Lymphocytes, Tumor-Infiltrating/cytology , T-Lymphocytes/cytology
12.
AMIA Annu Symp Proc ; 2016: 1747-1755, 2016.
Article in English | MEDLINE | ID: mdl-28269933

ABSTRACT

In this paper, we propose a framework, PRivacy-preserving EstiMation of Individual admiXture (PREMIX), using Intel software guard extensions (SGX). SGX is a suite of software and hardware architectures to enable efficient and secure computation over confidential data. PREMIX enables multiple sites to securely collaborate on estimating individual admixture within a secure enclave inside Intel SGX. We implemented a feature selection module to identify the most discriminative single nucleotide polymorphisms (SNPs) based on informativeness, and an Expectation Maximization (EM)-based maximum likelihood estimator to identify the individual admixture. Experimental results based on both simulation and 1000 Genomes data demonstrated the efficiency and accuracy of the proposed framework. PREMIX ensures a high level of security, as all operations on sensitive genomic data are conducted within a secure enclave using SGX.
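As a rough illustration of what an EM-based maximum likelihood admixture estimator does, the following Python sketch runs the textbook EM update for one individual, given known allele frequencies in each ancestral population. It is not the PREMIX implementation and involves no SGX enclave; the function name `estimate_admixture`, the input layout and the fixed iteration count are all assumptions made for the example.

```python
def estimate_admixture(genotypes, freqs, iterations=200):
    """Textbook EM estimate of one individual's admixture proportions.

    genotypes[j] in {0, 1, 2}: copies of the reference allele at SNP j.
    freqs[k][j]: reference-allele frequency at SNP j in ancestral
    population k (assumed known). Returns a list q with q[k] = estimated
    share of ancestry from population k.
    """
    k_pops = len(freqs)
    q = [1.0 / k_pops] * k_pops  # start from equal proportions
    for _ in range(iterations):
        expected = [0.0] * k_pops
        for j, g in enumerate(genotypes):
            # SNP j contributes g reference copies and 2 - g alternate copies.
            for count, is_ref in ((g, True), (2 - g, False)):
                if count == 0:
                    continue
                # E-step: posterior that an allele copy came from population k.
                weights = [
                    q[k] * (freqs[k][j] if is_ref else 1.0 - freqs[k][j])
                    for k in range(k_pops)
                ]
                total = sum(weights)
                if total == 0.0:
                    continue
                for k in range(k_pops):
                    expected[k] += count * weights[k] / total
        norm = sum(expected)
        if norm == 0.0:
            break
        # M-step: new proportions are the normalised expected copy counts.
        q = [e / norm for e in expected]
    return q


# Toy case with invented frequencies: an individual homozygous for the
# reference allele at 10 SNPs that are common in population 0 only.
toy_q = estimate_admixture([2] * 10, [[0.9] * 10, [0.1] * 10])
```

In the toy case the estimate moves almost entirely to population 0, as expected for an individual carrying only alleles that are common there. A real estimator would also need a convergence check and a feature selection step such as the informativeness ranking the abstract mentions.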


Subject(s)
Genetic Privacy , Polymorphism, Single Nucleotide , Software , Algorithms , Data Anonymization , Genomics , Humans , Likelihood Functions , Racial Groups/genetics