Results 1 - 13 of 13
2.
Cell ; 184(8): 2239-2254.e39, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33831375

ABSTRACT

Intra-tumor heterogeneity (ITH) is a mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin, and drivers of ITH across cancer types are poorly understood. To address this, we extensively characterize ITH across whole-genome sequences of 2,658 cancer samples spanning 38 cancer types. Nearly all informative samples (95.1%) contain evidence of distinct subclonal expansions with frequent branching relationships between subclones. We observe positive selection of subclonal driver mutations across most cancer types and identify cancer type-specific subclonal patterns of driver gene mutations, fusions, structural variants, and copy number alterations as well as dynamic changes in mutational processes between subclonal expansions. Our results underline the importance of ITH and its drivers in tumor evolution and provide a pan-cancer resource of comprehensively annotated subclonal events from whole-genome sequencing data.


Subjects
Genetic Heterogeneity , Neoplasms/genetics , DNA Copy Number Variations , DNA, Neoplasm/chemistry , DNA, Neoplasm/metabolism , Databases, Genetic , Drug Resistance, Neoplasm/genetics , Humans , Neoplasms/pathology , Polymorphism, Single Nucleotide , Whole Genome Sequencing
3.
Science ; 369(6503): 561-565, 2020 07 31.
Article in English | MEDLINE | ID: mdl-32732423

ABSTRACT

Most neuropsychiatric disease risk variants are in noncoding sequences and lack functional interpretation. Because regulatory sequences often reside in open chromatin, we reasoned that neuropsychiatric disease risk variants may affect chromatin accessibility during neurodevelopment. Using human induced pluripotent stem cell (iPSC)-derived neurons that model developing brains, we identified thousands of genetic variants exhibiting allele-specific open chromatin (ASoC). These neuronal ASoCs were partially driven by altered transcription factor binding, overrepresented in brain gene enhancers and expression quantitative trait loci, and frequently associated with distal genes through chromatin contacts. ASoCs were enriched for genetic variants associated with brain disorders, enabling identification of functional schizophrenia risk variants and their cis-target genes. This study highlights ASoC as a functional mechanism of noncoding neuropsychiatric risk variants, providing a powerful framework for identifying disease causal variants and genes.


Subjects
Alleles , Brain/metabolism , Chromatin/metabolism , Induced Pluripotent Stem Cells/metabolism , Schizophrenia/genetics , Enhancer Elements, Genetic , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Risk
4.
Nature ; 578(7793): 122-128, 2020 02.
Article in English | MEDLINE | ID: mdl-32025013

ABSTRACT

Cancer develops through a process of somatic evolution [1,2]. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes [3]. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [4], we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.


Subjects
Evolution, Molecular , Genome, Human/genetics , Neoplasms/genetics , DNA Repair/genetics , Gene Dosage , Genes, Tumor Suppressor , Genetic Variation , Humans , Mutagenesis, Insertional/genetics
5.
JCO Clin Cancer Inform ; 3: 1-9, 2019 02.
Article in English | MEDLINE | ID: mdl-30730765

ABSTRACT

PURPOSE: Recent data suggest that imaging radiomic features of a tumor could be indicative of important genomic biomarkers. Understanding the relationship between radiomic and genomic features is important for basic cancer research and future patient care. We performed a comprehensive study to discover the imaging-genomic associations in head and neck squamous cell carcinoma (HNSCC) and explore the potential of predicting tumor genomic alterations using radiomic features. METHODS: Our retrospective study integrated whole-genome multiomics data from The Cancer Genome Atlas with matched computed tomography imaging data from The Cancer Imaging Archive for the same set of 126 patients with HNSCC. Linear regression and gene set enrichment analysis were used to identify statistically significant associations between radiomic imaging and genomic features. A random forest classifier was used to predict the status of two key HNSCC molecular biomarkers, human papillomavirus and disruptive TP53 mutation, on the basis of radiomic features. RESULTS: Widespread and statistically significant associations were discovered between genomic features (including microRNA expression, somatic mutations, and transcriptional activity, copy number variations, and promoter region DNA methylation changes of pathways) and radiomic features characterizing the size, shape, and texture of the tumor. Prediction of human papillomavirus and TP53 mutation status using radiomic features achieved areas under the receiver operating characteristic curve of 0.71 and 0.641, respectively. CONCLUSION: Our exploratory study suggests that radiomic features are associated with genomic characteristics at multiple molecular layers in HNSCC and provides justification for continued development of radiomics as biomarkers for relevant genomic alterations in HNSCC.
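
The area under the receiver operating characteristic curve reported in this abstract can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition in Python. The function name `roc_auc`, the labels, and the scores are all made up for illustration and are not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels are 1 (positive) or 0 (negative). Ties count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical classifier scores for six tumors (1 = biomarker present).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 7 of the 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.71 and 0.641.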


Subjects
Biomarkers, Tumor , Diagnostic Imaging , Genetic Predisposition to Disease , Genomics , Image Processing, Computer-Assisted , Squamous Cell Carcinoma of Head and Neck/diagnostic imaging , Squamous Cell Carcinoma of Head and Neck/genetics , Aged , Computational Biology/methods , DNA Copy Number Variations , Female , Gene Expression Profiling , Genomics/methods , Humans , Image Interpretation, Computer-Assisted , Male , Middle Aged , Mutation , Neoplasm Staging , Reproducibility of Results , Retrospective Studies , Squamous Cell Carcinoma of Head and Neck/pathology , Tomography, X-Ray Computed , Workflow
6.
G3 (Bethesda) ; 7(7): 2161-2170, 2017 07 05.
Article in English | MEDLINE | ID: mdl-28526729

ABSTRACT

High-throughput sequencing (HTS) of reduced representation genomic libraries has ushered in an era of genotyping-by-sequencing (GBS), where genome-wide genotype data can be obtained for nearly any species. However, there remains a need for imputation-free GBS methods for genotyping large samples taken from heterogeneous populations of heterozygous individuals. This requires that a number of issues encountered with GBS be considered, including the sequencing of nonoverlapping sets of loci across multiple GBS libraries, a common missing data problem that results in low call rates for markers per individual, and a tendency for applicability only in inbred line samples with sufficient linkage disequilibrium for accurate imputation. We addressed these issues while developing and validating a new, comprehensive platform for GBS. This study supports the notion that GBS can be tailored to particular aims, and using Zea mays our results indicate that large samples of unknown pedigree can be genotyped to obtain complete and accurate GBS data. By optimizing size selection to sequence a high proportion of shared loci among individuals in different libraries and by using simple in silico filters, we established a GBS procedure that produces high call rates per marker (>85%) with accuracy exceeding 99.4%. Furthermore, by capitalizing on the sequence-read structure of GBS data (stacks of reads), we developed a new tool for resolving local haplotypes and scoring phased genotypes, a feature that is not available in many GBS pipelines. Using local haplotypes reduces the marker dimensionality of the genotype matrix while increasing the informativeness of the data. Phased GBS in maize also revealed the existence of reproducibly inaccurate (apparent accuracy) genotypes that were due to divergent copy number variants (CNVs) unobservable in the underlying single nucleotide polymorphism (SNP) data.


Subjects
Gene Dosage , Genetic Loci , Genetic Variation , Linkage Disequilibrium , Zea mays/genetics , Genome-Wide Association Study
7.
Pac Symp Biocomput ; 21: 393-404, 2016.
Article in English | MEDLINE | ID: mdl-26776203

ABSTRACT

We present a feature allocation model to reconstruct tumor subclones based on mutation pairs. The key innovation lies in the use of a pair of proximal single nucleotide variants (SNVs) for the subclone reconstruction as opposed to a single SNV. Using the categorical extension of the Indian buffet process (cIBP), we define the subclones as a vector of categorical matrices corresponding to a set of mutation pairs. Through Bayesian inference, we report posterior probabilities of the number, genotypes and population frequencies of subclones in one or more tumor samples. We demonstrate the proposed methods using simulated and real-world data. A free software package is available at http://www.compgenome.org/pairclone.


Subjects
Bayes Theorem , Mutation , Neoplasms/genetics , Statistics, Nonparametric , Algorithms , Computational Biology/methods , Computational Biology/statistics & numerical data , Computer Simulation , Head and Neck Neoplasms/genetics , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Markov Chains , Models, Genetic , Monte Carlo Method , Polymorphism, Single Nucleotide , Software
8.
Nucleic Acids Res ; 44(3): e25, 2016 Feb 18.
Article in English | MEDLINE | ID: mdl-26420835

ABSTRACT

Somatic mosaicism refers to the existence of somatic mutations in a fraction of somatic cells in a single biological sample. Its importance has mainly been discussed in theory, although experimental work has started to emerge linking somatic mosaicism to disease diagnosis. Through novel statistical modeling of paired-end DNA-sequencing data using blood-derived DNA from healthy donors as well as DNA from tumor samples, we present an ultra-fast computational pipeline, LocHap, that searches for multiple single nucleotide variants (SNVs) that are scaffolded by the same reads. We refer to scaffolded SNVs as local haplotypes (LH). When an LH exhibits more than two genotypes, we call it a local haplotype variant (LHV). The presence of LHVs is considered evidence of somatic mosaicism because a genetically homogeneous cell population will not harbor LHVs. Applying LocHap to whole-genome and whole-exome sequence data in DNA from normal blood and tumor samples, we find widespread LHVs across the genome. Importantly, we find more LHVs in tumor samples than in normal samples, and more in older adults than in younger ones. We confirm the existence of LHVs and somatic mosaicism by validation studies in normal blood samples. LocHap is publicly available at http://www.compgenome.org/lochap.
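
The LHV definition in this abstract can be illustrated with a toy check: a single diploid genotype can place at most two distinct allele combinations on reads that span the same SNVs, so a third well-supported combination points to a mixed cell population. The Python sketch below is not the LocHap pipeline, which relies on statistical modeling. The function names, the `min_reads` threshold used as a crude stand-in for error filtering, and the read data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple

# Alleles observed by one read (or read pair) at the scaffolded SNV positions.
Haplotype = Tuple[str, ...]


def supported_haplotypes(reads: Iterable[Haplotype], min_reads: int = 2) -> Counter:
    """Count allele combinations, dropping those seen on fewer than `min_reads` reads.

    The threshold is a simplistic guard against sequencing errors
    (an assumption of this sketch, not the published method).
    """
    counts = Counter(reads)
    return Counter({hap: n for hap, n in counts.items() if n >= min_reads})


def is_local_haplotype_variant(reads: Iterable[Haplotype], min_reads: int = 2) -> bool:
    """True when more than two allele combinations remain after filtering."""
    return len(supported_haplotypes(reads, min_reads)) > 2


# Hypothetical reads covering two nearby SNVs.
clonal_reads = [("A", "C")] * 5 + [("G", "T")] * 4
mosaic_reads = clonal_reads + [("A", "T")] * 3 + [("G", "C")]  # last one is likely noise
```

Here `clonal_reads` shows the two combinations expected from a homogeneous diploid sample, while `mosaic_reads` carries a third combination supported by three reads and would be flagged.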


Subjects
Haplotypes , Mosaicism , Neoplasms/blood , Sequence Analysis, DNA/methods , Algorithms , Case-Control Studies , Humans , Polymorphism, Single Nucleotide
9.
J R Stat Soc Ser C Appl Stat ; 65(4): 547-563, 2016 08.
Article in English | MEDLINE | ID: mdl-28461708

ABSTRACT

Tumor samples are heterogeneous. They consist of different subclones that are characterized by differences in DNA nucleotide sequences and copy numbers on multiple loci. Heterogeneity can be measured through the identification of the subclonal copy number and sequence at a selected set of loci. Understanding that the accurate identification of variant allele fractions greatly depends on a precise determination of copy numbers, we develop a Bayesian feature allocation model for jointly calling subclonal copy numbers and the corresponding allele sequences for the same loci. The proposed method utilizes three random matrices, L, Z and w, to represent subclonal copy numbers (L), numbers of subclonal variant alleles (Z) and cellular fractions of subclones in samples (w), respectively. The unknown number of subclones implies a random number of columns for these matrices. We use next-generation sequencing data to estimate the subclonal structures through inference on these three matrices. Using simulation studies and a real data analysis, we demonstrate how posterior inference on the subclonal structure is enhanced with the joint modeling of both structure and sequencing variants on subclonal genomes. Software is available at http://compgenome.org/BayClone2.
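
To see why variant allele fractions depend on copy numbers, it helps to compute the fraction implied by a given L, Z and w. The Python sketch below assumes a simple mixture: the expected fraction at a locus is the w-weighted number of variant copies divided by the w-weighted total copy number. This formula, the function name `expected_vaf`, the use of a single sample (one row of w), and all numbers are assumptions for illustration; the abstract does not state the model's likelihood.

```python
from typing import List


def expected_vaf(L: List[List[int]], Z: List[List[int]], w: List[float]) -> List[float]:
    """Expected variant allele fraction per locus for one sample.

    L[s][c]: copy number of locus s in subclone c
    Z[s][c]: number of variant copies of locus s in subclone c (at most L[s][c])
    w[c]:    cellular fraction of subclone c in the sample (fractions sum to 1)
    """
    if abs(sum(w) - 1.0) > 1e-9:
        raise ValueError("cellular fractions must sum to 1")
    vafs = []
    for l_row, z_row in zip(L, Z):
        total_copies = sum(wc * l for wc, l in zip(w, l_row))
        variant_copies = sum(wc * z for wc, z in zip(w, z_row))
        vafs.append(variant_copies / total_copies if total_copies > 0 else 0.0)
    return vafs


# Two loci, three subclones; column 0 plays the role of normal cells.
L = [[2, 2, 3],
     [2, 2, 2]]
Z = [[0, 1, 2],
     [0, 0, 1]]
w = [0.5, 0.3, 0.2]
vafs = expected_vaf(L, Z, w)  # locus 0: 0.7 / 2.2, locus 1: 0.2 / 2.0
```

At the first locus the third subclone has gained a copy, so both the numerator and the denominator shift; treating every subclone as diploid there would misstate the fraction, which is the motivation for calling copy numbers and variants jointly.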

10.
J Natl Cancer Inst ; 107(8)2015 Aug.
Article in English | MEDLINE | ID: mdl-25956356

ABSTRACT

BACKGROUND: Genetic interactions play a critical role in cancer development. Existing knowledge about cancer genetic interactions is incomplete and, in particular, lacks evidence derived from large-scale cancer genomics data. The Cancer Genome Atlas (TCGA) produces multimodal measurements across genomics and features of thousands of tumors, which provide an unprecedented opportunity to investigate the interplay of genes in cancer. METHODS: We introduce Zodiac, a computational tool and resource to integrate existing knowledge about cancer genetic interactions with new information contained in TCGA data. It updates existing knowledge by treating it as a prior graph, integrating it with a likelihood model derived from a Bayesian graphical model based on TCGA data, and producing a posterior graph as updated and data-enhanced knowledge. In short, Zodiac realizes "Prior interaction map + TCGA data → Posterior interaction map." RESULTS: Zodiac provides molecular interactions for about 200 million pairs of genes. All the results are generated from a big-data analysis and organized into a comprehensive database allowing customized search. In addition, Zodiac provides data processing and analysis tools that allow users to customize the prior networks and update the genetic pathways of their interest. Zodiac is publicly available at www.compgenome.org/ZODIAC. CONCLUSIONS: Zodiac recapitulates and extends existing knowledge of molecular interactions in cancer. It can be used to explore novel gene-gene interactions, transcriptional regulation, and other types of molecular interplay in cancer.


Subjects
Databases, Genetic , Epistasis, Genetic , Genomics , Neoplasms/genetics , Software , Bayes Theorem , Databases, Genetic/trends , Genomics/methods , Humans , Internet , Likelihood Functions , User-Computer Interface
11.
Pac Symp Biocomput ; : 467-78, 2015.
Article in English | MEDLINE | ID: mdl-25592605

ABSTRACT

In this paper, we present a novel feature allocation model to describe tumor heterogeneity (TH) using next-generation sequencing (NGS) data. Taking a Bayesian approach, we extend the Indian buffet process (IBP) to define a class of nonparametric models, the categorical IBP (cIBP). A cIBP takes categorical values to denote homozygous or heterozygous genotypes at each SNV. We define a subclone as a vector of these categorical values, each corresponding to an SNV. Instead of partitioning somatic mutations into non-overlapping clusters with similar cellular prevalences, we take a different approach using feature allocation. Importantly, we do not assume that somatic mutations with similar cellular prevalence must be from the same subclone, and we allow overlapping mutations shared across subclones. We argue that this is closer to the underlying theory of phylogenetic clonal expansion, as somatic mutations that occurred in parent subclones should be shared across the parent and child subclones. Bayesian inference yields posterior probabilities of the number, genotypes, and proportions of subclones in a tumor sample, thereby providing point estimates as well as variabilities of the estimates for each subclone. We report results on both simulated and real data. BayClone is available at http://health.bsd.uchicago.edu/yji/soft.html.
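
The contrast between clustering and feature allocation drawn in this abstract can be made concrete with a small genotype matrix. In the Python sketch below, one SNV is shared by a parent subclone and its child, and two other SNVs have the same cellular prevalence yet belong to different subclones. The integer encoding (0 absent, 1 heterozygous, 2 homozygous), the subclone names, the proportions, and the helper functions are illustrative assumptions, not the cIBP model or BayClone code.

```python
from typing import Dict, List

# Rows are SNVs; columns are subclones A, B, C.
# Encoding (assumed for this sketch): 0 = absent, 1 = heterozygous, 2 = homozygous.
genotypes: Dict[str, List[int]] = {
    "snv1": [1, 1, 0],  # arose in A and was inherited by its child B
    "snv2": [0, 2, 0],  # private to B
    "snv3": [0, 0, 1],  # private to C
}
proportions = [0.4, 0.3, 0.3]  # hypothetical subclone proportions in the sample


def cellular_prevalence(row: List[int], props: List[float]) -> float:
    """Fraction of cells carrying the SNV, regardless of zygosity."""
    return sum(p for g, p in zip(row, props) if g > 0)


def carriers(row: List[int]) -> List[int]:
    """Column indices of the subclones that carry the SNV."""
    return [c for c, g in enumerate(row) if g > 0]


prevalence = {name: cellular_prevalence(row, proportions)
              for name, row in genotypes.items()}
```

Here `snv2` and `snv3` both have a prevalence of 0.3, so a prevalence-based clustering would tend to group them, while the matrix keeps them in separate subclones. `snv1` appears in two columns at once, which a partition into non-overlapping clusters cannot represent.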


Subjects
High-Throughput Nucleotide Sequencing/statistics & numerical data , Models, Statistical , Neoplasms/genetics , Software , Bayes Theorem , Computational Biology , Computer Simulation , Humans , Likelihood Functions , Lung Neoplasms/genetics , Markov Chains , Monte Carlo Method , Mutation , Polymorphism, Single Nucleotide , Statistics, Nonparametric
12.
Stat Med ; 32(22): 3899-910, 2013 Sep 30.
Article in English | MEDLINE | ID: mdl-23553747

ABSTRACT

Mixed-effects models have recently become popular for analyzing sparse longitudinal data that arise naturally in biological, agricultural and biomedical studies. Traditional approaches assume independent residuals over time and explain the longitudinal dependence by random effects. However, when bivariate or multivariate traits are measured longitudinally, this fundamental assumption is likely to be violated because of intertrait dependence over time. We provide a more general framework where the dependence of the observations from the same subject over time is not assumed to be explained completely by the random effects of the model. We propose a novel, mixed model-based approach and estimate the error-covariance structure nonparametrically under a generalized linear model framework. We use penalized splines to model the general effect of time, and we consider a Dirichlet process mixture of normal prior for the random-effects distribution. We analyze blood pressure data from the Framingham Heart Study where body mass index, gender and time are treated as covariates. We compare our method with traditional methods including parametric modeling of the random effects and independent residual errors over time. We conduct extensive simulation studies to investigate the practical usefulness of the proposed method. The current approach is very helpful in analyzing bivariate irregular longitudinal traits.


Subjects
Bayes Theorem , Longitudinal Studies/methods , Models, Statistical , Multivariate Analysis , Adult , Aged , Aged, 80 and over , Algorithms , Blood Pressure , Body Mass Index , Computer Simulation , Female , Humans , Male , Middle Aged , Sex Factors