Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
PLoS Comput Biol ; 14(10): e1006484, 2018 10.
Artigo em Inglês | MEDLINE | ID: mdl-30286077

RESUMO

Genomic regions with gene regulatory enhancer activity turnover rapidly across mammals. In contrast, gene expression patterns and transcription factor binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different mammals would exhibit conserved sequence patterns in spite of their different genomic locations. To investigate this hypothesis, we evaluated the extent to which sequence patterns that are predictive of enhancers in one species are predictive of enhancers in other mammalian species by training and testing two types of machine learning models. We trained support vector machine (SVM) and convolutional neural network (CNN) classifiers to distinguish enhancers defined by histone marks from the genomic background based on DNA sequence patterns in human, macaque, mouse, dog, cow, and opossum. The classifiers accurately identified many adult liver, developing limb, and developing brain enhancers, and the CNNs outperformed the SVMs. Furthermore, classifiers trained in one species and tested in another performed nearly as well as classifiers trained and tested on the same species. We observed similar cross-species conservation when applying the models to human and mouse enhancers validated in transgenic assays. This indicates that many short sequence patterns predictive of enhancers are largely conserved. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, supporting the biological relevance of the learned features. Thus, despite the rapid change of active enhancer locations between mammals, cross-species enhancer prediction is often possible. Our results suggest that short sequence patterns encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution.


Assuntos
Sequência Conservada/genética , Elementos Facilitadores Genéticos/genética , Evolução Molecular , Alinhamento de Sequência/métodos , Análise de Sequência de DNA/métodos , Animais , Genômica , Humanos , Aprendizado de Máquina , Mamíferos , Redes Neurais de Computação , Máquina de Vetores de Suporte
2.
Pac Symp Biocomput ; 23: 424-435, 2018.
Artigo em Inglês | MEDLINE | ID: mdl-29218902

RESUMO

Genomic maps of local ancestry identify ancestry transitions - points on a chromosome where recent recombination events in admixed individuals have joined two different ancestral haplotypes. These events bring together alleles that evolved within separate continential populations, providing a unique opportunity to evaluate the joint effect of these alleles on health outcomes. In this work, we evaluate the impact of genetic variants in the context of nearby local ancestry transitions within a sample of nearly 10,000 adults of African ancestry with traits derived from electronic health records. Genetic data was located using the Metabochip, and used to derive local ancestry. We develop a model that captures the effect of both single variants and local ancestry, and use it to identify examples where local ancestry transitions significantly interact with nearby variants to influence metabolic traits. In our most compelling example, we find that the minor allele of rs16890640 occuring on a European background with a downstream local ancestry transition to African ancestry results in significantly lower mean corpuscular hemoglobin and volume. This finding represents a new way of discovering genetic interactions, and is supported by molecular data that suggest changes to local ancestry may impact local chromatin looping.


Assuntos
Epistasia Genética , Evolução Molecular , Modelos Genéticos , Polimorfismo de Nucleotídeo Único , Adulto , População Negra/genética , Cromossomos Humanos/genética , Biologia Computacional/métodos , Frequência do Gene , Genética Populacional/estatística & dados numéricos , Estudo de Associação Genômica Ampla/estatística & dados numéricos , Haplótipos , Humanos , Modelos Lineares , Recombinação Genética , População Branca/genética
3.
Am J Hum Genet ; 99(4): 817-830, 2016 Oct 06.
Artigo em Inglês | MEDLINE | ID: mdl-27640306

RESUMO

The importance of epistasis-or statistical interactions between genetic variants-to the development of complex disease in humans has been controversial. Genome-wide association studies of statistical interactions influencing human traits have recently become computationally feasible and have identified many putative interactions. However, statistical models used to detect interactions can be confounded, which makes it difficult to be certain that observed statistical interactions are evidence for true molecular epistasis. In this study, we investigate whether there is evidence for epistatic interactions between genetic variants within the cis-regulatory region that influence gene expression after accounting for technical, statistical, and biological confounding factors. We identified 1,119 (FDR = 5%) interactions that appear to regulate gene expression in human lymphoblastoid cell lines, a tightly controlled, largely genetically determined phenotype. Many of these interactions replicated in an independent dataset (90 of 803 tested, Bonferroni threshold). We then performed an exhaustive analysis of both known and novel confounders, including ceiling/floor effects, missing genotype combinations, haplotype effects, single variants tagged through linkage disequilibrium, and population stratification. Every interaction could be explained by at least one of these confounders, and replication in independent datasets did not protect against some confounders. Assuming that the confounding factors provide a more parsimonious explanation for each interaction, we find it unlikely that cis-regulatory interactions contribute strongly to human gene expression, which calls into question the relevance of cis-regulatory interactions for other human phenotypes. We additionally propose several best practices for epistasis testing to protect future studies from confounding.


Assuntos
Artefatos , Epistasia Genética/genética , Regulação da Expressão Gênica/genética , Variação Genética/genética , Modelos Genéticos , Modelos Estatísticos , Sequências Reguladoras de Ácido Nucleico/genética , Sítios de Ligação/genética , Conjuntos de Dados como Assunto , Etnicidade/genética , Haplótipos/genética , Humanos , Desequilíbrio de Ligação , Polimorfismo de Nucleotídeo Único/genética , Locos de Características Quantitativas/genética , Proteínas de Ligação a RNA/genética
4.
Artigo em Inglês | MEDLINE | ID: mdl-25541630

RESUMO

Rarely occurring genetic variants are hypothesized to influence human diseases, but statistically associating these rare variants to disease is challenging due to a lack of statistical power in most feasibly sized datasets. Several statistical tests have been developed to either collapse multiple rare variants from a genomic region into a single variable (presence/absence) or to tally the number of rare alleles within a region, relating the burden of rare alleles to disease risk. Both these approaches, however, rely on user-specification of a genomic region to generate these collapsed or burden variables, usually an entire gene. Recent studies indicate that most risk variants for common diseases are found within regulatory regions, not genes. To capture the effect of rare alleles within non-genic regulatory regions for burden tests, we contrast a simple sliding window approach with a knowledge-guided k-medoids clustering method to group rare variants into statistically powerful, biologically meaningful windows. We apply these methods to detect genomic regions that alter expression of nearby genes.

SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...