Results 1 - 8 of 8
1.
Nat Genet ; 41(11): 1216-22, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19838192

ABSTRACT

Cis-acting variants altering gene expression are a source of phenotypic differences. The cis-acting components of expression variation can be identified through the mapping of differences in allelic expression (AE), which is the measure of relative expression between two allelic transcripts. We generated a map of AE-associated SNPs using quantitative measurements of AE on Illumina Human1M BeadChips. In 53 lymphoblastoid cell lines derived from donors of European descent, we identified common cis variants affecting 30% (2935/9751) of the measured RefSeq transcripts at 0.001 permutation significance. The pervasive influence of cis-regulatory variants, which explain 50% of population variation in AE, extends to full-length transcripts and their isoforms as well as to unannotated transcripts. These strong effects facilitate fine mapping of cis-regulatory SNPs, as demonstrated by dissection of heritable control of transcripts in the systemic lupus erythematosus-associated C8orf13-BLK region in chromosome 8. The dense collection of associations will facilitate large-scale isolation of cis-regulatory SNPs.
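
The AE measure in the abstract above (relative expression between two allelic transcripts) can be illustrated with a small sketch. The function name, the example intensities, and the normalization against genomic DNA are assumptions made for illustration; the abstract does not describe the authors' actual normalization.

```python
import math


def allelic_log_ratio(cdna_a, cdna_b, gdna_a=1.0, gdna_b=1.0):
    """Return log2(A/B) measured in cDNA, minus log2(A/B) measured in genomic DNA.

    The genomic DNA term is an assumed correction for allele-specific
    signal bias; in a heterozygote the true DNA ratio is 1:1.
    """
    if min(cdna_a, cdna_b, gdna_a, gdna_b) <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cdna_a / cdna_b) - math.log2(gdna_a / gdna_b)


# Hypothetical intensities for one heterozygous SNP in one sample.
ratio = allelic_log_ratio(cdna_a=1200.0, cdna_b=600.0, gdna_a=1000.0, gdna_b=1000.0)
print(ratio)
```

A value near zero would indicate balanced expression of the two alleles; a positive value indicates that allele A is over-represented among transcripts.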


Subjects
Alleles , Genetic Variation , Single Nucleotide Polymorphism , Cell Line , Gene Expression Profiling , Genome-Wide Association Study , Humans , Systemic Lupus Erythematosus/genetics , Lymphocytes/metabolism , Genetic Transcription
2.
Am J Hum Genet ; 85(3): 377-93, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19732864

ABSTRACT

Common SNPs in the chromosome 17q12-q21 region alter the risk for asthma, type 1 diabetes, primary biliary cirrhosis, and Crohn disease. Previous reports by us and others have linked the disease-associated genetic variants with changes in expression of GSDMB and ORMDL3 transcripts in human lymphoblastoid cell lines (LCLs). The variants also alter regulation of other transcripts, and this domain-wide cis-regulatory effect suggests a mechanism involving long-range chromatin interactions. Here, we further dissect the disease-linked haplotype and identify putative causal DNA variants via a combination of genetic and functional analyses. First, high-throughput resequencing of the region and genotyping of potential candidate variants were performed. Next, additional mapping of allelic expression differences in Yoruba HapMap LCLs allowed us to fine-map the basis of the cis-regulatory differences to a handful of candidate functional variants. Functional assays identified allele-specific differences in nucleosome distribution, an allele-specific association with the insulator protein CTCF, as well as a weak promoter activity for rs12936231. Overall, this study shows a common disease allele linked to changes in CTCF binding and nucleosome occupancy leading to altered domain-wide cis-regulation. Finally, a strong association between asthma and cis-regulatory haplotypes was observed in three independent family-based cohorts (p = 1.78 × 10⁻⁸). This study demonstrates the requirement of multiple parallel allele-specific tools for the investigation of noncoding disease variants and functional fine-mapping of human disease-associated haplotypes.


Subjects
Alleles , Asthma/genetics , Autoimmune Diseases/genetics , Chromatin Assembly and Disassembly/genetics , Egg Proteins/genetics , Membrane Proteins/genetics , Neoplasm Proteins/genetics , Adolescent , Asthma/complications , Autoimmune Diseases/complications , Base Sequence , Cell Line , Child , Human Chromosome Pair 17/genetics , DNA Mutational Analysis , Egg Proteins/metabolism , Female , Reporter Genes , Genetic Predisposition to Disease , Haplotypes , Humans , Male , Membrane Proteins/metabolism , Molecular Sequence Data , Neoplasm Proteins/metabolism , Pedigree , Single Nucleotide Polymorphism/genetics , Nucleic Acid Regulatory Sequences/genetics , White People/genetics
3.
Genome Res ; 19(9): 1542-52, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19605794

ABSTRACT

New high-throughput sequencing technologies are generating large amounts of sequence data, allowing the development of targeted large-scale resequencing studies. For these studies, accurate identification of polymorphic sites is crucial. Heterozygous sites are particularly difficult to identify, especially in regions of low coverage. We present a new strategy for identifying heterozygous sites in a single individual by using a machine learning approach that generates a heterozygosity score for each chromosomal position. Our approach also facilitates the identification of regions with unequal representation of two alleles and other poorly sequenced regions. The availability of confidence scores allows for a principled combination of sequencing results from multiple samples. We evaluate our method on a gold standard genotype data set from HapMap. We are able to classify sites in this data set as heterozygous or homozygous with 98.5% accuracy. In de novo data our probabilistic heterozygote detection ("ProbHD") is able to identify 93% of heterozygous sites at a <5% false call rate (FCR) as estimated based on independent genotyping results. In direct comparison of ProbHD with high-coverage 1000 Genomes sequencing available for a subset of our data, we observe >99.9% overall agreement for genotype calls and close to 90% agreement for heterozygote calls. Overall, our data indicate that high-throughput resequencing of human genomic regions requires careful attention to systematic biases in sample preparation as well as sequence contexts, and that their impact can be alleviated by machine learning-based sequence analyses allowing more accurate extraction of true DNA variants.
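
The abstract above does not specify the features or the learning model behind ProbHD, so the following is not ProbHD. It is a much simpler stand-in showing what a per-position heterozygosity score can look like: a posterior probability from a binomial read-count model. The function name, the error rate, and the prior are all assumed values chosen for illustration.

```python
from math import comb


def het_posterior(ref_reads, alt_reads, error_rate=0.01, prior_het=0.001):
    """Posterior probability that a site is heterozygous, given read counts.

    Assumed model: alternate-allele reads are binomial with success
    probability 0.5 (heterozygous), error_rate (homozygous reference),
    or 1 - error_rate (homozygous alternate).
    """
    n = ref_reads + alt_reads

    def binom(k, p):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    like_het = binom(alt_reads, 0.5)
    like_hom_ref = binom(alt_reads, error_rate)
    like_hom_alt = binom(alt_reads, 1 - error_rate)

    # Split the remaining prior mass evenly between the two homozygous states.
    prior_hom = (1 - prior_het) / 2
    numerator = prior_het * like_het
    denominator = numerator + prior_hom * (like_hom_ref + like_hom_alt)
    return numerator / denominator if denominator > 0 else 0.0


balanced = het_posterior(ref_reads=10, alt_reads=10)
one_sided = het_posterior(ref_reads=20, alt_reads=0)
print(balanced, one_sided)
```

A count-only model like this cannot account for the systematic biases in sample preparation and sequence context that the abstract highlights, which is the motivation it gives for a learned score.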


Subjects
Human Genome/genetics , Single Nucleotide Polymorphism/genetics , Probability , DNA Sequence Analysis/methods , Computational Biology/methods , Genotype , Heterozygote , Humans , Statistical Models
4.
Genome Res ; 19(1): 118-27, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18971308

ABSTRACT

Regulatory cis-acting variants account for a large proportion of gene expression variability in populations. Cis-acting differences can be specifically measured by comparing relative levels of allelic transcripts within a sample. Allelic expression (AE) mapping for cis-regulatory variant discovery has been hindered by the requirements of having informative or heterozygous single nucleotide polymorphisms (SNPs) within genes in order to assign the allelic origin of each transcript. In this study we have developed an approach to systematically screen for heritable cis-variants in common human haplotypes across >1,000 genes. In order to achieve the highest level of information per haplotype studied, we carried out allelic expression measurements by using both intronic and exonic SNPs in primary transcripts. We used a novel RNA pooling strategy in immortalized lymphoblastoid cell lines (LCLs) and primary human osteoblast cell lines (HObs) to allow for high-throughput AE. Screening hits from RNA pools were further validated by performing allelic expression mapping in individual samples. Our results indicate that >10% of expressed genes in human LCLs show genotype-linked AE. In addition, we have validated cis-acting variants in over 20 genes linked with common disease susceptibility in recent genome-wide studies. More generally, our results indicate that RNA pooling coupled with AE read-out by second generation sequencing or by other methods provides a high-throughput tool for cataloging the impact of common noncoding variants in the human genome.


Subjects
Genetic Variation , Haplotypes , Alleles , Cell Line , Chromosome Mapping , Exons , Gene Expression , Gene Regulatory Networks , Genetic Complementation Test , Human Genome , Genome-Wide Association Study , Humans , Introns , Lymphocytes/metabolism , Osteoblasts/metabolism , Single Nucleotide Polymorphism
5.
J Bioinform Comput Biol ; 6(1): 1-22, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18324742

ABSTRACT

Gene clusters that span three or more chromosomal regions are of increasing importance, yet statistical tests to validate such clusters are in their infancy. Current approaches either conduct several pairwise comparisons or consider only the number of genes that occur in all of the regions. In this paper, we provide statistical tests for clusters spanning exactly three regions based on genome models of typical comparative genomics problems, including analysis of conserved linkage within multiple species and identification of large-scale duplications. Our tests are the first to combine evidence from genes shared among all three regions and genes shared between pairs of regions. We show that our tests of clusters spanning three regions are more sensitive than existing approaches, and can thus be used to identify more diverged homologous regions.


Subjects
Algorithms , Chromosome Mapping/methods , Statistical Data Interpretation , Multigene Family/genetics , Sequence Alignment/methods , DNA Sequence Analysis/methods , Base Sequence , Molecular Sequence Data , Reproducibility of Results , Sensitivity and Specificity
6.
Trends Genet ; 22(3): 156-64, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16442663

ABSTRACT

New genes arise through duplication and modification of DNA sequences on a range of scales: single gene duplication, duplication of large chromosomal fragments and whole-genome duplication. Each duplication mechanism has specific characteristics that influence the fate of the resulting duplicates, such as the size of the duplicated fragment, the potential for dosage imbalance, the preservation or disruption of regulatory control and genomic context. The ability to diagnose or identify the mechanism that produced a pair of paralogs has the potential to increase our ability to reconstruct evolutionary history, to understand the processes that govern genome evolution and to make functional predictions based on paralogy. The recent availability of large amounts of whole-genome sequence, often from several closely related species, has stimulated a wealth of new computational methods to diagnose gene duplications.


Subjects
Molecular Evolution , Gene Duplication , Genes/physiology , Genome , Animals , Humans
7.
J Comput Biol ; 12(8): 1083-102, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16241899

ABSTRACT

Statistical validation of gene clusters is imperative for many important applications in comparative genomics which depend on the identification of genomic regions that are historically and/or functionally related. We develop the first rigorous statistical treatment of max-gap clusters, a cluster definition frequently used in empirical studies. We present exact expressions for the probability of observing an individual cluster of a set of marked genes in one genome, as well as upper and lower bounds on the probability of observing a cluster of h homologs in a pairwise whole-genome comparison. We demonstrate the utility of our approach by applying it to a whole-genome comparison of E. coli and B. subtilis. Code for statistical tests is available at.
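
The abstract above names max-gap clusters without defining them. The sketch below assumes the usual definition: a maximal run of marked genes in which no two consecutive marked genes are separated by more than g unmarked genes. It only finds such clusters in one genome; it does not compute the probabilities the paper derives. The function name and the encoding of genes as integer positions along a chromosome are assumptions for illustration.

```python
def max_gap_clusters(marked_positions, g):
    """Group marked gene positions into max-gap clusters.

    marked_positions: integer gene indices along one chromosome.
    g: maximum number of unmarked genes allowed between consecutive
       marked genes within a cluster (assumed gap convention).
    """
    clusters = []
    current = []
    for pos in sorted(set(marked_positions)):
        # Number of intervening genes between this and the previous marked gene.
        if current and pos - current[-1] - 1 > g:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)
    return clusters


clusters = max_gap_clusters([2, 3, 6, 12, 13, 20], g=2)
print(clusters)  # [[2, 3, 6], [12, 13], [20]]
```

Whether a cluster found this way is surprising under a null model of random gene order is the statistical question the paper addresses.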


Subjects
Cluster Analysis , Computational Biology , Statistical Data Interpretation , Genome , DNA Sequence Analysis , Bacillus subtilis/genetics , Chromosome Mapping , Escherichia coli/genetics , Molecular Evolution , Genes , Genomics , Genetic Models , Multigene Family , Probability , Sequence Alignment
8.
Appl Bioinformatics ; 3(2-3): 167-79, 2004.
Article in English | MEDLINE | ID: mdl-15693742

ABSTRACT

In this study, we attempt to understand and explain positional selection pressure in terms of underlying physical and chemical properties. We propose a set of constraining assumptions about how these pressures behave, then describe a procedure for analysing and explaining the distribution of residues at a particular position in a multiple sequence alignment. In contrast to previous approaches, our model takes into account both amino acid frequencies and a large number of physical-chemical properties. By analysing each property separately, it is possible to identify positions where distinct conservation patterns are present. In addition, the model can easily incorporate sequence weights that adjust for bias in the sample sequences. Finally, a test of statistical significance is provided for our conservation measure. The applicability of this method is demonstrated on two HIV-1 proteins: Nef and Env. The tools, data and results presented in this article are available at http://flan.blm.cs.cmu.edu.
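
The abstract above does not give the formulas of its conservation measure, so the sketch below is a generic stand-in rather than the authors' model: the Shannon entropy of one alignment column, computed from residue frequencies with optional sequence weights. It shows how per-sequence weights can enter a column statistic; it does not use physical-chemical properties. The function name and the treatment of `-` as a gap to be skipped are assumptions.

```python
import math
from collections import defaultdict


def weighted_column_entropy(column, weights=None):
    """Shannon entropy (bits) of one alignment column.

    column: iterable of one-letter residues, one per sequence.
    weights: optional per-sequence weights that adjust for sampling bias.
    Gap characters ('-') are ignored (an assumed convention).
    """
    column = list(column)
    if weights is None:
        weights = [1.0] * len(column)
    if len(weights) != len(column):
        raise ValueError("need one weight per sequence")

    totals = defaultdict(float)
    for residue, weight in zip(column, weights):
        if residue != "-":
            totals[residue] += weight

    total = sum(totals.values())
    if total == 0:
        return 0.0

    entropy = 0.0
    for weight in totals.values():
        p = weight / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


conserved = weighted_column_entropy("AAAA")
split = weighted_column_entropy("AACC")
print(conserved, split)
```

Lower entropy indicates a more conserved position. Down-weighting over-sampled sequences changes the effective frequencies and therefore the score.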


Subjects
Molecular Evolution , Proteins/chemistry , Proteins/genetics , Genetic Selection , Sequence Alignment/methods , Protein Sequence Analysis/methods , Algorithms , Amino Acid Sequence , Amino Acids/chemistry , Amino Acids/genetics , Conserved Sequence/genetics , Molecular Sequence Data , Amino Acid Sequence Homology