1.
Genome Res ; 31(12): 2249-2257, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34544830

ABSTRACT

Structural variants (SVs) are an important source of human genome diversity, but their functional effects are poorly understood. We mapped 61,668 SVs in 613 individuals from the GTEx project and measured their effects on gene expression. We estimate that common SVs are causal at 2.66% of eQTLs, a 10.5-fold enrichment relative to their abundance in the genome. Duplications and deletions were the most impactful variant types, whereas the contribution of mobile element insertions was small (0.12% of eQTLs, 1.9-fold enriched). Multitissue analysis of eQTLs revealed that gene-altering SVs show more constitutive effects than other variant types, with 62.09% of coding SV-eQTLs active in all tissues with eQTL activity compared with 23.08% of coding SNV- and indel-eQTLs. Noncoding SVs, SNVs and indels show broadly similar patterns. We also identified 539 rare SVs associated with nearby gene expression outliers. Of these, 62.34% are noncoding SVs that affect gene expression but have modest enrichment at regulatory elements, showing that rare noncoding SVs are a major source of gene expression differences but remain difficult to predict from current annotations. Both common and rare SVs often affect the expression of multiple genes: SV-eQTLs affect an average of 1.82 nearby genes, whereas SNV- and indel-eQTLs affect an average of 1.09 genes, and 21.34% of rare expression-altering SVs show effects on two to nine different genes. We also observe significant effects on rare gene expression changes extending 1 Mb from the SV. This provides a mechanism by which individual SVs may have strong or pleiotropic effects on phenotypic variation.

2.
Am J Hum Genet ; 108(4): 583-596, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33798444

ABSTRACT

The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10⁻⁵⁴) and is also associated with increased levels of total cholesterol (p = 1.22 × 10⁻²⁸) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10⁻²¹) and alanine (p = 6.14 × 10⁻¹²) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10⁻¹⁰), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10⁻³⁵). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.


Subject(s)
Cardiovascular Diseases/genetics; Genomic Structural Variation/genetics; Alleles; Cholesterol/blood; DNA Copy Number Variations/genetics; Female; Finland; Genome, Human/genetics; Genotype; High-Throughput Nucleotide Sequencing; Humans; Male; Mitochondrial Proteins/genetics; Promoter Regions, Genetic/genetics; Pyruvate Dehydrogenase (Lipoamide)-Phosphatase/genetics; Pyruvic Acid/metabolism; Serum Albumin, Human/genetics
3.
Cell ; 184(10): 2633-2648.e19, 2021 05 13.
Article in English | MEDLINE | ID: mdl-33864768

ABSTRACT

Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.


Subject(s)
Disease/genetics; Multifactorial Inheritance/genetics; Population/genetics; RNA, Long Noncoding/genetics; Transcriptome; Coronary Artery Disease/genetics; Diabetes Mellitus, Type 1/genetics; Diabetes Mellitus, Type 2/genetics; Gene Expression Profiling; Genetic Variation; Humans; Inflammatory Bowel Diseases/genetics; Organ Specificity/genetics; Quantitative Trait Loci
4.
Science ; 369(6509), 2020 09 11.
Article in English | MEDLINE | ID: mdl-32913073

ABSTRACT

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.


Subject(s)
Genetic Variation; Genome, Human; Multifactorial Inheritance; Transcriptome; Humans; Organ Specificity
5.
Cell ; 177(1): 70-84, 2019 03 21.
Article in English | MEDLINE | ID: mdl-30901550

ABSTRACT

Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands (and soon millions) of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats. Here, we review the current state of technologies for genetic variant discovery, genotyping, and functional interpretation and discuss the prospects for future advances. We focus on germline variants discovered by whole-genome sequencing, genome-wide functional genomic approaches for predicting and measuring variant functional effects, and implications for studies of common and rare human disease.


Subject(s)
Genetic Variation/genetics; Genome, Human/genetics; Sequence Analysis, DNA/trends; Biological Specimen Banks; Chromosome Mapping/methods; Genetic Predisposition to Disease/genetics; Genetic Testing/trends; Genome-Wide Association Study; Genomics/methods; Genomics/trends; High-Throughput Nucleotide Sequencing/methods; Human Genome Project; Humans; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA/methods; Whole Genome Sequencing/methods; Whole Genome Sequencing/trends
6.
Nature ; 550(7675): 239-243, 2017 10 11.
Article in English | MEDLINE | ID: mdl-29022581

ABSTRACT

Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.


Subject(s)
Gene Expression Profiling; Genetic Variation/genetics; Organ Specificity/genetics; Bayes Theorem; Female; Genome, Human/genetics; Genomics; Genotype; Humans; Male; Models, Genetic; Sequence Analysis, RNA
7.
Nat Genet ; 49(5): 692-699, 2017 May.
Article in English | MEDLINE | ID: mdl-28369037

ABSTRACT

Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs, a substantially higher fraction than prior estimates, and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.


Subject(s)
Gene Expression Regulation; Genetic Variation; Genome, Human/genetics; Quantitative Trait Loci/genetics; Sequence Analysis, DNA/methods; Algorithms; Chromosome Mapping; Genome-Wide Association Study/methods; Humans; INDEL Mutation; Linear Models; Polymorphism, Single Nucleotide
8.
Bioinformatics ; 28(7): 1033-4, 2012 Apr 01.
Article in English | MEDLINE | ID: mdl-22332237

ABSTRACT

SUMMARY: With the explosive growth of bacterial and archaeal sequence data, large-scale phylogenetic analyses present both opportunities and challenges. Here we describe AMPHORA2, an automated phylogenomic inference tool that can be used for high-throughput, high-quality genome tree reconstruction and metagenomic phylotyping. Compared with its predecessor, AMPHORA2 has several major enhancements and new functions: it has a greatly expanded phylogenetic marker database and can analyze both bacterial and archaeal sequences; it incorporates probability-based sequence alignment masks that improve the phylogenetic accuracy; it can analyze DNA as well as protein sequences and is more sensitive in marker identification; finally, it is over 100× faster in metagenomic phylotyping.
AVAILABILITY: http://wolbachia.biology.virginia.edu/WuLab/Software.html
CONTACT: mw4yv@virginia.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Archaea/genetics; Bacteria/genetics; Computational Biology/methods; Genomics/methods; Phylogeny; Algorithms; Archaea/classification; Bacteria/classification; Genome, Archaeal; Genome, Bacterial; Metagenome; Sequence Alignment; Sequence Analysis, DNA/methods; Sequence Analysis, Protein/methods; Software