Results 1 - 6 of 6
1.
Proc Natl Acad Sci U S A ; 119(31): e2121279119, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35905320

ABSTRACT

Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R² was 47% in a UK Biobank holdout sample, which was 76% of the estimated [Formula: see text]. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62 and 65% increase, respectively. The average [Formula: see text] value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.


Subject(s)
Databases, Genetic , Genome-Wide Association Study , Precision Medicine , Quantitative Trait, Heritable , Bayes Theorem , England , Estonia , Genomics , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide
2.
Nat Commun ; 12(1): 2337, 2021 04 20.
Article in English | MEDLINE | ID: mdl-33879782

ABSTRACT

While recent advancements in computation and modelling have improved the analysis of complex traits, our understanding of the genetic basis of the time at symptom onset remains limited. Here, we develop a Bayesian approach (BayesW) that provides probabilistic inference of the genetic architecture of age-at-onset phenotypes in a sampling scheme that facilitates biobank-scale time-to-event analyses. We show in extensive simulation work the benefits BayesW provides in terms of number of discoveries, model performance and genomic prediction. In the UK Biobank, we find many thousands of common genomic regions underlying the age-at-onset of high blood pressure (HBP), cardiac disease (CAD), and type-2 diabetes (T2D), and for the genetic basis of onset reflecting the underlying genetic liability to disease. Age-at-menopause and age-at-menarche are also highly polygenic, but with higher variance contributed by low frequency variants. Genomic prediction into the Estonian Biobank data shows that BayesW gives higher prediction accuracy than other approaches.


Subject(s)
Age of Onset , Genome, Human , Models, Genetic , Multifactorial Inheritance , Age Factors , Algorithms , Bayes Theorem , Cardiovascular Diseases/genetics , Computer Simulation , Databases, Genetic , Diabetes Mellitus, Type 2/genetics , Estonia , Female , Genetic Association Studies , Genome-Wide Association Study , Genomics , Humans , Hypertension/genetics , Menarche/genetics , Menopause/genetics , Phenotype , Polymorphism, Single Nucleotide , United Kingdom
3.
Genome Med ; 12(1): 60, 2020 07 08.
Article in English | MEDLINE | ID: mdl-32641083

ABSTRACT

BACKGROUND: The molecular factors which control circulating levels of inflammatory proteins are not well understood. Furthermore, association studies between molecular probes and human traits are often performed by linear model-based methods which may fail to account for complex structure and interrelationships within molecular datasets.

METHODS: In this study, we perform genome- and epigenome-wide association studies (GWAS/EWAS) on the levels of 70 plasma-derived inflammatory protein biomarkers in healthy older adults (Lothian Birth Cohort 1936; n = 876; Olink® inflammation panel). We employ a Bayesian framework (BayesR+) which can account for issues pertaining to data structure and unknown confounding variables (with sensitivity analyses using ordinary least squares- (OLS) and mixed model-based approaches).

RESULTS: We identified 13 SNPs associated with 13 proteins (n = 1 SNP each) concordant across OLS and Bayesian methods. We identified 3 CpG sites spread across 3 proteins (n = 1 CpG each) that were concordant across OLS, mixed-model and Bayesian analyses. Tagged genetic variants accounted for up to 45% of variance in protein levels (for MCP2, 36% of variance alone attributable to 1 polymorphism). Methylation data accounted for up to 46% of variation in protein levels (for CXCL10). Up to 66% of variation in protein levels (for VEGFA) was explained using genetic and epigenetic data combined. We demonstrated putative causal relationships between CD6 and IL18R1 with inflammatory bowel disease and between IL12B and Crohn's disease.

CONCLUSIONS: Our data may aid understanding of the molecular regulation of the circulating inflammatory proteome as well as causal relationships between inflammatory mediators and disease.


Subject(s)
Biomarkers , Epigenomics , Genome-Wide Association Study , Genomics , Proteins/genetics , Age Factors , Aged , Aged, 80 and over , Blood Proteins/genetics , Computational Biology/methods , DNA Methylation , Disease Susceptibility , Epigenesis, Genetic , Epigenomics/methods , Female , Gene Expression Regulation , Genomics/methods , Healthy Volunteers , Humans , Inflammation/etiology , Inflammation/metabolism , Inflammation Mediators , Male , Middle Aged , Polymorphism, Single Nucleotide , Proteins/metabolism , Quantitative Trait Loci
4.
Nat Commun ; 11(1): 2865, 2020 06 08.
Article in English | MEDLINE | ID: mdl-32513961

ABSTRACT

Linking epigenetic marks to clinical outcomes improves insight into molecular processes, disease prediction, and therapeutic target identification. Here, a statistical approach is presented to infer the epigenetic architecture of complex disease, determine the variation captured by epigenetic effects, and estimate phenotype-epigenetic probe associations jointly. Implicitly adjusting for probe correlations, data structure (cell-count or relatedness), and single-nucleotide polymorphism (SNP) marker effects, improves association estimates and in 9,448 individuals, 75.7% (95% CI 71.70-79.3) of body mass index (BMI) variation and 45.6% (95% CI 37.3-51.9) of cigarette consumption variation was captured by whole blood methylation array data. Pathway-linked probes of blood cholesterol, lipid transport and sterol metabolism for BMI, and xenobiotic stimuli response for smoking, showed >1.5 times larger associations with >95% posterior inclusion probability. Prediction accuracy improved by 28.7% for BMI and 10.2% for smoking over a LASSO model, with age-, and tissue-specificity, implying associations are a phenotypic consequence rather than causal.


Subject(s)
Epigenesis, Genetic , Quantitative Trait, Heritable , Adult , Algorithms , Bayes Theorem , Biomarkers/analysis , Body Mass Index , Computer Simulation , DNA Methylation/genetics , Humans , Molecular Sequence Annotation , Organ Specificity/genetics , Reproducibility of Results
5.
Proc Natl Acad Sci U S A ; 116(36): 18142-18147, 2019 09 03.
Article in English | MEDLINE | ID: mdl-31420515

ABSTRACT

One of the most challenging tasks in modern science is the development of systems biology models: Existing models are often very complex but generally have low predictive performance. The construction of high-fidelity models will require hundreds/thousands of cycles of model improvement, yet few current systems biology research studies complete even a single cycle. We combined multiple software tools with integrated laboratory robotics to execute three cycles of model improvement of the prototypical eukaryotic cellular transformation, the yeast (Saccharomyces cerevisiae) diauxic shift. In the first cycle, a model outperforming the best previous diauxic shift model was developed using bioinformatic and systems biology tools. In the second cycle, the model was further improved using automatically planned experiments. In the third cycle, hypothesis-led experiments improved the model to a greater extent than achieved using high-throughput experiments. All of the experiments were formalized and communicated to a cloud laboratory automation system (Eve) for automatic execution, and the results stored on the semantic web for reuse. The final model adds a substantial amount of knowledge about the yeast diauxic shift: 92 genes (+45%), and 1,048 interactions (+147%). This knowledge is also relevant to understanding cancer, the immune system, and aging. We conclude that systems biology software tools can be combined and integrated with laboratory robots in closed-loop cycles.


Subject(s)
Computational Biology , Gene Expression Regulation, Fungal , Robotics , Saccharomyces cerevisiae , Software , Systems Biology , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
6.
Bioinformatics ; 31(22): 3617-24, 2015 Nov 15.
Article in English | MEDLINE | ID: mdl-26177966

ABSTRACT

MOTIVATION: Oscillations lie at the core of many biological processes, from the cell cycle, to circadian oscillations and developmental processes. Time-keeping mechanisms are essential to enable organisms to adapt to varying conditions in environmental cycles, from day/night to seasonal. Transcriptional regulatory networks are one of the mechanisms behind these biological oscillations. However, while identifying cyclically expressed genes from time series measurements is relatively easy, determining the structure of the interaction network underpinning the oscillation is a far more challenging problem.

RESULTS: Here, we explicitly leverage the oscillatory nature of the transcriptional signals and present a method for reconstructing network interactions tailored to this special but important class of genetic circuits. Our method is based on projecting the signal onto a set of oscillatory basis functions using a Discrete Fourier Transform. We build a Bayesian Hierarchical model within a frequency domain linear model in order to enforce sparsity and incorporate prior knowledge about the network structure. Experiments on real and simulated data show that the method can lead to substantial improvements over competing approaches if the oscillatory assumption is met, and remains competitive also in cases it is not.

AVAILABILITY: DSS, experiment scripts and data are available at http://homepages.inf.ed.ac.uk/gsanguin/DSS.zip.

CONTACT: d.trejo-banos@sms.ed.ac.uk

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
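
The projection step mentioned in the abstract above (expressing a time series in terms of oscillatory basis functions via a Discrete Fourier Transform) can be illustrated with a minimal sketch. This is not the DSS code and does not include the Bayesian hierarchical model; the function name `dft_coefficients`, the 24-point sampling grid, and the synthetic expression profile are all assumptions made purely for illustration.

```python
import cmath
import math


def dft_coefficients(signal, n_freqs):
    """Project a real-valued time series onto the first `n_freqs`
    non-constant complex exponentials of the Discrete Fourier Transform.

    Returns a list of complex coefficients, normalised by series length.
    (Hypothetical helper, not part of any published package.)
    """
    n = len(signal)
    coeffs = []
    for k in range(1, n_freqs + 1):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        coeffs.append(total / n)
    return coeffs


# Synthetic example (assumed data): one full cycle over 24 evenly spaced
# time points, amplitude 2, plus a constant offset of 0.5.
n_points = 24
expression = [
    2.0 * math.cos(2 * math.pi * t / n_points) + 0.5
    for t in range(n_points)
]

coeffs = dft_coefficients(expression, n_freqs=3)

# Index (1-based) of the basis function carrying the most signal.
dominant = max(range(len(coeffs)), key=lambda i: abs(coeffs[i])) + 1
```

In a frequency-domain linear model of the kind the abstract describes, coefficients such as these would stand in for the raw time series as regression inputs and outputs; how the authors select frequencies and place priors is not specified here and is not reproduced by this sketch.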


Subject(s)
Gene Regulatory Networks , Arabidopsis/genetics , Bayes Theorem , Cell Cycle/genetics , Circadian Clocks/genetics , Computer Simulation , Databases, Genetic , Gene Expression Profiling , Gene Expression Regulation, Fungal , Gene Expression Regulation, Plant , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics