Results 1 - 20 of 97
1.
Nat Med ; 30(5): 1384-1394, 2024 May.
Article in English | MEDLINE | ID: mdl-38740997

ABSTRACT

How human genetic variation contributes to vaccine effectiveness in infants is unclear, and data are limited on these relationships in populations with African ancestries. We undertook genetic analyses of vaccine antibody responses in infants from Uganda (n = 1391), Burkina Faso (n = 353) and South Africa (n = 755), identifying associations between human leukocyte antigen (HLA) and antibody response for five of eight tested antigens spanning pertussis, diphtheria and hepatitis B vaccines. In addition, through HLA typing 1,702 individuals from 11 populations of African ancestry derived predominantly from the 1000 Genomes Project, we constructed an imputation resource, fine-mapping class II HLA-DR and DQ associations explaining up to 10% of antibody response variance in our infant cohorts. We observed differences in the genetic architecture of pertussis antibody response between the cohorts with African ancestries and an independent cohort with European ancestry, but found no in silico evidence of differences in HLA peptide binding affinity or breadth. Using immune cell expression quantitative trait loci datasets derived from African-ancestry samples from the 1000 Genomes Project, we found evidence of differential HLA-DRB1 expression correlating with inferred protection from pertussis following vaccination. This work suggests that HLA-DRB1 expression may play a role in vaccine response and should be considered alongside peptide selection to improve vaccine design.


Subjects
HLA-DRB1 Chains, Humans, HLA-DRB1 Chains/genetics, HLA-DRB1 Chains/immunology, Infant, Black People/genetics, Hepatitis B Vaccines/immunology, Quantitative Trait Loci, Male, Female, Uganda, Antibody Formation/genetics, Antibody Formation/immunology, Pertussis Vaccine/immunology, Pertussis Vaccine/genetics, Vaccination, Whooping Cough/prevention & control, Whooping Cough/immunology, Whooping Cough/genetics
2.
Am J Hum Genet ; 111(2): 295-308, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38232728

ABSTRACT

Infectious agents contribute significantly to the global burden of diseases through both acute infection and their chronic sequelae. We leveraged the UK Biobank to identify genetic loci that influence humoral immune response to multiple infections. From 45 genome-wide association studies in 9,611 participants from UK Biobank, we identified NFKB1 as a locus associated with quantitative antibody responses to multiple pathogens, including those from the herpes, retro-, and polyoma-virus families. An insertion-deletion variant thought to affect NFKB1 expression (rs28362491) was mapped as the likely causal variant and could play a key role in regulation of the immune response. Using 121 infection- and inflammation-related traits in 487,297 UK Biobank participants, we show that the deletion allele was associated with an increased risk of infection from diverse pathogens but had a protective effect against allergic disease. We propose that altered expression of NFKB1, as a result of the deletion, modulates hematopoietic pathways and likely impacts cell survival, antibody production, and inflammation. Taken together, we show that disruptions to the tightly regulated immune processes may tip the balance between exacerbated immune responses and allergy, or increased risk of infection and impaired resolution of inflammation.


Subjects
Genetic Predisposition to Disease, Hypersensitivity, Inflammation, Humans, Genome-Wide Association Study, Hypersensitivity/genetics, Inflammation/genetics, NF-kappa B p50 Subunit/genetics, UK Biobank
3.
Nat Genet ; 55(11): 1854-1865, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37814053

ABSTRACT

The analysis of longitudinal data from electronic health records (EHRs) has the potential to improve clinical diagnoses and enable personalized medicine, motivating efforts to identify disease subtypes from patient comorbidity information. Here we introduce an age-dependent topic modeling (ATM) method that provides a low-rank representation of longitudinal records of hundreds of distinct diseases in large EHR datasets. We applied ATM to 282,957 UK Biobank samples, identifying 52 diseases with heterogeneous comorbidity profiles; analyses of 211,908 All of Us samples produced concordant results. We defined subtypes of the 52 heterogeneous diseases based on their comorbidity profiles and compared genetic risk across disease subtypes using polygenic risk scores (PRSs), identifying 18 disease subtypes whose PRS differed significantly from other subtypes of the same disease. We further identified specific genetic variants with subtype-dependent effects on disease risk. In conclusion, ATM identifies disease subtypes with differential genome-wide and locus-specific genetic risk profiles.


Subjects
Genetic Predisposition to Disease, Population Health, Humans, Biological Specimen Banks, Genome-Wide Association Study/methods, Risk Factors, Comorbidity, Multifactorial Inheritance/genetics, United Kingdom/epidemiology
4.
Cell Genom ; 3(8): 100371, 2023 Aug 09.
Article in English | MEDLINE | ID: mdl-37601973

ABSTRACT

Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone.

5.
Nat Commun ; 14(1): 4023, 2023 07 07.
Article in English | MEDLINE | ID: mdl-37419925

ABSTRACT

Polygenic scores (PGSs) are individual-level measures that aggregate the genome-wide genetic predisposition to a given trait. As PGS have predominantly been developed using European-ancestry samples, trait prediction using such European ancestry-derived PGS is less accurate in non-European ancestry individuals. Although there has been recent progress in combining multiple PGS trained on distinct populations, the problem of how to maximize performance given a multiple-ancestry cohort is largely unexplored. Here, we investigate the effect of sample size and ancestry composition on PGS performance for fifteen traits in UK Biobank. For some traits, PGS estimated using a relatively small African-ancestry training set outperformed, on an African-ancestry test set, PGS estimated using a much larger European-ancestry only training set. We observe similar, but not identical, results when considering other minority-ancestry groups within UK Biobank. Our results emphasise the importance of targeted data collection from underrepresented groups in order to address existing disparities in PGS performance.


Subjects
Black People, Population Genetics, Multifactorial Inheritance, Humans, Black People/genetics, Data Collection, Genetic Predisposition to Disease, Genome-Wide Association Study, Minority Groups
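
The abstract of entry 5 describes a polygenic score as an individual-level aggregate of genome-wide genetic predisposition. A minimal sketch of the usual additive form follows, assuming the score is a weighted sum of effect-allele dosages. The function name, the weights and the dosages are hypothetical illustrations and are not taken from the paper.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages (0, 1 or 2 copies per variant).

    Assumption: a purely additive score; real pipelines also handle
    missing genotypes, strand alignment and ancestry-specific weights.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-variant effect sizes and one individual's dosages.
example_weights = [0.5, -0.25, 1.0]
example_dosages = [2, 1, 0]
example_score = polygenic_score(example_dosages, example_weights)
```

Because the weights are estimated in a training cohort, the same dosages can yield differently calibrated scores depending on the ancestry composition of that cohort, which is the effect the abstract investigates.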
6.
Nat Commun ; 13(1): 4398, 2022 07 29.
Article in English | MEDLINE | ID: mdl-35906236

ABSTRACT

Fetal growth restriction (FGR) affects 5-10% of pregnancies, and can have serious consequences for both mother and child. Prevention and treatment are limited because FGR pathogenesis is poorly understood. Genetic studies implicate KIR and HLA genes in FGR; however, linkage disequilibrium, genetic influence from both parents, and challenges with investigating human pregnancies make the risk alleles and their functional effects difficult to map. Here, we demonstrate that the interaction between the maternal KIR2DL1, expressed on uterine natural killer (NK) cells, and the paternally inherited HLA-C*0501, expressed on fetal trophoblast cells, leads to FGR in a humanized mouse model. We show that the KIR2DL1 and C*0501 interaction leads to pathogenic uterine arterial remodeling and modulation of uterine NK cell function. This initial effect cascades to altered transcriptional expression and intercellular communication at the maternal-fetal interface. These findings provide mechanistic insight into specific FGR risk alleles, and provide avenues of prevention and treatment.


Subjects
Fetal Growth Retardation, Trophoblasts, Animals, Cell Communication/genetics, Female, Fetal Growth Retardation/genetics, Fetal Growth Retardation/metabolism, Fetus/metabolism, HLA-C Antigens/genetics, HLA-C Antigens/metabolism, Mice, Pregnancy, Trophoblasts/metabolism
7.
PLoS Biol ; 20(5): e3001669, 2022 05.
Article in English | MEDLINE | ID: mdl-35639797

ABSTRACT

The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.


Subjects
Genomics, Metagenomics, Genomics/methods
8.
Nat Commun ; 13(1): 1818, 2022 04 05.
Article in English | MEDLINE | ID: mdl-35383168

ABSTRACT

Certain infectious agents are recognised causes of cancer and other chronic diseases. To understand the pathological mechanisms underlying such relationships, here we designed a Multiplex Serology platform to measure quantitative antibody responses against 45 antigens from 20 infectious agents including human herpes, hepatitis, polyoma, papilloma, and retroviruses, as well as Chlamydia trachomatis, Helicobacter pylori and Toxoplasma gondii, and then assayed a random subset of 9695 UK Biobank participants. We find seroprevalence estimates consistent with those expected from prior literature and confirm multiple associations of antibody responses with sociodemographic characteristics (e.g., lifetime sexual partners with C. trachomatis), HLA genetic variants (rs6927022 with Epstein-Barr virus (EBV) EBNA1 antibodies) and disease outcomes (human papillomavirus-16 seropositivity with cervical intraepithelial neoplasia, and EBV responses with multiple sclerosis). Our accessible dataset is one of the largest incorporating diverse infectious agents in a prospective UK cohort offering opportunities to improve our understanding of host-pathogen-disease relationships with significant clinical and public health implications.


Subjects
Epstein-Barr Virus Infections, Uterine Cervical Neoplasms, Biological Specimen Banks, Female, Human Herpesvirus 4/genetics, Humans, Prospective Studies, Seroepidemiologic Studies, United Kingdom/epidemiology
9.
Science ; 375(6583): eabi8264, 2022 02 25.
Article in English | MEDLINE | ID: mdl-35201891

ABSTRACT

The sequencing of modern and ancient genomes from around the world has revolutionized our understanding of human history and evolution. However, the problem of how best to characterize ancestral relationships from the totality of human genomic variation remains unsolved. Here, we address this challenge with nonparametric methods that enable us to infer a unified genealogy of modern and ancient humans. This compact representation of multiple datasets explores the challenges of missing and erroneous data and uses ancient samples to constrain and date relationships. We demonstrate the power of the method to recover relationships between individuals and populations as well as to identify descendants of ancient samples. Finally, we introduce a simple nonparametric estimator of the geographical location of ancestors that recapitulates key events in human history.


Subjects
Ancient DNA, Human Genome, Genomics, Pedigree, Africa, Human Chromosome Pair 20/genetics, Computer Simulation, Nucleic Acid Databases, Datasets as Topic, Molecular Evolution, Genetic Variation, Population Genetics, Geography, Haplotypes, Human Migration, Humans, Mutation, DNA Sequence Analysis, Spatio-Temporal Analysis, Nonparametric Statistics
10.
Nat Genet ; 53(11): 1543-1552, 2021 11.
Article in English | MEDLINE | ID: mdl-34741163

ABSTRACT

Irritable bowel syndrome (IBS) results from disordered brain-gut interactions. Identifying susceptibility genes could highlight the underlying pathophysiological mechanisms. We designed a digestive health questionnaire for UK Biobank and combined identified cases with IBS with independent cohorts. We conducted a genome-wide association study with 53,400 cases and 433,201 controls and replicated significant associations in a 23andMe panel (205,252 cases and 1,384,055 controls). Our study identified and confirmed six genetic susceptibility loci for IBS. Implicated genes included NCAM1, CADM2, PHF2/FAM120A, DOCK9, CKAP2/TPTE2P3 and BAG6. The first four are associated with mood and anxiety disorders, expressed in the nervous system, or both. Mirroring this, we also found strong genome-wide correlation between the risk of IBS and anxiety, neuroticism and depression (rg > 0.5). Additional analyses suggested this arises due to shared pathogenic pathways rather than, for example, anxiety causing abdominal symptoms. Implicated mechanisms require further exploration to help understand the altered brain-gut interactions underlying IBS.


Subjects
Anxiety Disorders/genetics, Irritable Bowel Syndrome/genetics, Mood Disorders/genetics, Aged, CD56 Antigen/genetics, Cell Adhesion Molecules/genetics, Cytoskeletal Proteins/genetics, Female, Genetic Predisposition to Disease, Genome-Wide Association Study, Guanine Nucleotide Exchange Factors/genetics, Homeodomain Proteins/genetics, Humans, Irritable Bowel Syndrome/epidemiology, Male, Middle Aged, Molecular Chaperones/genetics, Single Nucleotide Polymorphism, United Kingdom/epidemiology
11.
PLoS Comput Biol ; 17(8): e1009287, 2021 08.
Article in English | MEDLINE | ID: mdl-34411093

ABSTRACT

There is an abundance of malaria genetic data being collected from the field, yet using these data to understand the drivers of regional epidemiology remains a challenge. A key issue is the lack of models that relate parasite genetic diversity to epidemiological parameters. Classical models in population genetics characterize changes in genetic diversity in relation to demographic parameters, but fail to account for the unique features of the malaria life cycle. In contrast, epidemiological models, such as the Ross-Macdonald model, capture malaria transmission dynamics but do not consider genetics. Here, we have developed an integrated model encompassing both parasite evolution and regional epidemiology. We achieve this by combining the Ross-Macdonald model with an intra-host continuous-time Moran model, thus explicitly representing the evolution of individual parasite genomes in a traditional epidemiological framework. Implemented as a stochastic simulation, we use the model to explore relationships between measures of parasite genetic diversity and parasite prevalence, a widely-used metric of transmission intensity. First, we explore how varying parasite prevalence influences genetic diversity at equilibrium. We find that multiple genetic diversity statistics are correlated with prevalence, but the strength of the relationships depends on whether variation in prevalence is driven by host- or vector-related factors. Next, we assess the responsiveness of a variety of statistics to malaria control interventions, finding that those related to mixed infections respond quickly (∼months) whereas other statistics, such as nucleotide diversity, may take decades to respond. These findings provide insights into the opportunities and challenges associated with using genetic data to monitor malaria epidemiology.


Subjects
Genetic Variation, Falciparum Malaria/epidemiology, Plasmodium falciparum/pathogenicity, Animals, Humans, Falciparum Malaria/parasitology, Falciparum Malaria/transmission, Theoretical Models, Plasmodium falciparum/genetics, Prevalence
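
The abstract of entry 11 builds on the Ross-Macdonald model of malaria transmission. As a rough illustration of the epidemiological half only, the sketch below integrates one common textbook form of the Ross-Macdonald equations with a fixed-step Euler scheme. It is deterministic and omits the intra-host Moran model and all genetics, so it is not the authors' stochastic simulation. The parameter names (m, a, b, c, r, g) and values are assumptions chosen for illustration.

```python
def simulate_ross_macdonald(m, a, b, c, r, g, x0=0.01, z0=0.0, days=2000.0, dt=0.1):
    """Euler-integrate a textbook Ross-Macdonald system and return the final (x, z).

    x: fraction of infected hosts, z: fraction of infectious vectors.
    m: vectors per host, a: bites per vector per day,
    b: vector-to-host transmission probability per bite,
    c: host-to-vector transmission probability per bite,
    r: host recovery rate per day, g: vector death rate per day.
    """
    x, z = x0, z0
    for _ in range(int(days / dt)):
        dx = m * a * b * z * (1.0 - x) - r * x   # new host infections minus recoveries
        dz = a * c * x * (1.0 - z) - g * z       # new vector infections minus deaths
        x += dt * dx
        z += dt * dz
    return x, z


def equilibrium_prevalence(m, a, b, c, r, g):
    """Closed-form endemic host prevalence of the same system (0 when R0 <= 1)."""
    r0 = m * a * a * b * c / (r * g)
    if r0 <= 1.0:
        return 0.0
    return (r0 - 1.0) / (r0 + a * c / g)
```

In this form, prevalence can be raised either through host-related parameters (such as r) or vector-related ones (such as m or g), which mirrors the abstract's distinction between host- and vector-driven variation in prevalence.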
12.
PLoS Genet ; 17(8): e1009723, 2021 08.
Article in English | MEDLINE | ID: mdl-34437535

ABSTRACT

Inherited genetic variation contributes to individual risk for many complex diseases and is increasingly being used for predictive patient stratification. Previous work has shown that genetic factors are not equally relevant to human traits across age and other contexts, though the reasons for such variation are not clear. Here, we introduce methods to infer the form of the longitudinal relationship between genetic relative risk for disease and age and to test whether all genetic risk factors behave similarly. We use a proportional hazards model within an interval-based censoring methodology to estimate age-varying individual variant contributions to genetic relative risk for 24 common diseases within the British ancestry subset of UK Biobank, applying a Bayesian clustering approach to group variants by their relative risk profile over age and permutation tests for age dependency and multiplicity of profiles. We find evidence for age-varying relative risk profiles in nine diseases, including hypertension, skin cancer, atherosclerotic heart disease, hypothyroidism and calculus of gallbladder, several of which show evidence, albeit weak, for multiple distinct profiles of genetic relative risk. The predominant pattern shows genetic risk factors having the greatest relative impact on risk of early disease, with a monotonic decrease over time, at least for the majority of variants, although the magnitude and form of the decrease varies among diseases. As a consequence, for diseases where genetic relative risk decreases over age, genetic risk factors have stronger explanatory power among younger populations, compared to older ones. We show that these patterns cannot be explained by a simple model involving the presence of unobserved covariates such as environmental factors. We discuss possible models that can explain our observations and the implications for genetic risk prediction.


Subjects
Age Factors, Disease/genetics, Bayes Theorem, Humans, Statistical Models, Proportional Hazards Models, Risk Factors
13.
Am J Cardiol ; 148: 157-164, 2021 06 01.
Article in English | MEDLINE | ID: mdl-33675770

ABSTRACT

The American College of Cardiology / American Heart Association pooled cohort equations tool (ASCVD-PCE) is currently recommended to assess 10-year risk for atherosclerotic cardiovascular disease (ASCVD). ASCVD-PCE does not currently include genetic risk factors. Polygenic risk scores (PRSs) have been shown to offer a powerful new approach to measuring genetic risk for common diseases, including ASCVD, and to enhance risk prediction when combined with ASCVD-PCE. Most work to date, including the assessment of tools, has focused on performance in individuals of European ancestries. Here we present evidence for the clinical validation of a new integrated risk tool (IRT), ASCVD-IRT, which combines ASCVD-PCE with PRS to predict 10-year risk of ASCVD across diverse ethnicity and ancestry groups. We demonstrate improved predictive performance of ASCVD-IRT over ASCVD-PCE, not only in individuals of self-reported White ethnicities (net reclassification improvement [NRI] with 95% confidence interval = 2.7% [1.1 to 4.2]) but also Black / African American / Black Caribbean / Black African (NRI = 2.5% [0.6 to 4.3]) and South Asian (Indian, Bangladeshi or Pakistani) ethnicities (NRI = 8.7% [3.1 to 14.4]). NRI confidence intervals were wider and included zero for ethnicities with smaller sample sizes, including Hispanic (NRI = 7.5% [-1.4 to 16.5]), but PRS effect sizes in these ethnicities were significant and of comparable size to those seen in individuals of White ethnicities. Comparable results were obtained when individuals were analyzed by genetically inferred ancestry. Together, these results validate the performance of ASCVD-IRT in multiple ethnicities and ancestries, and favor their generalization to all ethnicities and ancestries.


Subjects
Atherosclerosis/epidemiology, Genetic Predisposition to Disease, Heart Disease Risk Factors, Adult, Aged, Western Asia, Asian People, Atherosclerosis/ethnology, Atherosclerosis/genetics, Black People, Cohort Studies, Female, Humans, Male, Middle Aged, Reproducibility of Results, White People
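
The abstract of entry 13 reports net reclassification improvement (NRI) when a polygenic risk score is added to an existing risk tool. The sketch below computes the standard two-category NRI, assuming each person is classified as above or below a single risk threshold by the old and the new model. The abstract does not state the threshold or the exact NRI variant used, so the function and the example data are hypothetical.

```python
def net_reclassification_improvement(old_high, new_high, event):
    """Two-category NRI.

    old_high / new_high: booleans, True if the old / new model puts the
    person above the risk threshold. event: True if the outcome occurred.
    NRI = [P(up | event) - P(down | event)] + [P(down | no event) - P(up | no event)].
    """
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, had_event in zip(old_high, new_high, event):
        moved_up = bool(new) and not bool(old)
        moved_down = bool(old) and not bool(new)
        if had_event:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_nonevent += 1
            up_nonevent += moved_up
            down_nonevent += moved_down
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent


# Hypothetical toy cohort: four people with events, four without.
toy_old = [False, False, True, True, True, True, False, False]
toy_new = [True, True, True, False, False, False, False, False]
toy_event = [True, True, True, True, False, False, False, False]
toy_nri = net_reclassification_improvement(toy_old, toy_new, toy_event)
```

A positive value means the new model moves people with events upward and people without events downward more often than the reverse. Confidence intervals such as those quoted in the abstract would typically come from resampling, which this sketch does not attempt.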
14.
Genome Res ; 30(8): 1154-1169, 2020 08.
Article in English | MEDLINE | ID: mdl-32817236

ABSTRACT

The characterization of de novo mutations in regions of high sequence and structural diversity from whole-genome sequencing data remains highly challenging. Complex structural variants tend to arise in regions of high repetitiveness and low complexity, challenging both de novo assembly, in which short reads do not capture the long-range context required for resolution, and mapping approaches, in which improper alignment of reads to a reference genome that is highly diverged from that of the sample can lead to false or partial calls. Long-read technologies can potentially solve such problems but are currently unfeasible to use at scale. Here we present Corticall, a graph-based method that combines the advantages of multiple technologies and prior data sources to detect arbitrary classes of genetic variant. We construct multisample, colored de Bruijn graphs from short-read data for all samples, align long-read-derived haplotypes and multiple reference data sources to restore graph connectivity information, and call variants using graph path-finding algorithms and a model for simultaneous alignment and recombination. We validate and evaluate the approach using extensive simulations and use it to characterize the rate and spectrum of de novo mutation events in 119 progeny from four Plasmodium falciparum experimental crosses, using long-read data on the parents to inform reconstructions of the progeny and to detect several known and novel nonallelic homologous recombination events.


Subjects
Protozoan Genome/genetics, High-Throughput Nucleotide Sequencing/methods, Mutation/genetics, Plasmodium falciparum/genetics, Whole Genome Sequencing/methods, Algorithms, Base Sequence, Genetic Variation/genetics, Sequence Alignment, DNA Sequence Analysis/methods, Software
15.
PLoS Genet ; 16(5): e1008619, 2020 05.
Article in English | MEDLINE | ID: mdl-32369493

ABSTRACT

Coalescent simulations are widely used to examine the effects of evolution and demographic history on the genetic makeup of populations. Thanks to recent progress in algorithms and data structures, simulators such as the widely-used msprime now provide genome-wide simulations for millions of individuals. However, this software relies on classic coalescent theory and its assumptions that sample sizes are small and that the region being simulated is short. Here we show that coalescent simulations of long regions of the genome exhibit large biases in identity-by-descent (IBD), long-range linkage disequilibrium (LD), and ancestry patterns, particularly when the sample size is large. We present a Wright-Fisher extension to msprime, and show that it produces more realistic distributions of IBD, LD, and ancestry proportions, while also addressing more subtle biases of the coalescent. Further, these extensions are more computationally efficient than state-of-the-art coalescent simulations when simulating long regions, including whole-genome data. For shorter regions, efficiency can be maintained via a hybrid model which simulates the recent past under the Wright-Fisher model and uses coalescent simulations in the distant past.


Subjects
Algorithms, Base Sequence/physiology, Population Genetics/methods, Genome-Wide Association Study/methods, Genetic Models, Cohort Studies, Computer Simulation, Molecular Evolution, Genome/genetics, Genome-Wide Association Study/statistics & numerical data, Humans, Linkage Disequilibrium, Genetic Recombination/physiology, Sample Size
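
The abstract of entry 15 contrasts classic coalescent simulation with a Wright-Fisher model. The toy sketch below traces sampled lineages backward in time under a haploid Wright-Fisher model, where every lineage picks its parent uniformly at random in each generation. It is not the msprime implementation and ignores recombination and diploidy; the function name and parameters are assumptions for illustration.

```python
import random


def generations_to_mrca(sample_size, population_size, rng):
    """Count generations until all sampled lineages share one ancestor.

    Haploid Wright-Fisher, backward in time: each surviving lineage chooses
    a parent uniformly among population_size individuals, and lineages that
    choose the same parent merge. Several lineages may merge in a single
    generation, which the standard coalescent approximation does not allow.
    """
    if sample_size < 1 or population_size < 1:
        raise ValueError("sample_size and population_size must be positive")
    lineages = sample_size
    generations = 0
    while lineages > 1:
        parents = {rng.randrange(population_size) for _ in range(lineages)}
        lineages = len(parents)
        generations += 1
    return generations
```

For example, `generations_to_mrca(10, 1000, random.Random(1))` returns the number of generations for one seeded replicate. When the sample size approaches the population size, simultaneous mergers become common, which is one way to see why the small-sample assumption mentioned in the abstract matters.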
16.
PLoS Biol ; 18(1): e3000586, 2020 01.
Article in English | MEDLINE | ID: mdl-31951611

ABSTRACT

The origin and fate of new mutations within species is the fundamental process underlying evolution. However, while much attention has been focused on characterizing the presence, frequency, and phenotypic impact of genetic variation, the evolutionary histories of most variants are largely unexplored. We have developed a nonparametric approach for estimating the date of origin of genetic variants in large-scale sequencing data sets. The accuracy and robustness of the approach is demonstrated through simulation. Using data from two publicly available human genomic diversity resources, we estimated the age of more than 45 million single-nucleotide polymorphisms (SNPs) in the human genome and release the Atlas of Variant Age as a public online database. We characterize the relationship between variant age and frequency in different geographical regions and demonstrate the value of age information in interpreting variants of functional and selective importance. Finally, we use allele age estimates to power a rapid approach for inferring the ancestry shared between individual genomes and to quantify genealogical relationships at different points in the past, as well as to describe and explore the evolutionary history of modern human populations.


Subjects
Genetic Speciation, Population Genetics/methods, Single Nucleotide Polymorphism, Racial Groups/genetics, Age Factors, Alleles, Computer Simulation, Datasets as Topic, Molecular Evolution, Gene Frequency, Genetic Variation, Human Genome, High-Throughput Nucleotide Sequencing, Humans, Pedigree, Phylogeny, DNA Sequence Analysis, Statistics as Topic/methods, Time Factors
17.
Nat Genet ; 52(1): 126-134, 2020 01.
Article in English | MEDLINE | ID: mdl-31873298

ABSTRACT

Genetic risk factors frequently affect multiple common human diseases, providing insight into shared pathophysiological pathways and opportunities for therapeutic development. However, systematic identification of genetic profiles of disease risk is limited by the availability of both comprehensive clinical data on population-scale cohorts and the lack of suitable statistical methodology that can handle the scale of and differential power inherent in multi-phenotype data. Here, we develop a disease-agnostic approach to cluster the genetic risk profiles for 3,025 genome-wide independent loci across 19,155 disease classification codes from 320,644 participants in the UK Biobank, representing a large and heterogeneous population. We identify 339 distinct disease association profiles and use multiple approaches to link clusters to the underlying biological pathways. We show how clusters can decompose the variance and covariance in risk for disease, thereby identifying underlying biological processes and their impact. We demonstrate the use of clusters in defining disease relationships and their potential in informing therapeutic strategies.


Subjects
Biological Specimen Banks, Inborn Genetic Diseases/genetics, Genetic Loci, Genetic Predisposition to Disease, Genome-Wide Association Study, Single Nucleotide Polymorphism, Heritable Quantitative Trait, Adult, Aged, Female, Gene-Environment Interaction, Humans, Male, Middle Aged, Phenotype, Prospective Studies, Risk Factors, United Kingdom
18.
Nat Genet ; 51(11): 1660, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31591513

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

19.
Nat Genet ; 51(9): 1330-1338, 2019 09.
Article in English | MEDLINE | ID: mdl-31477934

ABSTRACT

Inferring the full genealogical history of a set of DNA sequences is a core problem in evolutionary biology, because this history encodes information about the events and forces that have influenced a species. However, current methods are limited, and the most accurate techniques are able to process no more than a hundred samples. As datasets that consist of millions of genomes are now being collected, there is a need for scalable and efficient inference methods to fully utilize these resources. Here we introduce an algorithm that is able to not only infer whole-genome histories with comparable accuracy to the state-of-the-art but also process four orders of magnitude more sequences. The approach also provides an 'evolutionary encoding' of the data, enabling efficient calculation of relevant statistics. We apply the method to human data from the 1000 Genomes Project, Simons Genome Diversity Project and UK Biobank, showing that the inferred genealogies are rich in biological signal and efficient to process.


Subjects
Algorithms, Molecular Evolution, Population Genetics, Human Genome, Pedigree, Genetic Selection, Computer Simulation, Datasets as Topic, Haplotypes, Humans, Genetic Models, Mutation, Single Nucleotide Polymorphism, Population Density
20.
Nat Commun ; 10(1): 3017, 2019 07 09.
Article in English | MEDLINE | ID: mdl-31289267

ABSTRACT

Differences among hosts, resulting from genetic variation in the immune system or heterogeneity in drug treatment, can impact within-host pathogen evolution. Genetic association studies can potentially identify such interactions. However, extensive and correlated genetic population structure in hosts and pathogens presents a substantial risk of confounding analyses. Moreover, the multiple testing burden of interaction scanning can potentially limit power. We present a Bayesian approach for detecting host influences on pathogen evolution that exploits vast existing data sets of pathogen diversity to improve power and control for stratification. The approach models key processes, including recombination and selection, and identifies regions of the pathogen genome affected by host factors. Our simulations and empirical analysis of drug-induced selection on the HIV-1 genome show that the method recovers known associations and has superior precision-recall characteristics compared to other approaches. We build a high-resolution map of HLA-induced selection in the HIV-1 genome, identifying novel epitope-allele combinations.


Subjects
Molecular Evolution, HIV-1/genetics, HLA Antigens/immunology, Host-Pathogen Interactions/genetics, Genetic Models, Anti-HIV Agents/pharmacology, Anti-HIV Agents/therapeutic use, Bayes Theorem, Datasets as Topic, Epitopes/drug effects, Epitopes/genetics, Epitopes/immunology, Viral Genome/drug effects, HIV Infections/drug therapy, HIV Infections/immunology, HIV Infections/virology, HIV-1/drug effects, HIV-1/immunology, Host-Pathogen Interactions/immunology, Humans, Genetic Recombination/drug effects, Genetic Recombination/immunology, Genetic Selection/drug effects, Genetic Selection/immunology