1.
Comput Biol Med ; 178: 108799, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38925087

ABSTRACT

Magnetic resonance imaging (MRI) has become an essential frontline technique for detecting brain tumors. However, segmenting tumors manually from scans is laborious and time-consuming, which has driven a trend towards fully automated methods for precise tumor segmentation in MRI scans. Accurate tumor segmentation is crucial for improved diagnosis, treatment, and prognosis. This study benchmarks and evaluates four widely used CNN-based methods for brain tumor segmentation: CaPTk, 2DVNet, EnsembleUNets, and ResNet50. Using 1251 multimodal MRI scans from the BraTS2021 dataset, we compared the performance of these methods against a reference dataset of segmented images assisted by radiologists. The comparison was conducted on the segmented images directly and on radiomic features extracted from the segmented images using pyRadiomics. Performance was assessed using the Dice Similarity Coefficient (DSC) and Hausdorff Distance (HD). EnsembleUNets performed best, achieving a DSC of 0.93 and an HD of 18. Comparative analysis of radiomic features confirmed EnsembleUNets as the most precise segmentation method, with a Concordance Correlation Coefficient (CCC) of 0.79, a Total Deviation Index (TDI) of 1.14, and a Root Mean Square Error (RMSE) of 0.53. Validation on an independent dataset of 611 samples (UPENN-GBM) further supported the accuracy of EnsembleUNets, with a DSC of 0.85 and an HD of 17.5. These findings provide insight into the efficacy of EnsembleUNets and support informed decisions for accurate brain tumor segmentation.
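
The Dice Similarity Coefficient named in this abstract is conventionally defined as twice the overlap between two masks divided by the sum of their sizes. The sketch below illustrates that standard definition on flattened binary masks; the function name, the toy masks, and the convention for two empty masks are assumptions made for illustration and are not taken from the study's pipeline.

```python
def dice_coefficient(pred, ref):
    """Dice Similarity Coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as tumor in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    # Total number of tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks: 2 shared tumor voxels, 3 tumor voxels in each mask.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
print(round(dice_coefficient(predicted, reference), 3))
```

A DSC of 1 means the two masks coincide exactly, and 0 means they share no tumor voxels.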

2.
Stat Med ; 42(26): 4867-4885, 2023 Nov 20.
Article in English | MEDLINE | ID: mdl-37643728

ABSTRACT

Polygenicity refers to the phenomenon that multiple genetic variants have a nonzero effect on a complex trait. It is defined as the proportion of genetic variants with a nonzero effect on the trait. Evaluation of polygenicity can provide valuable insights into the genetic architecture of the trait. Several recent works have attempted to estimate polygenicity at the single nucleotide polymorphism level. However, evaluating polygenicity at the gene level can be biologically more meaningful. We propose the notion of gene-level polygenicity, defined as the proportion of genes having a nonzero effect on the trait under the framework of a transcriptome-wide association study. We introduce a Bayesian approach, genepoly, to estimate this quantity for a trait. The method is based on a spike-and-slab prior and simultaneously estimates the subset of non-null genes. Our simulation study shows that genepoly efficiently estimates gene-level polygenicity. The method produces a downward bias for small choices of trait heritability due to a non-null gene, which diminishes rapidly with an increase in the genome-wide association study (GWAS) sample size. While identifying the subset of non-null genes, genepoly offers a high level of specificity and an overall good level of sensitivity; the sensitivity increases as the sample sizes of the reference panel expression data and the GWAS data increase. We applied the method to seven phenotypes in the UK Biobank, integrating expression data. We find height to be the most polygenic and asthma to be the least polygenic.

3.
J Natl Cancer Inst ; 115(6): 712-732, 2023 06 08.
Article in English | MEDLINE | ID: mdl-36929942

ABSTRACT

BACKGROUND: The shared inherited genetic contribution to risk of different cancers is not fully known. In this study, we leverage results from 12 cancer genome-wide association studies (GWAS) to quantify pairwise genome-wide genetic correlations across cancers and identify novel cancer susceptibility loci. METHODS: We collected GWAS summary statistics for 12 solid cancers based on 376 759 participants with cancer and 532 864 participants without cancer of European ancestry. The included cancer types were breast, colorectal, endometrial, esophageal, glioma, head and neck, lung, melanoma, ovarian, pancreatic, prostate, and renal cancers. We conducted cross-cancer GWAS and transcriptome-wide association studies to discover novel cancer susceptibility loci. Finally, we assessed the extent of variant-specific pleiotropy among cancers at known and newly identified cancer susceptibility loci. RESULTS: We observed widespread but modest genome-wide genetic correlations across cancers. In cross-cancer GWAS and transcriptome-wide association studies, we identified 15 novel cancer susceptibility loci. Additionally, we identified multiple variants at 77 distinct loci with strong evidence of being associated with at least 2 cancer types by testing for pleiotropy at known cancer susceptibility loci. CONCLUSIONS: Overall, these results suggest that some genetic risk variants are shared among cancers, though much of cancer heritability is cancer-specific and thus tissue-specific. The increase in statistical power associated with larger sample sizes in cross-disease analysis allows for the identification of novel susceptibility regions. Future studies incorporating data on multiple cancer types are likely to identify additional regions associated with the risk of multiple cancer types.


Subjects
Genome-Wide Association Study; Neoplasms; Male; Humans; Genome-Wide Association Study/methods; Genetic Predisposition to Disease; Neoplasms/genetics; Risk Factors; Transcriptome; Polymorphism, Single Nucleotide
4.
Entropy (Basel) ; 24(9), 2022 Aug 25.
Article in English | MEDLINE | ID: mdl-36141075

ABSTRACT

This paper considers the problem of comparing several means under the one-way Analysis of Variance (ANOVA) setup. In ANOVA, outliers and heavy-tailed error distributions can seriously distort the estimated treatment effect, leading to false positive or false negative test results. We propose a robust test of ANOVA using an M-estimator based on the density power divergence. Compared with the existing robust and non-robust approaches, the proposed testing procedure is less affected by data contamination and improves the analysis. The asymptotic properties of the proposed test are derived under some regularity conditions. The finite-sample performance of the proposed test is examined via a series of Monte Carlo experiments and two empirical data examples: a bone marrow transplant dataset and a glucose level dataset. The results produced by the proposed testing procedure compare favorably with those of the classical ANOVA and of robust tests based on Huber's M-estimator and Tukey's MM-estimator.

5.
J Genet ; 101, 2022.
Article in English | MEDLINE | ID: mdl-35129133

ABSTRACT

In a genome-wide association study (GWAS) of a complex phenotype, a large number of variants, many with small effect sizes, are found to contribute to the variability of the phenotype. After such variants have been identified in a GWAS, it is of interest to estimate the risk jointly conferred by the variants. We propose three different strategies for combining the risk SNPs to calculate an allele dosage score. Using simulations, we evaluate the different measures of allele dosage score with respect to the risk prediction accuracy for a binary trait and the proportion of variance explained for a quantitative trait. For a binary trait, an allele dosage score based on the log odds ratio performs marginally better than the other two measures. For a quantitative trait, the measure based on the standardized slope coefficient in the linear regression of the trait on single-nucleotide polymorphism (SNP) genotypes performs better than the measures using weights proportional to the log P-value and the proportion of variance explained. We demonstrate the utility of these measures using real data on type 2 diabetes and fasting blood sugar level in a south Indian population.
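
The abstract does not spell out the formula for the allele dosage score, so the sketch below shows only one common form of a log-odds-ratio-weighted score: the sum over risk SNPs of the risk-allele count multiplied by the log odds ratio. The function name, the genotype coding (0, 1, or 2 risk alleles), and the example odds ratios are assumptions for illustration, not the authors' definition.

```python
import math


def allele_dosage_score(risk_allele_counts, odds_ratios):
    """Weighted sum of risk-allele counts using log odds ratios as weights."""
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    if any(g not in (0, 1, 2) for g in risk_allele_counts):
        raise ValueError("risk-allele counts must be 0, 1, or 2")
    return sum(g * math.log(o) for g, o in zip(risk_allele_counts, odds_ratios))


# Hypothetical individual genotyped at three risk SNPs.
counts = [2, 0, 1]
ors = [1.2, 1.5, 1.1]
print(allele_dosage_score(counts, ors))
```

Under this form, a SNP with an odds ratio of 1 contributes nothing to the score regardless of genotype, which is the usual motivation for weighting by the log odds ratio instead of the odds ratio itself.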


Subjects
Diabetes Mellitus, Type 2; Polymorphism, Single Nucleotide; Genome-Wide Association Study/methods; Genotype; Humans; Phenotype
6.
Eur J Hum Genet ; 30(5): 547-554, 2022 05.
Article in English | MEDLINE | ID: mdl-34949768

ABSTRACT

In genetic studies of psychiatric disorders in the pre-genome-wide association study (GWAS) era, one of the most commonly studied loci is the serotonin transporter (SLC6A4) promoter polymorphism, a 43-base-pair insertion/deletion polymorphism in the promoter region (5-HTTLPR). The genetic association signals between 5-HTTLPR and psychiatric phenotypes, however, have been inconsistent across many studies. Since the polymorphism cannot be tested via available SNP arrays, we had previously proposed an efficient machine learning algorithm to predict the genotypes of 5-HTTLPR based on the genotypes of eight nearby SNPs, which requires access to individual-level genotype and phenotype data. To utilize the advantage of publicly available GWAS summary statistics obtained from studies with very large sample sizes, we develop a GWAS summary-statistics-based approach for testing the variable number of tandem repeat (VNTR) associations with various phenotypes. We first cross-verify the accuracy of the summary-statistics-based approach for 61 phenotypes in the UK Biobank. Since we observed a strong similarity between the predicted individual-level 5-HTTLPR genotype-based approach and the summary-statistics-based approach, we applied our method to the available neurobehavioral GWAS summary statistics data obtained from large-scale GWAS. We found no genome-wide significant evidence for association between 5-HTTLPR and any of the neurobehavioral traits. We did observe, however, genome-wide significant evidence for association between this locus and human adult height, BMI, and total cholesterol. Our summary-statistics-based approach provides a systematic way to examine the role of VNTRs and related types of genetic polymorphisms in disease risk and trait susceptibility of phenotypes for which large-scale GWAS summary statistics data are available.


Subjects
Genome-Wide Association Study; Serotonin Plasma Membrane Transport Proteins; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide; Promoter Regions, Genetic; Serotonin Plasma Membrane Transport Proteins/genetics; Tandem Repeat Sequences
7.
HGG Adv ; 2(3): 100041, 2021 Jul 08.
Article in English | MEDLINE | ID: mdl-34355204

ABSTRACT

Genome-wide association studies (GWASs) have identified thousands of cancer risk loci revealing many risk regions shared across multiple cancers. Characterizing the cross-cancer shared genetic basis can increase our understanding of global mechanisms of cancer development. In this study, we collected GWAS summary statistics based on up to 375,468 cancer cases and 530,521 controls for fourteen types of cancer, including breast (overall, estrogen receptor [ER]-positive, and ER-negative), colorectal, endometrial, esophageal, glioma, head/neck, lung, melanoma, ovarian, pancreatic, prostate, and renal cancer, to characterize the shared genetic basis of cancer risk. We identified thirteen pairs of cancers with statistically significant local genetic correlations across eight distinct genomic regions. Specifically, the 5p15.33 region, harboring the TERT and CLPTM1L genes, showed statistically significant local genetic correlations for multiple cancer pairs. We conducted a cross-cancer fine-mapping of the 5p15.33 region based on eight cancers that showed genome-wide significant associations in this region (ER-negative breast, colorectal, glioma, lung, melanoma, ovarian, pancreatic, and prostate cancer). We used an iterative analysis pipeline implementing a subset-based meta-analysis approach based on cancer-specific conditional analyses and identified ten independent cross-cancer associations within this region. For each signal, we conducted cross-cancer fine-mapping to prioritize the most plausible causal variants. Our findings provide a more in-depth understanding of the shared inherited basis across human cancers and expand our knowledge of the 5p15.33 region in carcinogenesis.

8.
PLoS Comput Biol ; 17(5): e1008915, 2021 05.
Article in English | MEDLINE | ID: mdl-34019542

ABSTRACT

Genetic predisposition for complex traits often acts through multiple tissues at different time points during development. As a simple example, the genetic predisposition for obesity could be manifested either through inherited variants that control metabolism through regulation of genes expressed in the brain, or that control fat storage through dysregulation of genes expressed in adipose tissue, or both. Here we describe a statistical approach that leverages tissue-specific expression quantitative trait loci (eQTLs) corresponding to tissue-specific genes to prioritize a relevant tissue underlying the genetic predisposition of a given individual for a complex trait. Unlike existing approaches that prioritize relevant tissues for the trait in the population, our approach probabilistically quantifies the tissue-wise genetic contribution to the trait for a given individual. We hypothesize that for a subgroup of individuals the genetic contribution to the trait can be mediated primarily through a specific tissue. Through simulations using the UK Biobank, we show that our approach can predict the relevant tissue accurately and can cluster individuals according to their tissue-specific genetic architecture. We analyze body mass index (BMI) and waist-to-hip ratio adjusted for BMI (WHRadjBMI) in the UK Biobank to identify subgroups of individuals whose genetic predisposition acts primarily through brain versus adipose tissue, and adipose versus muscle tissue, respectively. Notably, we find that these individuals have specific phenotypic features beyond BMI and WHRadjBMI that distinguish them from random individuals in the data, suggesting biological effects of tissue-specific genetic contribution for these traits.


Subjects
Multifactorial Inheritance; Quantitative Trait Loci; Adipose Tissue/metabolism; Algorithms; Bayes Theorem; Body Mass Index; Brain/metabolism; Computational Biology; Computer Simulation; Gene Expression; Genetic Predisposition to Disease; Humans; Models, Genetic; Obesity/genetics; Obesity/pathology; Organ Specificity; Phenotype; Polymorphism, Single Nucleotide; Software; Tissue Distribution
9.
PLoS Genet ; 17(4): e1008973, 2021 04.
Article in English | MEDLINE | ID: mdl-33831007

ABSTRACT

Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values. Consequently, TWAS power can be low when expression quantitative trait locus (eQTL) data used to train the genetic predictors have small sample sizes, or when data from causally relevant tissues are not available. Here, we propose to address these issues by integrating multiple tissues in the TWAS using sparse canonical correlation analysis (sCCA). We show that sCCA-TWAS combined with single-tissue TWAS using an aggregate Cauchy association test (ACAT) outperforms traditional single-tissue TWAS. In empirically motivated simulations, the sCCA+ACAT approach yielded the highest power to detect a gene associated with phenotype, even when expression in the causal tissue was not directly measured, while controlling the Type I error when there is no association between gene expression and phenotype. For example, when gene expression explains 2% of the variability in outcome and the GWAS sample size is 20,000, the average power difference between the ACAT combined test of sCCA features and single-tissue, versus single-tissue combined with the Generalized Berk-Jones (GBJ) method, single-tissue combined with S-MultiXcan, UTMOST, or summarizing cross-tissue expression patterns using Principal Component Analysis (PCA) approaches, was 5%, 8%, 5% and 38%, respectively. The gain in power is likely due to sCCA cross-tissue features being more likely to be detectably heritable. When applied to publicly available summary statistics from 10 complex traits, the sCCA+ACAT test was able to increase the number of testable genes and identify on average an additional 400 gene-trait associations that single-trait TWAS missed. Our results suggest that aggregating eQTL data across multiple tissues using sCCA can improve the sensitivity of TWAS while controlling for the false positive rate.
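
The aggregate Cauchy association test mentioned above combines p-values by mapping each one to a Cauchy-distributed quantity, taking a weighted sum, and converting the sum back to a p-value. The sketch below follows that general Cauchy combination formula; the function name, the default equal weights, and the example p-values are assumptions, and nothing here reproduces the sCCA+ACAT pipeline itself.

```python
import math


def acat_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule used by ACAT."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumed default: equal weights.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("one weight is required per p-value")
    # Each p-value maps to a standard Cauchy variate under the null.
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))
    # Convert the normalized statistic back using the Cauchy tail.
    return 0.5 - math.atan(statistic / sum(weights)) / math.pi


# Hypothetical p-values for one gene: one from a single-tissue test, one from
# a cross-tissue feature.
print(acat_combine([0.001, 0.8]))
```

A single input is returned unchanged, and one small p-value dominates the combination, which is why the rule suits settings where only one of several tests is expected to carry the signal.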


Subjects
Genome-Wide Association Study/statistics & numerical data; Models, Genetic; Multivariate Analysis; Transcriptome/genetics; Computer Simulation; Gene Expression Regulation/genetics; Genetic Predisposition to Disease; Humans; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics
10.
Bioinformatics ; 36(24): 5640-5648, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33453114

ABSTRACT

MOTIVATION: While gene-environment (GxE) interactions contribute importantly to many different phenotypes, detecting such interactions requires well-powered studies and has proven difficult. To address this, we combine two approaches to improve GxE power: simultaneously evaluating multiple phenotypes and using a two-step analysis approach. Previous work shows that the power to identify a main genetic effect can be improved by simultaneously analyzing multiple related phenotypes. For a univariate phenotype, two-step methods produce higher power for detecting a GxE interaction compared to single-step analysis. Therefore, we propose a two-step approach to test for an overall GxE effect for multiple phenotypes. RESULTS: Using simulations we demonstrate that, when more than one phenotype has a GxE effect (i.e. GxE pleiotropy), our approach offers a substantial gain in power (18-43%) to detect an aggregate-level GxE effect for a multivariate phenotype compared to an analogous two-step method to identify a GxE effect for a univariate phenotype. We applied the proposed approach to simultaneously analyze three lipids (LDL, HDL and triglycerides) with the frequency of alcohol consumption as the environmental factor in the UK Biobank. The method identified two loci with an overall GxE effect on the vector of lipids, one of which was missed by the competing approaches. AVAILABILITY AND IMPLEMENTATION: We provide an R package, MPGE, implementing the proposed approach, which is available from CRAN: https://cran.r-project.org/web/packages/MPGE/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

11.
Nat Commun ; 11(1): 5504, 2020 10 30.
Article in English | MEDLINE | ID: mdl-33127880

ABSTRACT

Single-cell RNA-sequencing (scRNA-Seq) is a compelling approach to directly and simultaneously measure cellular composition and state, which can otherwise only be estimated by applying deconvolution methods to bulk RNA-Seq estimates. However, it has not yet become a widely used tool in population-scale analyses, due to its prohibitively high cost. Here we show that, given the same budget, the statistical power of cell-type-specific expression quantitative trait loci (eQTL) mapping can be increased through low-coverage per-cell sequencing of more samples rather than high-coverage sequencing of fewer samples. We use simulations starting from one of the largest available real single-cell RNA-Seq datasets, from 120 individuals, to also show that multiple experimental designs with different numbers of samples, cells per sample and reads per cell could have similar statistical power, and that choosing an appropriate design can yield large cost savings, especially when multiplexed workflows are considered. Finally, we provide a practical approach for selecting cost-effective designs that maximize cell-type-specific eQTL power, which is available in the form of a web tool.


Subjects
Quantitative Trait Loci/genetics; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Base Sequence; Computational Biology; Gene Expression; Gene Expression Profiling/methods; Genomics; Humans
12.
Nat Genet ; 51(8): 1244-1251, 2019 08.
Article in English | MEDLINE | ID: mdl-31358995

ABSTRACT

SNP-heritability is a fundamental quantity in the study of complex traits. Recent studies have shown that existing methods to estimate genome-wide SNP-heritability can yield biases when their assumptions are violated. While various approaches have been proposed to account for frequency- and linkage disequilibrium (LD)-dependent genetic architectures, it remains unclear which estimates reported in the literature are reliable. Here we show that genome-wide SNP-heritability can be accurately estimated from biobank-scale data irrespective of genetic architecture, without specifying a heritability model or partitioning SNPs by allele frequency and/or LD. We show analytically and through extensive simulations starting from real genotypes (UK Biobank, N = 337 K) that, unlike existing methods, our closed-form estimator is robust across a wide range of architectures. We provide estimates of SNP-heritability for 22 complex traits in the UK Biobank and show that, consistent with our results in simulations, existing biobank-scale methods yield estimates up to 30% different from our theoretically-justified approach.


Subjects
Biological Specimen Banks/statistics & numerical data; Genome, Human; Linkage Disequilibrium; Models, Theoretical; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable; Genome-Wide Association Study; Humans; Phenotype
13.
PLoS Genet ; 14(2): e1007139, 2018 02.
Article in English | MEDLINE | ID: mdl-29432419

ABSTRACT

Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package 'CPBayes' implementing the proposed method.


Subjects
Bayes Theorem; Genetic Association Studies/methods; Genetic Association Studies/statistics & numerical data; Genetic Predisposition to Disease; Phenotype; Case-Control Studies; Cohort Studies; Genetic Predisposition to Disease/epidemiology; Humans; Markov Chains; Monte Carlo Method
14.
PLoS Genet ; 13(3): e1006690, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28362817

ABSTRACT

Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute's "Up for a Challenge" (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10^-6) and DHODH (p-value: 7.1 × 10^-6) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10^-5). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10^-5), as were expression of ACAP1 (p-value: 1.9 × 10^-5) and LRRC25 (p-value: 5.2 × 10^-5). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis. This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer.


Subjects
Breast Neoplasms/genetics; Carrier Proteins/genetics; GTPase-Activating Proteins/genetics; Genetic Predisposition to Disease; Membrane Proteins/genetics; Quantitative Trait Loci/genetics; Breast/metabolism; Breast/pathology; Breast Neoplasms/blood; Breast Neoplasms/pathology; Carrier Proteins/blood; Endonucleases/blood; Endonucleases/genetics; Ethnicity; Female; GTPase-Activating Proteins/blood; Genome-Wide Association Study; Humans; Membrane Proteins/blood; Polymorphism, Single Nucleotide; Risk Factors; Transcriptome/genetics
15.
Genet Epidemiol ; 40(5): 366-81, 2016 07.
Article in English | MEDLINE | ID: mdl-27238845

ABSTRACT

Discovering pleiotropic loci is important to understand the biological basis of seemingly distinct phenotypes. Most methods for assessing pleiotropy only test for the overall association between genetic variants and multiple phenotypes. To determine which specific traits are pleiotropic, we evaluate three different strategies via simulation and application. The first is a set of model selection techniques based on the inverse regression of genotype on phenotypes. The second is a subset-based meta-analysis, ASSET [Bhattacharjee et al., ], which provides an optimal subset of nonnull traits. The third is a modified Benjamini-Hochberg (B-H) procedure for controlling the expected false discovery rate [Benjamini and Hochberg, ] in the framework of a phenome-wide association study. Our simulations show that an inverse regression-based approach, MultiPhen [O'Reilly et al., ], is more powerful than ASSET for detecting overall pleiotropic association, except when all the phenotypes are associated and have genetic effects in the same direction. For determining which specific traits are pleiotropic, the modified B-H procedure performs consistently better than the other two methods. The inverse regression-based selection methods perform competitively with the modified B-H procedure only when the phenotypes are weakly correlated. The efficiency of ASSET is observed to lie below that of the other two methods when the traits are weakly correlated, and between them when the traits are strongly correlated. In our application to a large GWAS, we find that the modified B-H procedure also performs well, indicating that this may be an optimal approach for determining the traits underlying a pleiotropic signal.
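
The abstract does not describe how the B-H procedure was modified, so the sketch below shows only the standard Benjamini-Hochberg step-up rule that it builds on: sort the p-values, find the largest rank k whose p-value is at most alpha times k divided by the number of tests, and reject every hypothesis up to that rank. The function name, the alpha level, and the example p-values are assumptions for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the standard B-H step-up rule."""
    m = len(p_values)
    # Indices of the hypotheses ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up: keep the largest rank that satisfies the bound, even if
        # some smaller ranks failed it.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for one variant tested against four traits.
trait_p_values = [0.001, 0.03, 0.035, 0.5]
print(benjamini_hochberg(trait_p_values))
```

In the example, the second-smallest p-value misses its own threshold but is still rejected because the third-smallest one meets its threshold, which is the step-up behavior that distinguishes B-H from a simple per-rank cutoff.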


Subjects
Genetic Pleiotropy; Adult; Aging/genetics; Cohort Studies; Computer Simulation; Genome-Wide Association Study; Genotype; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait, Heritable; Software
16.
Genet Epidemiol ; 39(8): 635-50, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26493781

ABSTRACT

Binary phenotypes commonly arise due to multiple underlying quantitative precursors, and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al. []), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties of BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with that of the genotype-level test MultiPhen. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one or two binary traits under a recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real datasets, and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE.
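
The hybrid rule in the last sentence needs some way of deciding whether HWE is violated, which the abstract leaves unspecified. The sketch below assumes a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts and a 0.05 significance threshold; both choices, along with the function names, are illustrative assumptions. The function only selects which test to run and does not implement MultiPhen or BAMP.

```python
import math


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P-value of the 1-df chi-square test for Hardy-Weinberg Equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("SNP is monomorphic; HWE cannot be tested")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def choose_test(n_aa, n_ab, n_bb, alpha=0.05):
    """Pick the genotype-level test if HWE is rejected, else the allelic test."""
    return "MultiPhen" if hwe_chi_square_p(n_aa, n_ab, n_bb) < alpha else "BAMP"


print(choose_test(25, 50, 25))  # genotype counts exactly at HWE proportions
print(choose_test(50, 0, 50))   # no heterozygotes at all
```

In case-control data, HWE checks are often restricted to controls; whether the authors do so is not stated in the abstract.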


Subjects
Computer Simulation/statistics & numerical data; Gene Frequency/genetics; Genome-Wide Association Study/methods; Models, Genetic; Regression Analysis; Alleles; Binomial Distribution; Genetic Markers/genetics; Genetic Variation/genetics; Genotype; Humans; Models, Statistical; Phenotype; Polymorphism, Single Nucleotide/genetics
17.
BMC Proc ; 8(Suppl 1 Genetic Analysis Workshop 18Vanessa Olmo): S74, 2014.
Article in English | MEDLINE | ID: mdl-25519402

ABSTRACT

Heritable quantitative characters underlie complex genetic traits. However, a single quantitative phenotype may not be a suitably good surrogate for a clinical end point trait. It may be preferable to use a multivariate phenotype vector correlated with the end point trait to carry out an association analysis. Existing methods, such as variance components and principal components, suffer from inherent limitations, such as lack of robustness or difficulty in biological interpretation of association findings. In an effort to circumvent these limitations, we propose a novel regression approach based on a conditional binomial model to detect association between a single-nucleotide polymorphism and a multivariate phenotype vector. We use our proposed method to analyze data on systolic and diastolic blood pressure levels provided in Genetic Analysis Workshop 18. We find that the bivariate analysis of the two phenotypes yields more promising results, in terms of lower p-values, than univariate analyses.

18.
Biometrics ; 69(1): 164-73, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23432096

ABSTRACT

While the population-based case-control approach is the popular study design for association mapping of complex genetic traits because of ease of data collection and statistical analyses, it suffers from the inherent problem of population stratification. There have been methodological developments for adjusting these studies for population substructure, but efficient estimation of the number of subpopulations (K), which has evolutionary significance, remains a statistical challenge. In this article, we propose a Bayesian semiparametric approach to estimate population substructure under the assumption that K is random. Using extensive simulations, we find that our proposed method is not only computationally much faster than an existing Bayesian approach, Structure, but also estimates the number of subpopulations more accurately and thus yields more power in detecting association in case-control studies.


Subjects
Bayes Theorem; Case-Control Studies; Genetics, Population/methods; Genome-Wide Association Study/methods; Models, Genetic; Algorithms; Cluster Analysis; Computer Simulation; Genetic Variation; Humans; Polymorphism, Single Nucleotide