Results 1 - 20 of 71
1.
J Cereb Blood Flow Metab ; 43(12): 2130-2143, 2023 12.
Article in English | MEDLINE | ID: mdl-37694957

ABSTRACT

Few studies have characterized miRNA expression during the transition from injury to neural repair and secondary neurodegeneration following stroke in humans. We compared expression of 754 miRNAs from plasma samples collected 5, 15, and 30 days post-ischemic stroke from a discovery cohort (n = 55) and 15 days post-ischemic stroke from a validation cohort (n = 48) to healthy control samples (n = 55 and 48, respectively) matched for age, sex, race and cardiovascular comorbidities using qRT-PCR. Eight miRNAs remained significantly altered across all time points in both cohorts, including many described in acute stroke. The number of significantly dysregulated miRNAs more than doubled from post-stroke day 5 (19 miRNAs) to days 15 (50 miRNAs) and 30 (57 miRNAs). Twelve brain-enriched miRNAs were significantly altered at one or more time points (decreased expression, stroke versus controls: miR-107; increased expression: miR-99-5p, miR-127-3p, miR-128-3p, miR-181a-3p, miR-181a-5p, miR-382-5p, miR-433-3p, miR-491-5p, miR-495-3p, miR-874-3p, and miR-941). Many brain-enriched miRNAs were associated with apoptosis over the first month post-stroke, whereas other miRNAs suggested a transition to synapse regulation and neuronal protection by day 30. These findings suggest that a program of decreased cellular proliferation may last at least 30 days post-stroke, and point to specific miRNAs that could contribute to neural repair in humans.


Subject(s)
Ischemic Stroke , MicroRNAs , Stroke , Humans , MicroRNAs/metabolism , Stroke/genetics , Brain/metabolism , Case-Control Studies , Gene Expression Profiling
2.
Genet Epidemiol ; 47(6): 409-431, 2023 09.
Article in English | MEDLINE | ID: mdl-37101379

ABSTRACT

In genetic studies, many phenotypes have multiple naturally ordered discrete values. The phenotypes can be correlated with each other. If multiple correlated ordinal traits are analyzed simultaneously, the power of analysis may increase significantly while the false positives can be controlled well. In this study, we propose bivariate functional ordinal linear regression (BFOLR) models using latent regressions with cumulative logit link or probit link to perform a gene-based analysis for bivariate ordinal traits and sequencing data. In the proposed BFOLR models, genetic variant data are viewed as stochastic functions of physical positions, and the genetic effects are treated as a function of physical positions. The BFOLR models take the correlation of the two ordinal traits into account via latent variables. The BFOLR models are built upon functional data analysis which can be revised to analyze the bivariate ordinal traits and high-dimension genetic data. The methods are flexible and can analyze three types of genetic data: (1) rare variants only, (2) common variants only, and (3) a combination of rare and common variants. Extensive simulation studies show that the likelihood ratio tests of the BFOLR models control type I errors well and have good power performance. The BFOLR models are applied to analyze Age-Related Eye Disease Study data, in which two genes, CFH and ARMS2, are found to strongly associate with eye drusen size, drusen area, age-related macular degeneration (AMD) categories, and AMD severity scale.


Subject(s)
Macular Degeneration , Models, Genetic , Humans , Phenotype , Macular Degeneration/genetics , Computer Simulation , Linear Models
3.
Genet Epidemiol ; 46(8): 615-628, 2022 12.
Article in English | MEDLINE | ID: mdl-35788983

ABSTRACT

To understand phenotypic variations and key factors which affect disease susceptibility of complex traits, it is important to decipher cell-type tissue compositions. To study cellular compositions of bulk tissue samples, one can evaluate cellular abundances and cell-type-specific gene expression patterns from the tissue transcriptome profiles. We develop both fixed and mixed models to reconstruct cellular expression fractions for bulk-profiled samples by using single-cell (sc) RNA-sequencing (RNA-seq) reference data. In benchmark evaluations of estimating cellular expression fractions, the mixed-effect models provide results similar to those of an elegant machine learning algorithm named cell-type identification by estimating relative subsets of RNA transcripts (CIBERSORTx), which is a well-known and reliable procedure to reconstruct cell-type abundances and cell-type-specific gene expression profiles. In real data analysis, the mixed-effect models outperform or perform similarly to CIBERSORTx. The mixed models perform better than the fixed models in both benchmark evaluations and data analysis. In simulation studies, we show that if heterogeneity exists in scRNA-seq data, it is better to use mixed models with heterogeneous mean and variance-covariance. As a byproduct, the mixed models provide fractions of covariance between subject-specific gene expression and cell types to measure their correlations. The proposed mixed models provide a complementary tool to dissect bulk tissues using scRNA-seq data.
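The idea of reconstructing cellular fractions from a bulk profile and single-cell reference profiles can be illustrated with a deliberately small sketch. It assumes only two cell types and a plain least-squares fit of bulk[g] ≈ f·ref_a[g] + (1 − f)·ref_b[g]; it is not the fixed or mixed model of the paper, and the function name and all expression values are hypothetical.

```python
def estimate_fraction(bulk, ref_a, ref_b):
    """Least-squares estimate of the fraction f of cell type A.

    Assumed model (illustrative only): bulk[g] ~ f*ref_a[g] + (1-f)*ref_b[g]
    for every gene g, with f constrained to the interval [0, 1].
    """
    # Closed-form minimiser of sum_g (bulk - ref_b - f*(ref_a - ref_b))^2.
    num = sum((b - rb) * (ra - rb) for b, ra, rb in zip(bulk, ref_a, ref_b))
    den = sum((ra - rb) ** 2 for ra, rb in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference profiles are identical; f is not identifiable")
    # Clamp so the result is a valid proportion.
    return min(1.0, max(0.0, num / den))


# Hypothetical reference profiles for four genes (arbitrary units).
ref_a = [10.0, 2.0, 0.5, 8.0]
ref_b = [1.0, 9.0, 6.0, 8.0]
# Simulated bulk sample that is 30% type A and 70% type B.
bulk = [0.3 * a + 0.7 * b for a, b in zip(ref_a, ref_b)]
fraction_a = estimate_fraction(bulk, ref_a, ref_b)
```

Genes with identical expression in both references (the last gene here) contribute nothing to the estimate, which is one reason marker genes matter in this kind of analysis.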


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Humans , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Models, Genetic , Transcriptome , RNA
4.
Genet Epidemiol ; 46(5-6): 234-255, 2022 07.
Article in English | MEDLINE | ID: mdl-35438198

ABSTRACT

In this paper, we develop functional ordinal logistic regression (FOLR) models to perform gene-based analysis of ordinal traits. In the proposed FOLR models, genetic variant data are viewed as stochastic functions of physical positions and the genetic effects are treated as a function of physical positions. The FOLR models are built upon functional data analysis which can be revised to analyze the ordinal traits and high dimension genetic data. The proposed methods are capable of dealing with dense genotype data which is usually encountered in analyzing the next-generation sequencing data. The methods are flexible and can analyze three types of genetic data: (1) rare variants only, (2) common variants only, and (3) a combination of rare and common variants. Simulation studies show that the likelihood ratio test statistics of the FOLR models control type I errors well and have good power performance. The proposed methods achieve the goals of analyzing ordinal traits directly, reducing high dimensionality of dense genetic variants, being computationally manageable, facilitating model convergence, properly controlling type I errors, and maintaining high power levels. The FOLR models are applied to analyze Age-Related Eye Disease Study data, in which two genes are found to strongly associate with four ordinal traits.
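Treating the genetic effect as a function of physical position is what reduces the dimensionality of dense variant data. If the effect β(t) is expanded as a sum of K basis functions φ_k(t), then the per-subject term Σ_j x_j β(t_j) becomes Σ_k β_k W_k with W_k = Σ_j x_j φ_k(t_j), so only K coefficients are estimated regardless of the number of variants. The sketch below computes the W_k for one subject. The Fourier basis, the function names, and the example genotypes and positions are assumptions made for illustration; the abstract does not state which basis the authors use.

```python
import math


def fourier_basis(t, n_basis):
    """Evaluate the first n_basis Fourier basis functions at t in [0, 1]."""
    values = [1.0]  # constant term
    k = 1
    while len(values) < n_basis:
        values.append(math.sin(2 * math.pi * k * t))
        if len(values) < n_basis:
            values.append(math.cos(2 * math.pi * k * t))
        k += 1
    return values


def functional_covariates(genotypes, positions, n_basis):
    """Collapse per-variant genotype counts into n_basis covariates W_k."""
    lo, hi = min(positions), max(positions)
    span = hi - lo
    # Rescale physical positions to [0, 1] before evaluating the basis.
    scaled = [(p - lo) / span if span else 0.0 for p in positions]
    basis = [fourier_basis(t, n_basis) for t in scaled]
    return [
        sum(g * basis[j][k] for j, g in enumerate(genotypes))
        for k in range(n_basis)
    ]


# Hypothetical minor-allele counts at eight variants and their base-pair positions.
genotypes = [0, 1, 0, 2, 0, 0, 1, 0]
positions = [1000, 1450, 2100, 2600, 3900, 4200, 5100, 6000]
covariates = functional_covariates(genotypes, positions, n_basis=3)
```

With the constant basis function, the first covariate is simply the total minor-allele count, so a burden-style score appears as a special case of this construction.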


Subject(s)
Genetic Testing , Models, Genetic , Computer Simulation , Genetic Variation , Genotype , Humans , Logistic Models , Phenotype
5.
J Am Stat Assoc ; 116(534): 531-545, 2021.
Article in English | MEDLINE | ID: mdl-34321704

ABSTRACT

Genetics plays a role in age-related macular degeneration (AMD), a common cause of blindness in the elderly. There is a need for powerful methods for carrying out region-based association tests between a dichotomous trait like AMD and genetic variants on family data. Here, we apply our new generalized functional linear mixed models (GFLMMs), developed to test for gene-based association, in a set of AMD families. Using common and rare variants, we observe significant association with two known AMD genes: CFH and ARMS2. Using rare variants, we find suggestive signals in four genes: ASAH1, CLEC6A, TMEM63C, and SGSM1. Intriguingly, ASAH1 is down-regulated in AMD aqueous humor, and ASAH1 deficiency leads to retinal inflammation and increased vulnerability to oxidative stress. These findings were made possible by our GFLMMs, which model the effect of a major gene as a fixed mean, the polygenic contributions as a random variation, and the correlation of pedigree members by kinship coefficients. Simulations indicate that the GFLMM likelihood ratio tests (LRTs) accurately control the Type I error rates. The LRTs have similar or higher power than existing retrospective kernel and burden statistics. Our GFLMM-based statistics provide a new tool for conducting family-based genetic studies of complex diseases. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
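Kinship coefficients, which the model uses to capture correlation among pedigree members, can be computed with a standard recursion over the pedigree. The sketch below is a generic textbook version, not code from the paper; the function name, the input layout (parents listed before their children, founders with no recorded parents and assumed unrelated), and the example family are assumptions.

```python
def kinship_matrix(pedigree):
    """Return kinship coefficients as a dict keyed by (id_a, id_b).

    pedigree: list of (person_id, father_id, mother_id) tuples, ordered so
    that every parent appears before its children. Founders use None for
    both parents and are assumed to be unrelated and non-inbred.
    """
    phi = {}
    seen = []

    def lookup(a, b):
        # A missing parent contributes no shared ancestry.
        if a is None or b is None:
            return 0.0
        return phi[(a, b)]

    for person, father, mother in pedigree:
        for parent in (father, mother):
            if parent is not None and parent not in seen:
                raise ValueError(f"parent {parent!r} must precede {person!r}")
        # Kinship with each earlier individual is the average over the parents.
        for other in seen:
            value = 0.5 * (lookup(father, other) + lookup(mother, other))
            phi[(person, other)] = phi[(other, person)] = value
        # Self-kinship is 1/2 * (1 + inbreeding coefficient).
        phi[(person, person)] = 0.5 * (1.0 + lookup(father, mother))
        seen.append(person)
    return phi


# Hypothetical nuclear family: two unrelated parents and two children.
family = [
    ("dad", None, None),
    ("mom", None, None),
    ("kid1", "dad", "mom"),
    ("kid2", "dad", "mom"),
]
phi = kinship_matrix(family)
```

In mixed models of this type, the covariance of the polygenic random effect between two relatives is typically taken to be proportional to twice their kinship coefficient.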

6.
Genet Epidemiol ; 45(5): 455-470, 2021 07.
Article in English | MEDLINE | ID: mdl-33645812

ABSTRACT

Genetic studies of two related survival outcomes of a pleiotropic gene are commonly encountered but statistical models to analyze them are rarely developed. To analyze sequencing data, we propose mixed effect Cox proportional hazard models by functional regressions to perform gene-based joint association analysis of two survival traits motivated by our ongoing real studies. These models extend fixed effect Cox models of univariate survival traits by incorporating variations and correlation of multivariate survival traits into the models. The associations between genetic variants and two survival traits are tested by likelihood ratio test statistics. Extensive simulation studies suggest that type I error rates are well controlled and power performances are stable. The proposed models are applied to analyze bivariate survival traits of left and right eyes in the age-related macular degeneration progression.


Subject(s)
Eye Diseases , Genetic Variation , Eye Diseases/genetics , Genetic Association Studies , Humans , Models, Genetic , Phenotype
7.
Cell Death Differ ; 27(7): 2263-2279, 2020 07.
Article in English | MEDLINE | ID: mdl-32034314

ABSTRACT

The pathogenesis of thymic epithelial tumors (TETs) is poorly understood. Recently we reported the frequent occurrence of a missense mutation in the GTF2I gene in TETs and hypothesized that GTF2I mutation might contribute to thymic tumorigenesis. Expression of mutant TFII-I altered the transcriptome of normal thymic epithelial cells and upregulated several oncogenic genes. Gtf2i L424H knockin cells exhibited cell transformation, aneuploidy, and increased tumor growth and survival under glucose deprivation or DNA damage. Gtf2i mutation also increased the expression of several glycolytic enzymes and cyclooxygenase-2, and caused modifications of lipid metabolism. Elevated cyclooxygenase-2 expression by Gtf2i mutation was required for survival under metabolic stress and cellular transformation of thymic epithelial cells. Our findings identify GTF2I mutation as a new oncogenic driver that is responsible for transformation of thymic epithelial cells.


Subject(s)
Cell Transformation, Neoplastic/genetics , Epithelial Cells/metabolism , Epithelial Cells/pathology , Mutation/genetics , Thymus Gland/pathology , Transcription Factors, TFII/genetics , Animals , Base Sequence , Carcinogenesis/genetics , Carcinogenesis/pathology , Cell Line , Cell Survival , Cell Transformation, Neoplastic/pathology , Cyclooxygenase 2/metabolism , DNA Damage/genetics , Epithelial-Mesenchymal Transition/genetics , Gene Expression Profiling , Gene Knock-In Techniques , Glucose/deficiency , Glycolysis , Humans , Lipids/biosynthesis , Mice , NIH 3T3 Cells , Transcription Factors, TFII/metabolism
9.
Genet Epidemiol ; 43(8): 952-965, 2019 12.
Article in English | MEDLINE | ID: mdl-31502722

ABSTRACT

The importance of integrating survival analysis into genetics and genomics is widely recognized, but only a small number of statisticians have produced relevant work in this direction. For unrelated population data, functional regression (FR) models have been developed to test for association between a quantitative/dichotomous/survival trait and genetic variants in a gene region. In major gene association analysis, these models have higher power than sequence kernel association tests. In this paper, we extend this approach to analyze censored traits for family data or related samples using FR-based mixed effect Cox models (FamCoxME). The FamCoxME model the effect of a major gene as a fixed mean via functional data analysis techniques, the local gene or polygene variations or both as random effects, and the correlation of pedigree members by kinship coefficients or a genetic relationship matrix or both. The association between the censored trait and the major gene is tested by likelihood ratio tests (FamCoxME FR LRT). Simulation results indicate that the LRTs control the type I error rates accurately/conservatively and have good power levels when both local gene or polygene variations are modeled. The proposed methods were applied to analyze a breast cancer data set from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2 (CIMBA). The FamCoxME provide a new tool for gene-based analysis of family-based studies or related samples.


Subject(s)
Genetic Association Studies , Models, Genetic , Survival Analysis , Computer Simulation , Genetic Variation , Humans , Pedigree , Phenotype , Proportional Hazards Models , Regression Analysis
10.
Ophthalmology ; 126(11): 1541-1548, 2019 11.
Article in English | MEDLINE | ID: mdl-31358387

ABSTRACT

PURPOSE: To assess whether genotypes at 2 major loci associated with age-related macular degeneration (AMD), complement factor H (CFH), or age-related maculopathy susceptibility 2 (ARMS2), modify the response to oral nutrients for the treatment of AMD in the Age-Related Eye Disease Study 2 (AREDS2). DESIGN: Post hoc analysis of a randomized trial. PARTICIPANTS: White AREDS2 participants. METHODS: AREDS2 participants (n = 4203) with bilateral large drusen or late AMD in 1 eye were assigned randomly to lutein and zeaxanthin, omega-3 fatty acids, both, or placebo, and most also received the AREDS supplements. A secondary randomization assessed modified AREDS supplements in 4 treatment arms: lower zinc dosage, omission of β-carotene, both, or no modification. To evaluate the progression to late AMD, fundus photographs were obtained at baseline and annual study visits, and history of treatment for late AMD was obtained at study visits and 6-month interim telephone calls. Participants were genotyped for the single-nucleotide polymorphisms rs1061170 in CFH and rs10490924 in ARMS2. Bivariate frailty models using both eyes were conducted, including a gene-supplement interaction term and adjusting for age, gender, level of education, and smoking status. The main treatment effects, as well as the direct comparison between lutein plus zeaxanthin and β-carotene, were assessed for genotype interaction. MAIN OUTCOME MEASURES: The interaction between genotype and the response to AREDS2 supplements regarding progression to late AMD, any geographic atrophy (GA), and neovascular AMD. RESULTS: Complete data were available for 2775 eyes without baseline late AMD (1684 participants). The participants (mean age ± standard deviation, 72.1 ± 7.7 years; 58.5% female) were followed up for a median of 5 years. The ARMS2 risk allele was associated significantly with progression to late AMD and neovascular AMD (P = 2.40 × 10⁻⁵ and P = 0.002, respectively), but not any GA (P = 0.097). The CFH risk allele was not associated with AMD progression. Genotype did not modify significantly the response to any of the AREDS2 supplements. CONCLUSIONS: CFH and ARMS2 risk alleles do not modify the response to the AREDS2 nutrient supplements with respect to the progression to late AMD (GA and neovascular AMD).


Subject(s)
Carotenoids/administration & dosage , Fatty Acids, Omega-3/administration & dosage , Macular Degeneration/drug therapy , Macular Degeneration/genetics , Proteins/genetics , Zinc Compounds/administration & dosage , Aged , Aged, 80 and over , Complement Factor H/genetics , Dietary Supplements , Disease Progression , Double-Blind Method , Drug Combinations , Female , Genetic Association Studies , Genome-Wide Association Study , Genotyping Techniques , Humans , Lutein/administration & dosage , Macular Degeneration/diagnosis , Male , Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Visual Acuity/physiology , Zeaxanthins/administration & dosage , beta Carotene/administration & dosage
11.
Genet Epidemiol ; 43(2): 189-206, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30537345

ABSTRACT

We develop linear mixed models (LMMs) and functional linear mixed models (FLMMs) for gene-based tests of association between a quantitative trait and genetic variants on pedigrees. The effects of a major gene are modeled as a fixed effect, the contributions of polygenes are modeled as a random effect, and the correlations of pedigree members are modeled via inbreeding/kinship coefficients. F-statistics and χ² likelihood ratio test (LRT) statistics based on the LMMs and FLMMs are constructed to test for association. We show empirically that the F-distributed statistics provide a good control of the type I error rate. The F-test statistics of the LMMs have similar or higher power than the FLMMs, kernel-based famSKAT (family-based sequence kernel association test), and burden test famBT (family-based burden test). The F-statistics of the FLMMs perform well when analyzing a combination of rare and common variants. For small samples, the LRT statistics of the FLMMs control the type I error rate well at the nominal levels α = 0.01 and 0.05. For moderate/large samples, the LRT statistics of the FLMMs control the type I error rates well. The LRT statistics of the LMMs can lead to inflated type I error rates. The proposed models are useful in whole genome and whole exome association studies of complex traits.
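A χ² likelihood ratio test of the kind mentioned above compares the maximized log-likelihoods of a null model (no gene effect) and an alternative model (gene effect included). The sketch below shows the generic calculation only: the statistic is 2·(ℓ_alt − ℓ_null), referred to a χ² distribution whose degrees of freedom equal the number of extra fixed-effect parameters. The function names and the log-likelihood values are hypothetical, and whether the χ² reference distribution is adequate for a given model and sample size is exactly what the simulations in the abstract assess.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df a positive integer")
    if x == 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_k x^k / (2k+1)!!
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


def likelihood_ratio_test(loglik_null, loglik_alt, df):
    """Return (statistic, p_value) for two nested models."""
    statistic = 2.0 * (loglik_alt - loglik_null)
    # Numerical noise can make the statistic slightly negative; floor it at 0.
    return statistic, chi2_sf(max(statistic, 0.0), df)


# Hypothetical log-likelihoods; the gene effect adds three basis coefficients.
statistic, p_value = likelihood_ratio_test(-1234.5, -1230.0, df=3)
```

Only integer degrees of freedom are handled here, which is enough for counting added regression coefficients.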


Subject(s)
Genetic Association Studies , High-Throughput Nucleotide Sequencing/methods , Models, Genetic , Quantitative Trait, Heritable , Computer Simulation , Family , Humans , Linear Models , Myopia/genetics
12.
Am J Clin Nutr ; 108(6): 1334-1341, 2018 12 01.
Article in English | MEDLINE | ID: mdl-30339177

ABSTRACT

Background: Genetic polymorphisms can explain some of the population- and individual-based variations in nutritional status biomarkers. Objective: We sought to screen the entire human genome for common genetic polymorphisms that influence folate-status biomarkers in healthy individuals. Design: We carried out candidate gene analyses and genome-wide association scans in 2232 young, healthy Irish subjects to evaluate which common genetic polymorphisms influence red blood cell folate, serum folate, and plasma total homocysteine. Results: The 5,10-methylenetetrahydrofolate reductase (MTHFR) 677C→T (rs1801133) variant was the major genetic modifier of all 3 folate-related biomarkers in this Irish population and reached genome-wide significance for red blood cell folate (P = 1.37 × 10⁻¹⁷), serum folate (P = 2.82 × 10⁻¹¹), and plasma total homocysteine (P = 1.26 × 10⁻¹⁹) concentrations. A second polymorphism in the MTHFR gene (rs3753584, P = 1.09 × 10⁻¹¹) was the only additional MTHFR variant to exhibit any significant independent effect on red blood cell folate. Other MTHFR variants, including the 1298A→C variant (rs1801131), appeared to reach genome-wide significance, but these variants shared linkage disequilibrium with MTHFR 677C→T and were not significant when analyzed in MTHFR 677CC homozygotes. No additional non-MTHFR modifiers of red blood cell or plasma folate were detected. Two additional genome-wide significant modifiers of plasma homocysteine were found in the region of the dipeptidase 1 (DPEP1) gene on chromosome 16 and the Twist neighbor B (TWISTNB) gene on chromosome 7. Conclusions: The MTHFR 677C→T variant is the predominant genetic modifier of folate status biomarkers in this healthy Irish population. It is not necessary to determine MTHFR 677C→T genotype to evaluate folate status because its effect is reflected in concentrations of standard folate biomarkers. The MTHFR 1298A→C variant had no independent effect on folate status biomarkers. To our knowledge, this is the first genome-wide association study report on red blood cell folate and the first report of an association between homocysteine and TWISTNB.


Subject(s)
Biomarkers/blood , Folic Acid/blood , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Nutritional Status/genetics , Polymorphism, Single Nucleotide/genetics , Adolescent , Adult , Erythrocytes/chemistry , Female , Genome-Wide Association Study , Genotype , Homocysteine/blood , Humans , Ireland , Linkage Disequilibrium , Male , Young Adult
13.
Am J Clin Nutr ; 107(3): 345-354, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29566195

ABSTRACT

Background: Formate is an important metabolite that serves as a donor of one-carbon groups to the intracellular tetrahydrofolate pool. However, little is known of its circulating concentrations or of their determinants. Objective: This study aimed to define formate concentrations and their determinants in a healthy young population. Design: Serum formate was measured in 1701 participants from the Trinity Student Study. The participants were men and women, aged 18 to 28 y, enrolled at Trinity College, Dublin. Formate concentrations were compared with other one-carbon metabolites, vitamin status, potential formate precursors, genetic polymorphisms, and lifestyle factors. Results: Serum formate concentrations ranged from 8.7 to 96.5 µM, with a mean of 25.9 µM. Formate concentrations were significantly higher in women than in men; oral contraceptive use did not further affect them. There was no effect of smoking or of alcohol ingestion, but the TT genotype of the methylenetetrahydrofolate reductase (MTHFR) 677C→T (rs1801133) polymorphism was associated with a significantly decreased formate concentration. Formate was positively associated with potential metabolic precursors (serine, methionine, tryptophan, choline) but not with glycine. Formate concentrations were positively related to serum folate and negatively related to serum vitamin B-12. Conclusions: Formate concentrations were sensitive to the concentrations of metabolic precursors. In view of the increased susceptibility of women with the TT genotype of MTHFR to give birth to infants with neural tube defects as well as the effectiveness of formate supplementation in decreasing the incidence of folate-resistant neural tube defects in susceptible mice, it will be important to understand how this genotype decreases the serum formate concentration. This trial was registered at www.clinicaltrials.gov as NCT03305900.


Subject(s)
Formates/blood , Life Style , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Adolescent , Adult , Choline/blood , Cross-Sectional Studies , Female , Genotyping Techniques , Humans , Incidence , Male , Methionine/blood , Polymorphism, Single Nucleotide , Serine/blood , Tryptophan/blood , Young Adult
14.
Hum Mol Genet ; 27(5): 929-940, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29346644

ABSTRACT

Family- and population-based genetic studies have successfully identified multiple disease-susceptibility loci for age-related macular degeneration (AMD), one of the earliest and most successful applications of the genome-wide association study. However, most genetic studies to date have focused on case-control studies of late AMD (choroidal neovascularization or geographic atrophy). The genetic influences on disease progression are largely unexplored. We assembled unique resources to perform a genome-wide bivariate time-to-event analysis to test for association of time-to-late-AMD with ∼9 million variants on 2721 Caucasians from a large multi-center randomized clinical trial, the Age-Related Eye Disease Study. To our knowledge, this is the first genome-wide association study of disease progression (bivariate survival outcome) in AMD genetic studies, thus providing novel insights into AMD genetics. We used a robust Cox proportional hazards model to appropriately account for between-eye correlation when analyzing the progression time in the two eyes of each participant. We identified four previously reported susceptibility loci showing genome-wide significant association with AMD progression: ARMS2-HTRA1 (P = 8.1 × 10⁻⁴³), CFH (P = 3.5 × 10⁻³⁷), C2-CFB-SKIV2L (P = 8.1 × 10⁻¹⁰) and C3 (P = 1.2 × 10⁻⁹). Furthermore, we detected association of rs58978565 near TNR (P = 2.3 × 10⁻⁸), rs28368872 near ATF7IP2 (P = 2.9 × 10⁻⁸) and rs142450006 near MMP9 (P = 0.0006) with progression to choroidal neovascularization but not geographic atrophy. Secondary analysis limited to 34 reported risk variants revealed that LIPC and CTRB2-CTRB1 were also associated with AMD progression (P < 0.0015). Our genome-wide analysis thus expands the genetics in both development and progression of AMD and should assist in early identification of high risk individuals.
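The threshold quoted for the secondary analysis (P < 0.0015 across 34 variants) is numerically what a Bonferroni correction at a family-wise level of 0.05 would give. The abstract does not say which correction was used, so the sketch below is an inference shown as arithmetic only; the function name and the 0.05 level are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed family-wise level of 0.05 over the 34 previously reported variants.
secondary_threshold = bonferroni_threshold(0.05, 34)
# Rounded to four decimals for comparison with the threshold in the abstract.
rounded_threshold = round(secondary_threshold, 4)
```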


Subject(s)
Genome-Wide Association Study/methods , Macular Degeneration/genetics , Aged , Aged, 80 and over , Carrier Proteins/genetics , Disease Progression , Ether-A-Go-Go Potassium Channels/genetics , Female , Humans , Macular Degeneration/etiology , Male , Membrane Glycoproteins/genetics , Middle Aged , Polymorphism, Single Nucleotide , Proportional Hazards Models
15.
Eur J Med Genet ; 61(3): 145-151, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29174092

ABSTRACT

Prune belly syndrome (PBS), also known as Eagle-Barrett syndrome, is a rare congenital disorder characterized by absence or hypoplasia of the abdominal wall musculature, urinary tract anomalies, and cryptorchidism in males. The etiology of PBS is largely unresolved, but genetic factors are implicated given its recurrence in families. We examined cases of PBS to identify novel pathogenic copy number variants (CNVs). A total of 34 cases (30 males and 4 females) with PBS identified from all live births in New York State (1998-2005) were genotyped using Illumina HumanOmni2.5 microarrays. CNVs were prioritized if they were absent from in-house controls, encompassed ≥10 consecutive probes, were ≥20 Kb in size, had ≤20% overlap with common variants in population reference controls, and had ≤20% overlap with any variant previously detected in other birth defect phenotypes screened in our laboratory. We identified 17 candidate autosomal CNVs; 10 cases each had one CNV and four cases each had two CNVs. The CNVs included a 158 Kb duplication at 4q22 that overlaps the BMPR1B gene; duplications of different sizes carried by two cases in the intron of STIM1 gene; a 67 Kb duplication 202 Kb downstream of the NOG gene, and a 1.34 Mb deletion including the MYOCD gene. The identified rare CNVs spanned genes involved in mesodermal, muscle, and urinary tract development and differentiation, which might help in elucidating the genetic contribution to PBS. We did not have parental DNA and cannot identify whether these CNVs were de novo or inherited. Further research on these CNVs, particularly BMP signaling is warranted to elucidate the pathogenesis of PBS.
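The CNV prioritization criteria listed in the abstract amount to a conjunction of five filters, which can be written out as follows. This is a sketch of the stated thresholds only; the record layout, field names, and example calls are hypothetical, overlaps are assumed to be stored as proportions between 0 and 1, and 1 Kb is taken as 1,000 base pairs.

```python
from dataclasses import dataclass


@dataclass
class CNVCall:
    """One CNV call; all field names are hypothetical."""
    case_id: str
    n_probes: int                   # consecutive probes supporting the call
    size_bp: int                    # length in base pairs
    seen_in_house_controls: bool    # present in any in-house control
    common_variant_overlap: float   # overlap with common variants in population reference controls
    other_phenotype_overlap: float  # overlap with variants seen in other birth defect phenotypes


def is_candidate(call):
    """Apply the five prioritization criteria stated in the abstract."""
    return (
        not call.seen_in_house_controls
        and call.n_probes >= 10
        and call.size_bp >= 20_000
        and call.common_variant_overlap <= 0.20
        and call.other_phenotype_overlap <= 0.20
    )


# Made-up calls chosen to exercise each criterion.
calls = [
    CNVCall("case01", 25, 158_000, False, 0.05, 0.00),  # passes every filter
    CNVCall("case02", 8, 45_000, False, 0.00, 0.00),    # too few probes
    CNVCall("case03", 40, 15_000, False, 0.00, 0.00),   # shorter than 20 Kb
    CNVCall("case04", 30, 67_000, True, 0.00, 0.00),    # seen in in-house controls
    CNVCall("case05", 30, 90_000, False, 0.35, 0.00),   # overlaps common variants
]
candidates = [c.case_id for c in calls if is_candidate(c)]
```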


Subject(s)
DNA Copy Number Variations , Prune Belly Syndrome/genetics , Sequence Analysis, DNA/methods , Adult , Female , Genotype , Humans , Infant, Newborn , Male , Phenotype , Young Adult
16.
PLoS Comput Biol ; 13(10): e1005788, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29040274

ABSTRACT

Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information to achieve deep understanding of the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple phenotype association analysis paradigm lacks breadth (number of phenotypes and genetic variants jointly analyzed at the same time) and depth (hierarchical structure of phenotype and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representation and features from high dimensional genotype and phenotype data. To explore correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method referred to as quadratically regularized functional CCA (QRFCCA) for association analysis, which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis and (3) canonical correlation analysis (CCA). Large-scale simulations show that the QRFCCA has much higher power than the ten competing statistics while retaining appropriate type I error rates. To further evaluate performance, the QRFCCA and ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that the QRFCCA substantially outperforms the ten other statistics.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genetic Pleiotropy/genetics , Models, Statistical , Sequence Analysis, DNA , Algorithms , Computer Simulation , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis
17.
J Hum Genet ; 62(10): 877-884, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28539665

ABSTRACT

Split hand/foot malformation (SHFM) is a congenital limb deficiency with missing or shortened central digits. Some SHFM genes have been identified but the cause of many SHFM cases is unknown. We used single-nucleotide polymorphism (SNP) microarray analysis to detect copy-number variants (CNVs) in 25 SHFM cases without other birth defects from New York State (NYS), prioritized CNVs absent from population CNV databases, and validated these CNVs using quantitative real-time polymerase chain reaction (qPCR). We tested for the validated CNVs in seven cases from Iowa using qPCR, and also sequenced 36 SHFM candidate genes in all the subjects. Seven NYS cases had a potentially deleterious variant: two had a p.R225H or p.R225L mutation in TP63, one had a 17q25 microdeletion, one had a 10q24 microduplication and three had a 17p13.3 microduplication. In addition, one Iowa case had a de novo 10q24 microduplication. The 17q25 microdeletion has not been reported previously in SHFM and included two SHFM candidate genes (SUMO2 and GRB2), while the 10q24 and 17p13.3 CNVs had breakpoints within genomic regions that contained putative regulatory elements and a limb development gene. In SHFM pathogenesis, the microdeletion may cause haploinsufficiency of SHFM genes and/or deletion of their regulatory regions, and the microduplications could disrupt regulatory elements that control transcription of limb development genes.


Subject(s)
DNA Copy Number Variations , Genetic Association Studies , Limb Deformities, Congenital/genetics , Mutation , Alleles , Chromosome Aberrations , Female , Humans , Limb Deformities, Congenital/diagnosis , Male , Phenotype , Polymorphism, Single Nucleotide , Real-Time Polymerase Chain Reaction , Regulatory Sequences, Nucleic Acid , Reproducibility of Results , Sequence Analysis, DNA
18.
Eur J Hum Genet ; 25(3): 350-359, 2017 02.
Article in English | MEDLINE | ID: mdl-28000696

ABSTRACT

To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen for at least two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.


Subject(s)
Genetic Pleiotropy , High-Throughput Nucleotide Sequencing/methods , Models, Genetic , Sequence Analysis, DNA/methods , False Positive Reactions , Humans , Linear Models , Multivariate Analysis , Quantitative Trait, Heritable
19.
Genet Epidemiol ; 41(1): 18-34, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27917525

ABSTRACT

In this paper, extensive simulations are performed to compare two statistical methods to analyze multiple correlated quantitative phenotypes: (1) approximate F-distributed tests of multivariate functional linear models (MFLM) and additive models of multivariate analysis of variance (MANOVA), and (2) Gene Association with Multiple Traits (GAMuT) for association testing of high-dimensional genotype data. It is shown that approximate F-distributed tests of MFLM and MANOVA have higher power and are more appropriate for major gene association analysis (i.e., scenarios in which some genetic variants have relatively large effects on the phenotypes); GAMuT has higher power and is more appropriate for analyzing polygenic effects (i.e., effects from a large number of genetic variants each of which contributes a small amount to the phenotypes). MFLM and MANOVA are very flexible and can be used to perform association analysis for (i) rare variants, (ii) common variants, and (iii) a combination of rare and common variants. Although GAMuT was designed to analyze rare variants, it can be applied to analyze a combination of rare and common variants and it performs well when (1) the number of genetic variants is large and (2) each variant contributes a small amount to the phenotypes (i.e., polygenes). MFLM and MANOVA are fixed effect models that perform well for major gene association analysis. GAMuT can be viewed as an extension of sequence kernel association tests (SKAT). Both GAMuT and SKAT are more appropriate for analyzing polygenic effects and they perform well not only in the rare variant case, but also in the case of a combination of rare and common variants. Data analyses of European cohorts and the Trinity Students Study are presented to compare the performance of the two methods.


Subject(s)
Genetic Association Studies , Genetic Markers/genetics , Genetic Variation/genetics , High-Throughput Nucleotide Sequencing/methods , Lipids/genetics , Models, Genetic , Multifactorial Inheritance/genetics , Analysis of Variance , Genome, Human , Genotype , Humans , Lipids/analysis , Phenotype , Quantitative Trait Loci
20.
Birth Defects Res ; 109(1): 8-15, 2017 01 20.
Article in English | MEDLINE | ID: mdl-28009100

ABSTRACT

BACKGROUND: Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. METHODS: We genotyped 32 HRHS cases identified from all New York State live births (1998-2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20 Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3, and the Children's Hospital of Philadelphia database. RESULTS: We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16-2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1-/- mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5 Mb deletion associated with Williams-Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24 Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. CONCLUSION: To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. Birth Defects Research 109:16-26, 2017. © 2016 Wiley Periodicals, Inc.
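
The prioritization criteria in METHODS can be sketched as a simple filter. This is a hedged illustration, not the authors' pipeline: the `CNVCall` record, the `prioritize` function, and the `max_overlap` threshold are hypothetical, since the abstract says only "minimal overlap" without giving a cutoff, and 20 Kb is taken here as 20,000 bp.

```python
from dataclasses import dataclass

@dataclass
class CNVCall:
    """Hypothetical record for one called CNV (not the PennCNV output format)."""
    chrom: str
    start: int              # start position in base pairs
    end: int                # end position in base pairs
    n_snps: int             # number of SNP probes supporting the call
    control_overlap: float  # fraction of the CNV overlapped by control/database CNVs

def prioritize(calls, min_length=20_000, min_snps=10, max_overlap=0.5):
    """Keep calls that are >= 20 Kb, contain >= 10 SNPs, and overlap known CNVs little.

    max_overlap is an assumed placeholder; the abstract does not state the cutoff.
    """
    return [
        c for c in calls
        if (c.end - c.start) >= min_length
        and c.n_snps >= min_snps
        and c.control_overlap <= max_overlap
    ]

# Toy calls (invented values, not data from the study).
toy_calls = [
    CNVCall("chr7", 1_000_000, 1_024_000, 15, 0.0),  # passes all three criteria
    CNVCall("chr2", 5_000_000, 5_010_000, 30, 0.0),  # too short (10 Kb)
    CNVCall("chr1", 2_000_000, 2_100_000, 5, 0.0),   # too few SNPs
    CNVCall("chr9", 3_000_000, 3_050_000, 20, 0.9),  # mostly overlaps known CNVs
]
kept = prioritize(toy_calls)
print([c.chrom for c in kept])
```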


Subject(s)
Heart Defects, Congenital/etiology , Heart Defects, Congenital/genetics , Heart Ventricles/abnormalities , Child , Child, Preschool , Comparative Genomic Hybridization/methods , DNA Copy Number Variations/genetics , Databases, Nucleic Acid , Female , Genotype , Heart Defects, Congenital/metabolism , Heart Ventricles/metabolism , Humans , Infant , Integrin beta Chains/genetics , Male , New York , Oligonucleotide Array Sequence Analysis/methods , Phenotype , Philadelphia , Polymorphism, Single Nucleotide/genetics , Sequence Deletion/genetics , Williams Syndrome/genetics