1.
Am J Respir Crit Care Med ; 207(10): 1324-1333, 2023 05 15.
Article in English | MEDLINE | ID: mdl-36921087

ABSTRACT

Rationale: Lung disease is the major cause of morbidity and mortality in persons with cystic fibrosis (pwCF). Variability in CF lung disease has substantial non-CFTR (CF transmembrane conductance regulator) genetic influence. Identification of genetic modifiers has prognostic and therapeutic importance. Objectives: Identify genetic modifier loci and genes/pathways associated with pulmonary disease severity. Methods: Whole-genome sequencing data on 4,248 unique pwCF with pancreatic insufficiency and lung function measures were combined with imputed genotypes from an additional 3,592 patients with pancreatic insufficiency from the United States, Canada, and France. This report describes association of approximately 15.9 million SNPs using the quantitative Kulich normal residual mortality-adjusted (KNoRMA) lung disease phenotype in 7,840 pwCF using premodulator lung function data. Measurements and Main Results: Testing included common and rare SNPs, transcriptome-wide association, gene-level, and pathway analyses. Pathway analyses identified novel associations with genes that have key roles in organ development, and we hypothesize that these genes may relate to dysanapsis and/or variability in lung repair. Results confirmed and extended previous genome-wide association study findings. These whole-genome sequencing data provide finely mapped genetic information to support mechanistic studies. No novel primary associations with common single variants or rare variants were found. Multilocus effects at chr5p13 (SLC9A3/CEP72) and chr11p13 (EHF/APIP) were identified. Variant effect size estimates at associated loci were consistently ordered across the cohorts, indicating possible age or birth cohort effects. Conclusions: This premodulator genomic, transcriptomic, and pathway association study of 7,840 pwCF will facilitate mechanistic and postmodulator genetic studies and the development of novel therapeutics for CF lung disease.


Subject(s)
Cystic Fibrosis , Humans , Cystic Fibrosis/genetics , Genome-Wide Association Study/methods , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Patient Acuity , Lung , Microtubule-Associated Proteins/genetics
2.
HGG Adv ; 3(2): 100090, 2022 Apr 14.
Article in English | MEDLINE | ID: mdl-35128485

ABSTRACT

Cystic fibrosis (CF) is a severe genetic disorder that can cause multiple comorbidities affecting the lungs, the pancreas, the luminal digestive system and beyond. In our previous genome-wide association studies (GWAS), we genotyped approximately 8,000 CF samples using a mixture of different genotyping platforms. More recently, the Cystic Fibrosis Genome Project (CFGP) performed deep (approximately 30×) whole genome sequencing (WGS) of 5,095 samples to better understand the genetic mechanisms underlying clinical heterogeneity among patients with CF. For mixtures of GWAS array and WGS data, genotype imputation has proven effective in increasing effective sample size. Therefore, we first performed imputation for the approximately 8,000 CF samples with GWAS array genotype using the Trans-Omics for Precision Medicine (TOPMed) freeze 8 reference panel. Our results demonstrate that TOPMed can provide high-quality imputation for patients with CF, boosting genomic coverage from approximately 0.3-4.2 million genotyped markers to approximately 11-43 million well-imputed markers, and significantly improving polygenic risk score (PRS) prediction accuracy. Furthermore, we built a CF-specific CFGP reference panel based on WGS data of patients with CF. We demonstrate that despite having approximately 3% the sample size of TOPMed, our CFGP reference panel can still outperform TOPMed when imputing some CF disease-causing variants, likely owing to allele and haplotype differences between patients with CF and general populations. We anticipate our imputed data for 4,656 samples without WGS data will benefit our subsequent genetic association studies, and the CFGP reference panel built from CF WGS samples will benefit other investigators studying CF.
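
As a rough illustration of how imputed dosages feed into a polygenic risk score, the sketch below computes a weighted sum of allele dosages. The function name, variant IDs, dosages, and effect weights are all hypothetical and do not come from the study; the actual PRS construction used by the authors is not described in the abstract.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of (effect weight x allele dosage) over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical data: imputed dosages lie in [0, 2] and need not be integers.
example_dosages = {"var1": 2.0, "var2": 0.93, "var3": 1.0}
example_weights = {"var1": 0.10, "var2": -0.25, "var4": 0.40}

# Only var1 and var2 appear in both dictionaries, so only they contribute.
score = polygenic_risk_score(example_dosages, example_weights)
```

Variants missing from either input are skipped here, which is one reason denser imputation can change a score: more weighted variants become available.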

3.
Genes Dev ; 33(19-20): 1381-1396, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31488579

ABSTRACT

Short telomere syndromes manifest as familial idiopathic pulmonary fibrosis; they are the most common premature aging disorders. We used genome-wide linkage to identify heterozygous loss of function of ZCCHC8, a zinc-knuckle containing protein, as a cause of autosomal dominant pulmonary fibrosis. ZCCHC8 associated with TR and was required for telomerase function. In ZCCHC8 knockout cells and in mutation carriers, genomically extended telomerase RNA (TR) accumulated at the expense of mature TR, consistent with a role for ZCCHC8 in mediating TR 3' end targeting to the nuclear RNA exosome. We generated Zcchc8-null mice and found that heterozygotes, similar to human mutation carriers, had TR insufficiency but an otherwise preserved transcriptome. In contrast, Zcchc8-/- mice developed progressive and fatal neurodevelopmental pathology with features of a ciliopathy. The Zcchc8-/- brain transcriptome was highly dysregulated, showing accumulation and 3' end misprocessing of other low-abundance RNAs, including those encoding cilia components as well as the intronless replication-dependent histones. Our data identify a novel cause of human short telomere syndromes-familial pulmonary fibrosis and uncover nuclear exosome targeting as an essential 3' end maturation mechanism that vertebrate TR shares with replication-dependent histones.


Subject(s)
Carrier Proteins/genetics , Idiopathic Pulmonary Fibrosis/genetics , Loss of Function Mutation , Nuclear Proteins/genetics , RNA/metabolism , Telomerase/metabolism , Animals , Brain/enzymology , Brain/physiopathology , Cell Line , Cilia/genetics , Female , Genetic Linkage , HCT116 Cells , Humans , Idiopathic Pulmonary Fibrosis/enzymology , Idiopathic Pulmonary Fibrosis/physiopathology , Male , Mice , Mice, Knockout , Neurodevelopmental Disorders/genetics , Pedigree , RNA Processing, Post-Transcriptional/genetics , Telomere Shortening/genetics
4.
BMC Proc ; 10(Suppl 7): 147-152, 2016.
Article in English | MEDLINE | ID: mdl-27980627

ABSTRACT

Current findings from genetic studies of complex human traits often do not explain a large proportion of the estimated variation of these traits due to genetic factors. This could be, in part, due to overly stringent significance thresholds in traditional statistical methods, such as linear and logistic regression. Machine learning methods, such as Random Forests (RF), are an alternative approach to identify potentially interesting variants. One major issue with these methods is that there is no clear way to distinguish between probable true hits and noise variables based on the importance metric calculated. To this end, we are developing a method called the Relative Recurrency Variable Importance Metric (r2VIM), a RF-based variable selection method. Here, we apply r2VIM to the unrelated Genetic Analysis Workshop 19 data with simulated systolic blood pressure as the phenotype. We compare the number of "true" functional variants identified by r2VIM with those identified by linear regression analyses that use a Bonferroni correction to calculate a significance threshold. Our results show that r2VIM performed comparably to linear regression. Our findings are proof-of-concept for r2VIM, as it identifies a similar number of functional and nonfunctional variants as a more commonly used technique when the optimal importance score threshold is used.
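
The Bonferroni comparison mentioned above can be sketched in a few lines. This is only the textbook correction (significance level divided by the number of tests), not the r2VIM method itself; the variant names and p-values are hypothetical.

```python
from typing import Dict, List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_variants(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Names of variants whose p-value is at or below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p <= threshold]


# Hypothetical p-values for four variants; the corrected threshold is 0.05 / 4.
example = {"v1": 0.001, "v2": 0.02, "v3": 0.0125, "v4": 0.6}
hits = significant_variants(example)
```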

5.
BMC Genet ; 17 Suppl 2: 8, 2016 Feb 03.
Article in English | MEDLINE | ID: mdl-26866982

ABSTRACT

High-density genetic marker data, especially sequence data, imply an immense multiple testing burden. This can be ameliorated by filtering genetic variants, exploiting or accounting for correlations between variants, jointly testing variants, and by incorporating informative priors. Priors can be based on biological knowledge or predicted variant function, or even be used to integrate gene expression or other omics data. Based on Genetic Analysis Workshop (GAW) 19 data, this article discusses diversity and usefulness of functional variant scores provided, for example, by PolyPhen2, SIFT, or RegulomeDB annotations. Incorporating functional scores into variant filters or weights and adjusting the significance level for correlations between variants yielded significant associations with blood pressure traits in a large family study of Mexican Americans (GAW19 data set). Marker rs218966 in gene PHF14 and rs9836027 in MAP4 significantly associated with hypertension; additionally, rare variants in SNUPN significantly associated with systolic blood pressure. Variant weights strongly influenced the power of kernel methods and burden tests. Apart from variant weights in test statistics, prior weights may also be used when combining test statistics or to informatively weight p values while controlling false discovery rate (FDR). Indeed, power improved when gene expression data for FDR-controlled informative weighting of association test p values of genes was used. Finally, approaches exploiting variant correlations included identity-by-descent mapping and the optimal strategy for joint testing rare and common variants, which was observed to depend on linkage disequilibrium structure.
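
One common way to weight p values informatively while controlling FDR is a weighted Benjamini-Hochberg procedure: divide each p value by a weight (weights normalized to average 1) and then apply the usual step-up rule. The sketch below assumes that approach; the abstract does not say which weighting scheme the contributing groups actually used, and all inputs shown are hypothetical.

```python
from typing import List, Sequence


def weighted_bh(p_values: Sequence[float], weights: Sequence[float], q: float = 0.05) -> List[int]:
    """Indices rejected by a weighted Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    if m == 0:
        return []
    if len(weights) != m:
        raise ValueError("p_values and weights must have the same length")
    mean_w = sum(weights) / m
    # Normalize weights so they average 1, then rescale each p value.
    norm = [w / mean_w for w in weights]
    adjusted = [min(1.0, p / w) if w > 0 else 1.0 for p, w in zip(p_values, norm)]
    order = sorted(range(m), key=lambda i: adjusted[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose value passes q * rank / m.
        if adjusted[i] <= q * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical example: uniform weights reduce to ordinary Benjamini-Hochberg.
rejected_uniform = weighted_bh([0.001, 0.02, 0.03, 0.5], [1, 1, 1, 1])
```

With uniform weights the procedure is plain Benjamini-Hochberg; prior information (for example, from gene expression) would enter only through the weights.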


Subject(s)
Genetic Variation , Mexican Americans/genetics , Microtubule-Associated Proteins/genetics , RNA Cap-Binding Proteins/genetics , Receptors, Cytoplasmic and Nuclear/genetics , Transcription Factors/genetics , Blood Pressure/genetics , Genetic Markers/genetics , Humans , Hypertension/genetics , Software
6.
Front Public Health ; 2: 112, 2014.
Article in English | MEDLINE | ID: mdl-25147783

ABSTRACT

BACKGROUND: B vitamins play an important role in homocysteine metabolism, with vitamin deficiencies resulting in increased levels of homocysteine and increased risk for stroke. We performed a genome-wide association study (GWAS) in 2,100 stroke patients from the Vitamin Intervention for Stroke Prevention (VISP) trial, a clinical trial designed to determine whether the daily intake of high-dose folic acid, vitamin B6, and vitamin B12 reduces recurrent cerebral infarction. METHODS: Extensive quality control (QC) measures resulted in a total of 737,081 SNPs for analysis. Genome-wide association analyses for baseline quantitative measures of folate, vitamin B12, and vitamin B6 were completed using linear regression approaches, implemented in PLINK. RESULTS: Six associations met or exceeded genome-wide significance (P ≤ 5 × 10^-8). For baseline vitamin B12, the strongest association was observed with a non-synonymous SNP (nsSNP) located in the CUBN gene (P = 1.76 × 10^-13). Two additional CUBN intronic SNPs demonstrated strong associations with B12 (P = 2.92 × 10^-10 and 4.11 × 10^-10), while a second nsSNP, located in the TCN1 gene, also reached genome-wide significance (P = 5.14 × 10^-11). For baseline measures of vitamin B6, we identified genome-wide significant associations for SNPs at the ALPL locus (rs1697421; P = 7.06 × 10^-10 and rs1780316; P = 2.25 × 10^-8). In addition to the six genome-wide significant associations, nine SNPs (two for vitamin B6, six for vitamin B12, and one for folate measures) provided suggestive evidence for association (P ≤ 10^-7). CONCLUSION: Our GWAS identified six genome-wide significant associations and nine suggestive associations, and successfully replicated 5 of 16 SNPs previously reported to be associated with measures of B vitamins. The six genome-wide significant associations are located in gene regions that have shown previous associations with measures of B vitamins; however, four of the nine suggestive associations represent novel findings and warrant further investigation in additional populations.
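
The two thresholds used in this abstract (genome-wide significance at P ≤ 5 × 10^-8 and suggestive evidence at P ≤ 10^-7) can be applied mechanically, as in the sketch below. The p-values are the six reported above; the labels for the two CUBN intronic SNPs are placeholders, since the abstract does not name them.

```python
GENOME_WIDE = 5e-8   # genome-wide significance threshold stated in the abstract
SUGGESTIVE = 1e-7    # suggestive threshold stated in the abstract


def classify(p: float) -> str:
    """Label a p-value using the two thresholds above."""
    if p <= GENOME_WIDE:
        return "genome-wide"
    if p <= SUGGESTIVE:
        return "suggestive"
    return "not significant"


# Reported p-values; "CUBN intronic 1/2" are placeholder labels.
reported = {
    "CUBN nsSNP": 1.76e-13,
    "CUBN intronic 1": 2.92e-10,
    "CUBN intronic 2": 4.11e-10,
    "TCN1 nsSNP": 5.14e-11,
    "rs1697421": 7.06e-10,
    "rs1780316": 2.25e-8,
}
labels = {name: classify(p) for name, p in reported.items()}
n_genome_wide = sum(1 for label in labels.values() if label == "genome-wide")
```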

7.
G3 (Bethesda) ; 3(10): 1795-807, 2013 Oct 03.
Article in English | MEDLINE | ID: mdl-23979933

ABSTRACT

Microarray single-nucleotide polymorphism genotyping, combined with imputation of untyped variants, has been widely adopted as an efficient means to interrogate variation across the human genome. "Genomic coverage" is the total proportion of genomic variation captured by an array, either by direct observation or through an indirect means such as linkage disequilibrium or imputation. We have performed imputation-based genomic coverage assessments of eight current genotyping arrays that assay from ~0.3 to ~5 million variants. Coverage was determined separately in each of the four continental ancestry groups in the 1000 Genomes Project phase 1 release. We used the subset of 1000 Genomes variants present on each array to impute the remaining variants and assessed coverage based on correlation between imputed and observed allelic dosages. More than 75% of common variants (minor allele frequency > 0.05) are covered by all arrays in all groups except for African ancestry, and up to ~90% in all ancestries for the highest density arrays. In contrast, less than 40% of less common variants (0.01 < minor allele frequency < 0.05) are covered by low density arrays in all ancestries and 50-80% in high density arrays, depending on ancestry. We also calculated genome-wide power to detect variant-trait association in a case-control design, across varying sample sizes, effect sizes, and minor allele frequency ranges, and compare these array-based power estimates with a hypothetical array that would type all variants in 1000 Genomes. These imputation-based genomic coverage and power analyses are intended as a practical guide to researchers planning genetic studies.
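
The coverage idea described above, judging each variant by the correlation between imputed and observed allelic dosages, can be sketched as follows. The squared Pearson correlation is a standard choice, but the 0.8 cutoff and the toy dosage vectors are assumptions for illustration; the abstract does not state the threshold used.

```python
from typing import List, Sequence, Tuple


def dosage_r2(imputed: Sequence[float], observed: Sequence[float]) -> float:
    """Squared Pearson correlation between imputed and observed allelic dosages."""
    n = len(imputed)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_i = sum(imputed) / n
    mean_o = sum(observed) / n
    cov = sum((a - mean_i) * (b - mean_o) for a, b in zip(imputed, observed))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_o = sum((b - mean_o) ** 2 for b in observed)
    if var_i == 0 or var_o == 0:
        return 0.0  # a constant vector carries no information about the other
    return cov * cov / (var_i * var_o)


def coverage(variants: List[Tuple[Sequence[float], Sequence[float]]], r2_min: float = 0.8) -> float:
    """Fraction of variants whose dosage r2 reaches the (assumed) cutoff."""
    covered = sum(1 for imp, obs in variants if dosage_r2(imp, obs) >= r2_min)
    return covered / len(variants)


# Toy data: one well-imputed variant and one imputed as a constant.
toy = [
    ([0.1, 1.0, 1.9, 0.0], [0, 1, 2, 0]),
    ([1.0, 1.0, 1.0, 1.0], [0, 1, 2, 0]),
]
toy_coverage = coverage(toy)
```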


Subject(s)
Genome, Human , Genotyping Techniques/methods , Oligonucleotide Array Sequence Analysis , Gene Frequency , Genome-Wide Association Study , Humans , Sensitivity and Specificity
8.
Nat Genet ; 45(2): 197-201, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23263489

ABSTRACT

Insulin secretion has a crucial role in glucose homeostasis, and failure to secrete sufficient insulin is a hallmark of type 2 diabetes. Genome-wide association studies (GWAS) have identified loci contributing to insulin processing and secretion; however, a substantial fraction of the genetic contribution remains undefined. To examine low-frequency (minor allele frequency (MAF) 0.5-5%) and rare (MAF < 0.5%) nonsynonymous variants, we analyzed exome array data in 8,229 nondiabetic Finnish males using the Illumina HumanExome Beadchip. We identified low-frequency coding variants associated with fasting proinsulin concentrations at the SGSM2 and MADD GWAS loci and three new genes with low-frequency variants associated with fasting proinsulin or insulinogenic index: TBC1D30, KANK1 and PAM. We also show that the interpretation of single-variant and gene-based tests needs to consider the effects of noncoding SNPs both nearby and megabases away. This study demonstrates that exome array genotyping is a valuable approach to identify low-frequency variants that contribute to complex traits.
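
The frequency classes defined above (rare, MAF < 0.5%; low-frequency, MAF 0.5-5%) translate directly into a small helper. Treating exactly 5% as low-frequency and everything above it as common is an assumption, as are the genotype counts in the example.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF from genotype counts (AA, AB, BB) at a biallelic variant."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def maf_class(maf: float) -> str:
    """Bin a MAF using the cutoffs given in the abstract."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie in [0, 0.5]")
    if maf < 0.005:
        return "rare"
    if maf <= 0.05:  # boundary handling at exactly 5% is an assumption
        return "low-frequency"
    return "common"


# Hypothetical counts: 980 AA, 20 AB, 0 BB gives a B-allele frequency of 1%.
example_maf = minor_allele_frequency(980, 20, 0)
example_class = maf_class(example_maf)
```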


Subject(s)
Exome/genetics , Genetic Variation , Insulin/genetics , Insulin/metabolism , Adaptor Proteins, Signal Transducing , Amidine-Lyases/genetics , Cytoskeletal Proteins , Death Domain Receptor Signaling Adaptor Proteins/genetics , Fasting/blood , Finland , Gene Frequency , Genetics, Population , Genotype , Guanine Nucleotide Exchange Factors/genetics , Humans , Insulin Secretion , Intracellular Signaling Peptides and Proteins/genetics , Male , Mixed Function Oxygenases/genetics , Molecular Sequence Annotation , Proinsulin/blood , Tumor Suppressor Proteins/genetics
9.
BMC Oral Health ; 12: 57, 2012 Dec 21.
Article in English | MEDLINE | ID: mdl-23259602

ABSTRACT

BACKGROUND: Over 90% of adults aged 20 years or older with permanent teeth have suffered from dental caries leading to pain, infection, or even tooth loss. Although caries prevalence has decreased over the past decade, about 23% of dentate adults in the US still have untreated carious lesions. Dental caries is a complex disorder affected by both individual susceptibility and environmental factors. Approximately 35-55% of caries phenotypic variation in the permanent dentition is attributable to genes, though few specific caries genes have been identified. Therefore, we conducted the first genome-wide association study (GWAS) to identify genes affecting susceptibility to caries in adults. METHODS: Five independent cohorts were included in this study, totaling more than 7000 participants. For each participant, dental caries was assessed and genetic markers (single nucleotide polymorphisms, SNPs) were genotyped or imputed across the entire genome. Due to the heterogeneity among the five cohorts regarding age, genotyping platform, quality of dental caries assessment, and study design, we first conducted genome-wide association (GWA) analyses on each of the five independent cohorts separately. We then performed three meta-analyses to combine results for: (i) the comparatively younger, Appalachian cohorts (N = 1483) with well-assessed caries phenotype, (ii) the comparatively older, non-Appalachian cohorts (N = 5960) with inferior caries phenotypes, and (iii) all five cohorts (N = 7443). Top ranking genetic loci within and across meta-analyses were scrutinized for biologically plausible roles in caries. RESULTS: Different sets of genes were nominated across the three meta-analyses, especially between the younger and older age cohorts. In general, we identified several suggestive loci (P-value ≤ 10E-05) within or near genes with plausible biological roles for dental caries, including RPS6KA2 and PTK2B, involved in p38-dependent MAPK signaling, and RHOU and FZD1, involved in the Wnt signaling cascade. Both of these pathways have been implicated in dental caries. ADAMTS3 and ISL1 are involved in tooth development, and TLR2 is involved in immune response to oral pathogens. CONCLUSIONS: As the first GWAS for dental caries in adults, this study nominated several novel caries genes for future study, which may lead to better understanding of cariogenesis, and ultimately, to improved disease predictions, prevention, and/or treatment.


Subject(s)
Dental Caries Susceptibility/genetics , Dental Caries/genetics , Genome-Wide Association Study , MAP Kinase Signaling System/genetics , Wnt Signaling Pathway/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Chromosomes, Human/genetics , DMF Index , Dentition, Permanent , Humans , Middle Aged , Young Adult
10.
Nat Genet ; 44(6): 642-50, 2012 May 06.
Article in English | MEDLINE | ID: mdl-22561516

ABSTRACT

We detected clonal mosaicism for large chromosomal anomalies (duplications, deletions and uniparental disomy) using SNP microarray data from over 50,000 subjects recruited for genome-wide association studies. This detection method requires a relatively high frequency of cells with the same abnormal karyotype (>5-10%; presumably of clonal origin) in the presence of normal cells. The frequency of detectable clonal mosaicism in peripheral blood is low (<0.5%) from birth until 50 years of age, after which it rapidly rises to 2-3% in the elderly. Many of the mosaic anomalies are characteristic of those found in hematological cancers and identify common deleted regions with genes previously associated with these cancers. Although only 3% of subjects with detectable clonal mosaicism had any record of hematological cancer before DNA sampling, those without a previous diagnosis have an estimated tenfold higher risk of a subsequent hematological cancer (95% confidence interval = 6-18).


Subject(s)
Aging/genetics , Chromosome Aberrations , Mosaicism , Neoplasms/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Child , Child, Preschool , Chromosome Mapping , DNA Copy Number Variations , Female , Genome-Wide Association Study , Humans , Infant , Infant, Newborn , Male , Middle Aged
11.
Genet Epidemiol ; 35(8): 887-98, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22125226

ABSTRACT

Genome-wide association studies (GWAS) are a useful approach in the study of the genetic components of complex phenotypes. Aside from large cohorts, GWAS have generally been limited to the study of one or a few diseases or traits. The emergence of biobanks linked to electronic medical records (EMRs) allows the efficient reuse of genetic data to yield meaningful genotype-phenotype associations for multiple phenotypes or traits. Phase I of the electronic MEdical Records and GEnomics (eMERGE-I) Network is a National Human Genome Research Institute-supported consortium composed of five sites to perform various genetic association studies using DNA repositories and EMR systems. Each eMERGE site has developed EMR-based algorithms to comprise a core set of 14 phenotypes for extraction of study samples from each site's DNA repository. Each eMERGE site selected samples for a specific phenotype, and these samples were genotyped at either the Broad Institute or at the Center for Inherited Disease Research using the Illumina Infinium BeadChip technology. In all, approximately 17,000 samples from across the five sites were genotyped. A unified quality control (QC) pipeline was developed by the eMERGE Genomics Working Group and used to ensure thorough cleaning of the data. This process includes examination of sample and marker quality and various batch effects. Upon completion of the genotyping and QC analyses for each site's primary study, the eMERGE Coordinating Center merged the datasets from all five sites. This larger merged dataset reentered the established eMERGE QC pipeline. Based on lessons learned during the process, additional analyses and QC checkpoints were added to the pipeline to ensure proper merging. Here, we explore the challenges associated with combining datasets from different genotyping centers and describe the expansion of the eMERGE QC pipeline for merged datasets. These additional steps will be useful as the eMERGE project expands to include additional sites in eMERGE-II, and also serve as a starting point for investigators merging multiple genotype datasets accessible through the National Center for Biotechnology Information in the database of Genotypes and Phenotypes. Our experience demonstrates that merging multiple datasets after additional QC can be an efficient use of genotype data despite new challenges that appear in the process.


Subject(s)
Electronic Health Records , Genome-Wide Association Study/standards , Quality Control , Algorithms , Genotype , Humans , National Human Genome Research Institute (U.S.) , Phenotype , United States
12.
Hum Mol Genet ; 20(24): 5012-23, 2011 Dec 15.
Article in English | MEDLINE | ID: mdl-21926416

ABSTRACT

We performed a multistage genome-wide association study of melanoma. In a discovery cohort of 1804 melanoma cases and 1026 controls, we identified loci in the chromosome 15q13.1 (HERC2/OCA2) and 16q24.3 (MC1R) regions that reached genome-wide significance within this study and also found strong evidence for genetic effects on susceptibility to melanoma from markers on chromosome 9p21.3 in the p16/ARF region and on chromosome 1q21.3 (ARNT/LASS2/ANXA9 region). The most significant single-nucleotide polymorphisms (SNPs) in the 15q13.1 locus (rs1129038 and rs12913832) lie within a genomic region that has profound effects on eye and skin color; notably, 50% of variability in eye color is associated with variation in the SNP rs12913832. Because eye and skin colors vary across European populations, we further evaluated the associations of the significant SNPs after carefully adjusting for European substructure. We also evaluated the top 10 most significant SNPs by using data from three other genome-wide scans. Additional in silico data provided replication of the findings from the most significant region on chromosome 1q21.3 rs7412746 (P = 6 × 10^-10). Together, these data identified several candidate genes for additional studies to identify causal variants predisposing to increased risk for developing melanoma.


Subject(s)
Genetic Loci/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Melanoma/genetics , Skin Neoplasms/genetics , Case-Control Studies , Chromosomes, Human, Pair 1/genetics , Genetic Markers , Guanine Nucleotide Exchange Factors/genetics , Humans , Meta-Analysis as Topic , Pigmentation/genetics , Polymorphism, Single Nucleotide/genetics , Ubiquitin-Protein Ligases
13.
Genet Epidemiol ; 35(6): 469-78, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21618603

ABSTRACT

Nonsyndromic cleft palate (CP) is a common birth defect with a complex and heterogeneous etiology involving both genetic and environmental risk factors. We conducted a genome-wide association study (GWAS) using 550 case-parent trios, ascertained through a CP case collected in an international consortium. Family-based association tests of single nucleotide polymorphisms (SNP) and three common maternal exposures (maternal smoking, alcohol consumption, and multivitamin supplementation) were used in a combined 2 df test for gene (G) and gene-environment (G × E) interaction simultaneously, plus a separate 1 df test for G × E interaction alone. Conditional logistic regression models were used to estimate effects on risk to exposed and unexposed children. While no SNP achieved genome-wide significance when considered alone, markers in several genes attained or approached genome-wide significance when G × E interaction was included. Among these, MLLT3 and SMC2 on chromosome 9 showed multiple SNPs resulting in an increased risk if the mother consumed alcohol during the peri-conceptual period (3 months prior to conception through the first trimester). TBK1 on chr. 12 and ZNF236 on chr. 18 showed multiple SNPs associated with higher risk of CP in the presence of maternal smoking. Additional evidence of reduced risk due to G × E interaction in the presence of multivitamin supplementation was observed for SNPs in BAALC on chr. 8. These results emphasize the need to consider G × E interaction when searching for genes influencing risk to complex and heterogeneous disorders, such as nonsyndromic CP.
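
To make the difference between the 2 df (G and G × E jointly) and 1 df (G × E alone) tests concrete, the sketch below converts a test statistic into a p-value under each reference distribution. It assumes the statistics are asymptotically chi-square distributed, which is typical for likelihood ratio tests but is not spelled out in the abstract; the function names are illustrative.

```python
from math import erfc, exp, sqrt


def chi2_pvalue_1df(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    if stat < 0:
        raise ValueError("statistic must be non-negative")
    return erfc(sqrt(stat / 2.0))


def chi2_pvalue_2df(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    if stat < 0:
        raise ValueError("statistic must be non-negative")
    return exp(-stat / 2.0)


# The same statistic is less extreme against the 2 df reference distribution.
p_1df = chi2_pvalue_1df(10.0)
p_2df = chi2_pvalue_2df(10.0)
```

Both closed forms are exact for these two degrees of freedom, so no statistics library is needed for this illustration.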


Subject(s)
Cleft Palate/genetics , Alcohol Drinking , Chromosome Mapping , Cleft Palate/chemically induced , Cleft Palate/etiology , Female , Gene-Environment Interaction , Genome-Wide Association Study , Genotype , Humans , Male , Maternal Exposure , Models, Genetic , Parents , Polymorphism, Single Nucleotide , Pregnancy , Risk , Vitamins/therapeutic use
14.
Curr Protoc Hum Genet ; Chapter 1: Unit1.19, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21234875

ABSTRACT

Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the electronic MEdical Records and Genomics (eMERGE) network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.


Subject(s)
Genome-Wide Association Study/standards , Software , Electronic Health Records , Genome-Wide Association Study/methods , Genomics , Genotype , Humans , Phenotype , Quality Control
15.
G3 (Bethesda) ; 1(6): 505-14, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22384361

ABSTRACT

Ischemic stroke (IS) is among the leading causes of death in Western countries. There is a significant genetic component to IS susceptibility, especially among young adults. To date, research to identify genetic loci predisposing to stroke has met only with limited success. We performed a genome-wide association (GWA) analysis of early-onset IS to identify potential stroke susceptibility loci. The GWA analysis was conducted by genotyping 1 million SNPs in a biracial population of 889 IS cases and 927 controls, ages 15-49 years. Genotypes were imputed using the HapMap3 reference panel to provide 1.4 million SNPs for analysis. Logistic regression models adjusting for age, recruitment stages, and population structure were used to determine the association of IS with individual SNPs. Although no single SNP reached genome-wide significance (P < 5 × 10^-8), we identified two SNPs in chromosome 2q23.3, rs2304556 (in FMNL2; P = 1.2 × 10^-7) and rs1986743 (in ARL6IP6; P = 2.7 × 10^-7), strongly associated with early-onset stroke. These data suggest that a novel locus on human chromosome 2q23.3 may be associated with IS susceptibility among young adults.

17.
Genet Epidemiol ; 34(6): 591-602, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20718045

ABSTRACT

Genome-wide scans of nucleotide variation in human subjects are providing an increasing number of replicated associations with complex disease traits. Most of the variants detected have small effects and, collectively, they account for a small fraction of the total genetic variance. Very large sample sizes are required to identify and validate findings. In this situation, even small sources of systematic or random error can cause spurious results or obscure real effects. The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Here we extend these methods and describe a system of QC/QA for genotypic data in genome-wide association studies (GWAS). This system includes some new approaches that (1) combine analysis of allelic probe intensities and called genotypes to distinguish gender misidentification from sex chromosome aberrations, (2) detect autosomal chromosome aberrations that may affect genotype calling accuracy, (3) infer DNA sample quality from relatedness and allelic intensities, (4) use duplicate concordance to infer SNP quality, (5) detect genotyping artifacts from dependence of Hardy-Weinberg equilibrium test P-values on allelic frequency, and (6) demonstrate sensitivity of principal components analysis to SNP selection. The methods are illustrated with examples from the "Gene Environment Association Studies" (GENEVA) program. The results suggest several recommendations for QC/QA in the design and execution of GWAS.
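
Item (5) above relies on Hardy-Weinberg equilibrium test P-values. A minimal sketch of the standard 1 df chi-square version of that test is shown below; it is not the GENEVA pipeline itself, exact tests are often preferred for low counts, and the genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of the 1 df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        return 1.0  # monomorphic marker: the test is not informative
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2.0))


# Hypothetical counts: the first set matches HWE exactly, the second has no heterozygotes.
p_in_equilibrium = hwe_chi2_pvalue(25, 50, 25)
p_no_heterozygotes = hwe_chi2_pvalue(50, 0, 50)
```

Computing this per SNP and plotting the P-values against allele frequency is one way to look for the frequency-dependent artifacts the abstract mentions.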


Subject(s)
Genome-Wide Association Study/standards , Genotype , Aneuploidy , Artifacts , Case-Control Studies , Chromosome Aberrations , Female , Gene Frequency , Genetic Variation , Genetics, Population , Genome-Wide Association Study/methods , Humans , Lung Neoplasms/genetics , Male , Polymorphism, Single Nucleotide , Quality Control , Sex Chromosome Aberrations/statistics & numerical data , Substance-Related Disorders/genetics
18.
Nat Genet ; 42(6): 525-9, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20436469

ABSTRACT

Case-parent trios were used in a genome-wide association study of cleft lip with and without cleft palate. SNPs near two genes not previously associated with cleft lip with and without cleft palate (MAFB, most significant SNP rs13041247, with odds ratio (OR) per minor allele = 0.704, 95% CI 0.635-0.778, P = 1.44 × 10^-11; and ABCA4, most significant SNP rs560426, with OR = 1.432, 95% CI 1.292-1.587, P = 5.01 × 10^-12) and two previously identified regions (at chromosome 8q24 and IRF6) attained genome-wide significance. Stratifying trios into European and Asian ancestry groups revealed differences in statistical significance, although estimated effect sizes remained similar. Replication studies from several populations showed confirming evidence, with families of European ancestry giving stronger evidence for markers in 8q24, whereas Asian families showed stronger evidence for association with MAFB and ABCA4. Expression studies support a role for MAFB in palatal development.
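
The odds ratios, confidence intervals, and P-values quoted above are linked through the usual Wald approximation on the log-odds scale. The sketch below recovers an approximate standard error, z statistic, and two-sided p-value from a reported OR and 95% CI. It assumes a symmetric Wald interval, which may differ from the test the authors actually used, so the result should land near the reported P-value rather than reproduce it exactly.

```python
from math import erfc, log, sqrt
from typing import Tuple


def wald_from_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.959964) -> Tuple[float, float, float]:
    """Approximate (SE of log OR, z statistic, two-sided p) from an OR and its 95% CI."""
    if not 0 < lower < upper:
        raise ValueError("need 0 < lower < upper")
    se = (log(upper) - log(lower)) / (2.0 * z_crit)
    z = log(odds_ratio) / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return se, z, p


# Values reported in the abstract for rs13041247 near MAFB.
se_mafb, z_mafb, p_mafb = wald_from_ci(0.704, 0.635, 0.778)
```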


Subject(s)
ATP-Binding Cassette Transporters/genetics , Cleft Lip/genetics , Cleft Palate/genetics , Genetic Predisposition to Disease , MafB Transcription Factor/genetics , Polymorphism, Single Nucleotide , Animals , Asian People/genetics , Female , Genome-Wide Association Study , Genotype , Humans , Mice , White People/genetics
19.
Proc Natl Acad Sci U S A ; 107(16): 7401-6, 2010 Apr 20.
Article in English | MEDLINE | ID: mdl-20385819

ABSTRACT

We executed a genome-wide association scan for age-related macular degeneration (AMD) in 2,157 cases and 1,150 controls. Our results validate AMD susceptibility loci near CFH (P < 10⁻⁷⁵), ARMS2 (P < 10⁻⁵⁹), C2/CFB (P < 10⁻²⁰), C3 (P < 10⁻⁹), and CFI (P < 10⁻⁶). We compared our top findings with the Tufts/Massachusetts General Hospital genome-wide association study of advanced AMD (821 cases, 1,709 controls) and genotyped 30 promising markers in additional individuals (up to 7,749 cases and 4,625 controls). With these data, we identified a susceptibility locus near TIMP3 (overall P = 1.1 × 10⁻¹¹), a metalloproteinase involved in degradation of the extracellular matrix and previously implicated in early-onset maculopathy. In addition, our data revealed strong association signals with alleles at two loci (LIPC, P = 1.3 × 10⁻⁷; CETP, P = 7.4 × 10⁻⁷) that were previously associated with high-density lipoprotein cholesterol (HDL-c) levels in blood. Consistent with the hypothesis that HDL metabolism is associated with AMD pathogenesis, we also observed association with AMD of HDL-c-associated alleles near LPL (P = 3.0 × 10⁻³) and ABCA1 (P = 5.6 × 10⁻⁴). Multilocus analysis including all susceptibility loci showed that 329 of 331 individuals (99%) with the highest-risk genotypes were cases, and 85% of these had advanced AMD. Our studies extend the catalog of AMD associated loci, help identify individuals at high risk of disease, and provide clues about underlying cellular pathways that should eventually lead to new therapies.


Subject(s)
Genetic Predisposition to Disease , Lipoproteins, HDL/metabolism , Macular Degeneration/genetics , Tissue Inhibitor of Metalloproteinase-3/genetics , Alleles , Case-Control Studies , Chromosome Mapping , Complement Factor I/genetics , Genetic Variation , Genome-Wide Association Study , Genotype , Humans , Polymorphism, Single Nucleotide , Regression Analysis , Risk , Tissue Inhibitor of Metalloproteinase-3/physiology
20.
Genet Epidemiol ; 34(4): 364-72, 2010 May.
Article in English | MEDLINE | ID: mdl-20091798

ABSTRACT

Genome-wide association studies (GWAS) have emerged as powerful means for identifying genetic loci related to complex diseases. However, the role of environment and its potential to interact with key loci has not been adequately addressed in most GWAS. Networks of collaborative studies involving different study populations and multiple phenotypes provide a powerful approach for addressing the challenges in analysis and interpretation shared across studies. The Gene, Environment Association Studies (GENEVA) consortium was initiated to: identify genetic variants related to complex diseases; identify variations in gene-trait associations related to environmental exposures; and ensure rapid sharing of data through the database of Genotypes and Phenotypes. GENEVA consists of several academic institutions, including a coordinating center, two genotyping centers and 14 independently designed studies of various phenotypes, as well as several Institutes and Centers of the National Institutes of Health led by the National Human Genome Research Institute. Minimum detectable effect sizes include relative risks ranging from 1.24 to 1.57 and proportions of variance explained ranging from 0.0097 to 0.02. Given the large number of research participants (N>80,000), an important feature of GENEVA is harmonization of common variables, which allow analyses of additional traits. Environmental exposure information available from most studies also enables testing of gene-environment interactions. Facilitated by its sizeable infrastructure for promoting collaboration, GENEVA has established a unified framework for genotyping, data quality control, analysis and interpretation. By maximizing knowledge obtained through collaborative GWAS incorporating environmental exposure information, GENEVA aims to enhance our understanding of disease etiology, potentially identifying opportunities for intervention.


Subject(s)
Genome-Wide Association Study , Environment , Genotype , Humans , Models, Genetic , Molecular Epidemiology/methods , Phenotype , Polymorphism, Genetic , Population Groups , Quality Control , Quantitative Trait Loci , Risk