Search | VHL Regional Portal

1.

Genome-wide study of resistant hypertension identified from electronic health records.

Dumitrescu, Logan; Ritchie, Marylyn D; Denny, Joshua C; El Rouby, Nihal M; McDonough, Caitrin W; Bradford, Yuki; Ramirez, Andrea H; Bielinski, Suzette J; Basford, Melissa A; Chai, High Seng; Peissig, Peggy; Carrell, David; Pathak, Jyotishman; Rasmussen, Luke V; Wang, Xiaoming; Pacheco, Jennifer A; Kho, Abel N; Hayes, M Geoffrey; Matsumoto, Martha; Smith, Maureen E; Li, Rongling; Cooper-DeHoff, Rhonda M; Kullo, Iftikhar J; Chute, Christopher G; Chisholm, Rex L; Jarvik, Gail P; Larson, Eric B; Carey, David; McCarty, Catherine A; Williams, Marc S; Roden, Dan M; Bottinger, Erwin; Johnson, Julie A; de Andrade, Mariza; Crawford, Dana C.

PLoS One ; 12(2): e0171745, 2017.

Article in English | MEDLINE | ID: mdl-28222112

ABSTRACT

Resistant hypertension is defined as high blood pressure that remains above treatment goals in spite of the concurrent use of three antihypertensive agents from different classes. Despite the important health consequences of resistant hypertension, few studies of resistant hypertension have been conducted. To perform a genome-wide association study for resistant hypertension, we defined and identified cases of resistant hypertension and hypertensives with treated, controlled hypertension among >47,500 adults residing in the US linked to electronic health records (EHRs) and genotyped as part of the electronic MEdical Records & GEnomics (eMERGE) Network. Electronic selection logic using billing codes, laboratory values, text queries, and medication records was used to identify resistant hypertension cases and controls at each site, and a total of 3,006 cases of resistant hypertension and 876 controlled hypertensives were identified among eMERGE Phase I and II sites. After imputation and quality control, a total of 2,530,150 SNPs were tested for an association among 2,830 multi-ethnic cases of resistant hypertension and 876 controlled hypertensives. No test of association was genome-wide significant in the full dataset or in the dataset limited to European American cases (n = 1,719) and controls (n = 708). The most significant finding was CLNK rs13144136 at p = 1.00 × 10(-6) (odds ratio = 0.68; 95% CI = 0.58-0.80) in the full dataset with similar results in the European American only dataset. We also examined whether SNPs known to influence blood pressure or hypertension also influenced resistant hypertension. None was significant after correction for multiple testing. These data highlight both the difficulties and the potential utility of EHR-linked genomic data to study clinically-relevant traits such as resistant hypertension.

Subject(s)

Antihypertensive Agents/therapeutic use , Drug Resistance/genetics , Electronic Health Records , Genome-Wide Association Study , Hypertension/genetics , Adult , Aged , Algorithms , Blood Pressure/genetics , Case-Control Studies , Computer Communication Networks , Datasets as Topic , Ethnicity/genetics , Genotype , Humans , Hypertension/drug therapy , Hypertension/epidemiology , Male , Middle Aged , Polymorphism, Single Nucleotide , Risk Factors

2.

Human whole genome genotype and transcriptome data for Alzheimer's and other neurodegenerative diseases.

Allen, Mariet; Carrasquillo, Minerva M; Funk, Cory; Heavner, Benjamin D; Zou, Fanggeng; Younkin, Curtis S; Burgess, Jeremy D; Chai, High-Seng; Crook, Julia; Eddy, James A; Li, Hongdong; Logsdon, Ben; Peters, Mette A; Dang, Kristen K; Wang, Xue; Serie, Daniel; Wang, Chen; Nguyen, Thuy; Lincoln, Sarah; Malphrus, Kimberly; Bisceglio, Gina; Li, Ma; Golde, Todd E; Mangravite, Lara M; Asmann, Yan; Price, Nathan D; Petersen, Ronald C; Graff-Radford, Neill R; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer.

Sci Data ; 3: 160089, 2016 Oct 11.

Article in English | MEDLINE | ID: mdl-27727239

ABSTRACT

Previous genome-wide association studies (GWAS), conducted by our group and others, have identified loci that harbor risk variants for neurodegenerative diseases, including Alzheimer's disease (AD). Human disease variants are enriched for polymorphisms that affect gene expression, including some that are known to associate with expression changes in the brain. Postulating that many variants confer risk to neurodegenerative disease via transcriptional regulatory mechanisms, we have analyzed gene expression levels in the brain tissue of subjects with AD and related diseases. Herein, we describe our collective datasets comprised of GWAS data from 2,099 subjects; microarray gene expression data from 773 brain samples, 186 of which also have RNAseq; and an independent cohort of 556 brain samples with RNAseq. We expect that these datasets, which are available to all qualified researchers, will enable investigators to explore and identify transcriptional mechanisms contributing to neurodegenerative diseases.

Subject(s)

Alzheimer Disease/genetics , Genome, Human , Neurodegenerative Diseases/genetics , Transcriptome , Genome-Wide Association Study , Humans

3.

Late-onset Alzheimer disease risk variants mark brain regulatory loci.

Allen, Mariet; Kachadoorian, Michaela; Carrasquillo, Minerva M; Karhade, Aditya; Manly, Lester; Burgess, Jeremy D; Wang, Chen; Serie, Daniel; Wang, Xue; Siuda, Joanna; Zou, Fanggeng; Chai, High Seng; Younkin, Curtis; Crook, Julia; Medway, Christopher; Nguyen, Thuy; Ma, Li; Malphrus, Kimberly; Lincoln, Sarah; Petersen, Ronald C; Graff-Radford, Neill R; Asmann, Yan W; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer.

Neurol Genet ; 1(2): e15, 2015 Aug.

Article in English | MEDLINE | ID: mdl-27066552

ABSTRACT

OBJECTIVE: To investigate the top late-onset Alzheimer disease (LOAD) risk loci detected or confirmed by the International Genomics of Alzheimer's Project for association with brain gene expression levels to identify variants that influence Alzheimer disease (AD) risk through gene expression regulation. METHODS: Expression levels from the cerebellum (CER) and temporal cortex (TCX) were obtained using Illumina whole-genome cDNA-mediated annealing, selection, extension, and ligation assay (WG-DASL) for ∼400 autopsied patients (∼200 with AD and ∼200 with non-AD pathologies). We tested 12 significant LOAD genome-wide association study (GWAS) index single nucleotide polymorphisms (SNPs) for cis association with levels of 34 genes within ±100 kb. We also evaluated brain levels of 14 LOAD GWAS candidate genes for association with 1,899 cis-SNPs. Significant associations were validated in a subset of TCX samples using next-generation RNA sequencing (RNAseq). RESULTS: We identified strong associations of brain CR1, HLA-DRB1, and PILRB levels with LOAD GWAS index SNPs. We also detected other strong cis-SNPs for LOAD candidate genes MEF2C, ZCWPW1, and SLC24A4. MEF2C and SLC24A4, but not ZCWPW1 cis-SNPs, also associate with LOAD risk, independent of the index SNPs. The TCX expression associations could be validated with RNAseq for CR1, HLA-DRB1, ZCWPW1, and SLC24A4. CONCLUSIONS: Our results suggest that some LOAD GWAS variants mark brain regulatory loci, nominate genes under regulation by LOAD risk variants, and annotate these variants for their brain regulatory effects.

4.

Association of MAPT haplotypes with Alzheimer's disease risk and MAPT brain gene expression levels.

Allen, Mariet; Kachadoorian, Michaela; Quicksall, Zachary; Zou, Fanggeng; Chai, High Seng; Younkin, Curtis; Crook, Julia E; Pankratz, V Shane; Carrasquillo, Minerva M; Krishnan, Siddharth; Nguyen, Thuy; Ma, Li; Malphrus, Kimberly; Lincoln, Sarah; Bisceglio, Gina; Kolbert, Christopher P; Jen, Jin; Mukherjee, Shubhabrata; Kauwe, John K; Crane, Paul K; Haines, Jonathan L; Mayeux, Richard; Pericak-Vance, Margaret A; Farrer, Lindsay A; Schellenberg, Gerard D; Parisi, Joseph E; Petersen, Ronald C; Graff-Radford, Neill R; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer.

Alzheimers Res Ther ; 6(4): 39, 2014.

Article in English | MEDLINE | ID: mdl-25324900

ABSTRACT

INTRODUCTION: MAPT encodes for tau, the predominant component of neurofibrillary tangles that are neuropathological hallmarks of Alzheimer's disease (AD). Genetic association of MAPT variants with late-onset AD (LOAD) risk has been inconsistent, although insufficient power and incomplete assessment of MAPT haplotypes may account for this. METHODS: We examined the association of MAPT haplotypes with LOAD risk in more than 20,000 subjects (n-cases = 9,814, n-controls = 11,550) from Mayo Clinic (n-cases = 2,052, n-controls = 3,406) and the Alzheimer's Disease Genetics Consortium (ADGC, n-cases = 7,762, n-controls = 8,144). We also assessed associations with brain MAPT gene expression levels measured in the cerebellum (n = 197) and temporal cortex (n = 202) of LOAD subjects. Six single nucleotide polymorphisms (SNPs) which tag MAPT haplotypes with frequencies greater than 1% were evaluated. RESULTS: H2-haplotype tagging rs8070723-G allele associated with reduced risk of LOAD (odds ratio, OR = 0.90, 95% confidence interval, CI = 0.85-0.95, p = 5.2E-05) with consistent results in the Mayo (OR = 0.81, p = 7.0E-04) and ADGC (OR = 0.89, p = 1.26E-04) cohorts. rs3785883-A allele was also nominally significantly associated with LOAD risk (OR = 1.06, 95% CI = 1.01-1.13, p = 0.034). Haplotype analysis revealed significant global association with LOAD risk in the combined cohort (p = 0.033), with significant association of the H2 haplotype with reduced risk of LOAD as expected (p = 1.53E-04) and suggestive association with additional haplotypes. MAPT SNPs and haplotypes also associated with brain MAPT levels in the cerebellum and temporal cortex of AD subjects with the strongest associations observed for the H2 haplotype and reduced brain MAPT levels (β = -0.16 to -0.20, p = 1.0E-03 to 3.0E-03).
CONCLUSIONS: These results confirm the previously reported MAPT H2 associations with LOAD risk in two large series, that this haplotype has the strongest effect on brain MAPT expression amongst those tested and identify additional haplotypes with suggestive associations, which require replication in independent series. These biologically congruent results provide compelling evidence to screen the MAPT region for regulatory variants which confer LOAD risk by influencing its brain gene expression.

5.

An integrated model of the transcriptome of HER2-positive breast cancer.

Kalari, Krishna R; Necela, Brian M; Tang, Xiaojia; Thompson, Kevin J; Lau, Melissa; Eckel-Passow, Jeanette E; Kachergus, Jennifer M; Anderson, S Keith; Sun, Zhifu; Baheti, Saurabh; Carr, Jennifer M; Baker, Tiffany R; Barman, Poulami; Radisky, Derek C; Joseph, Richard W; McLaughlin, Sarah A; Chai, High-seng; Camille, Stephan; Rossell, David; Asmann, Yan W; Thompson, E Aubrey; Perez, Edith A.

PLoS One ; 8(11): e79298, 2013.

Article in English | MEDLINE | ID: mdl-24223926

ABSTRACT

Our goal in these analyses was to use genomic features from a test set of primary breast tumors to build an integrated transcriptome landscape model that makes relevant hypothetical predictions about the biological and/or clinical behavior of HER2-positive breast cancer. We interrogated RNA-Seq data from benign breast lesions, ER+, triple negative, and HER2-positive tumors to identify 685 differentially expressed genes, 102 alternatively spliced genes, and 303 genes that expressed single nucleotide sequence variants (eSNVs) that were associated with the HER2-positive tumors in our survey panel. These features were integrated into a transcriptome landscape model that identified 12 highly interconnected genomic modules, each of which represents a cellular processes pathway that appears to define the genomic architecture of the HER2-positive tumors in our test set. The generality of the model was confirmed by the observation that several key pathways were enriched in HER2-positive TCGA breast tumors. The ability of this model to make relevant predictions about the biology of breast cancer cells was established by the observation that integrin signaling was linked to lapatinib sensitivity in vitro and strongly associated with risk of relapse in the NCCTG N9831 adjuvant trastuzumab clinical trial dataset. Additional modules from the HER2 transcriptome model, including ubiquitin-mediated proteolysis, TGF-beta signaling, RHO-family GTPase signaling, and M-phase progression, were linked to response to lapatinib and paclitaxel in vitro and/or risk of relapse in the N9831 dataset. These data indicate that an integrated transcriptome landscape model derived from a test set of HER2-positive breast tumors has potential for predicting outcome and for identifying novel potential therapeutic strategies for this breast cancer subtype.

Subject(s)

Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Models, Biological , Receptor, ErbB-2/metabolism , Transcriptome , Base Sequence , Breast Neoplasms/drug therapy , Breast Neoplasms/pathology , Cell Line, Tumor , Genomics , Humans , Molecular Targeted Therapy

6.

Elevated cardiac troponin T levels in critically ill patients with sepsis.

Vasile, Vlad C; Chai, High-Seng; Abdeldayem, Doaa; Afessa, Bekele; Jaffe, Allan S.

Am J Med ; 126(12): 1114-21, 2013 Dec.

Article in English | MEDLINE | ID: mdl-24083646

ABSTRACT

BACKGROUND: It is known that troponin elevations have prognostic importance in critically ill patients. We examined whether cardiac troponin T elevations are independently associated with in-hospital, short-term (30 days), and long-term (3 years) mortality in intensive care unit (ICU) patients admitted with sepsis, severe sepsis, and septic shock after adjusting for the severity of disease with the Acute Physiology, Age and Chronic Health Evaluation III system. METHODS: We studied the Mayo Clinic's Acute Physiology, Age and Chronic Health Evaluation III database and cardiac troponin T levels from patients admitted consecutively to the medical ICU. Between January 2001 and December 2006, 926 patients with sepsis had cardiac troponin T measured at ICU admission. In-hospital, short-term, and long-term all-cause mortality were determined. RESULTS: Among study patients, 645 (69.7%) had elevated cardiac troponin T levels and 281 (30.3%) had undetectable cardiac troponin T. During hospitalization, 15% of the patients with troponin T <0.01 ng/mL died compared with 31.9% of those with troponin T ≥ 0.01 ng/mL (P < .0001). At 30 days, mortality was 31% and 17% in patients with and without elevations, respectively (P < .0001). The Kaplan-Meier probability of survival at 1-, 2-, and 3-year follow-ups was 68.1%, 56.3%, and 46.8% with troponin T ≥ 0.01 ng/mL, respectively, and 76.4%, 69.1%, and 62.0% with troponin T <0.01 µg/L, respectively (P < .0001). After adjustment for severity of disease and baseline characteristics, cardiac troponin T levels remained associated with in-hospital and short-term mortality but not with long-term mortality. CONCLUSIONS: In patients with sepsis who are admitted to an ICU, cardiac troponin T elevations are independently associated with in-hospital and short-term mortality but not long-term mortality.

Subject(s)

Critical Illness/mortality , Sepsis/blood , Sepsis/mortality , Troponin T/blood , Aged , Female , Humans , Male , Middle Aged , Retrospective Studies

7.

Response to genomic association analysis identifies multiple loci influencing antihypertensive response to an angiotensin II receptor blocker.

Chai, High Seng; Chapman, Arlene B; Boerwinkle, Eric.

Hypertension ; 61(1): e6, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23362515

Subject(s)

Angiotensin II Type 1 Receptor Blockers/therapeutic use , Genome-Wide Association Study/methods , Hypertension/drug therapy , Hypertension/genetics , Polymorphism, Single Nucleotide , Female , Humans , Male

8.

Brain expression genome-wide association study (eGWAS) identifies human disease-associated variants.

Zou, Fanggeng; Chai, High Seng; Younkin, Curtis S; Allen, Mariet; Crook, Julia; Pankratz, V Shane; Carrasquillo, Minerva M; Rowley, Christopher N; Nair, Asha A; Middha, Sumit; Maharjan, Sooraj; Nguyen, Thuy; Ma, Li; Malphrus, Kimberly G; Palusak, Ryan; Lincoln, Sarah; Bisceglio, Gina; Georgescu, Constantin; Kouri, Naomi; Kolbert, Christopher P; Jen, Jin; Haines, Jonathan L; Mayeux, Richard; Pericak-Vance, Margaret A; Farrer, Lindsay A; Schellenberg, Gerard D; Petersen, Ronald C; Graff-Radford, Neill R; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer.

PLoS Genet ; 8(6): e1002707, 2012.

Article in English | MEDLINE | ID: mdl-22685416

ABSTRACT

Genetic variants that modify brain gene expression may also influence risk for human diseases. We measured expression levels of 24,526 transcripts in brain samples from the cerebellum and temporal cortex of autopsied subjects with Alzheimer's disease (AD, cerebellar n=197, temporal cortex n=202) and with other brain pathologies (non-AD, cerebellar n=177, temporal cortex n=197). We conducted an expression genome-wide association study (eGWAS) using 213,528 cisSNPs within ± 100 kb of the tested transcripts. We identified 2,980 cerebellar cisSNP/transcript level associations (2,596 unique cisSNPs) significant in both ADs and non-ADs (q<0.05, p=7.70 × 10(-5)-1.67 × 10(-82)). Of these, 2,089 were also significant in the temporal cortex (p=1.85 × 10(-5)-1.70 × 10(-141)). The top cerebellar cisSNPs had 2.4-fold enrichment for human disease-associated variants (p<10(-6)). We identified novel cisSNP/transcript associations for human disease-associated variants, including progressive supranuclear palsy SLCO1A2/rs11568563, Parkinson's disease (PD) MMRN1/rs6532197, Paget's disease OPTN/rs1561570; and we confirmed others, including PD MAPT/rs242557, systemic lupus erythematosus and ulcerative colitis IRF5/rs4728142, and type 1 diabetes mellitus RPS26/rs1701704. In our eGWAS, there was 2.9-3.3 fold enrichment (p<10(-6)) of significant cisSNPs with suggestive AD-risk association (p<10(-3)) in the Alzheimer's Disease Genetics Consortium GWAS. These results demonstrate the significant contributions of genetic factors to human brain gene expression, which are reliably detected across different brain regions and pathologies. The significant enrichment of brain cisSNPs among disease-associated variants advocates gene expression changes as a mechanism for many central nervous system (CNS) and non-CNS diseases. Combined assessment of expression and disease GWAS may provide complementary information in discovery of human disease variants with functional implications. 
Our findings have implications for the design and interpretation of eGWAS in general and the use of brain expression quantitative trait loci in the study of human disease genetics.

Subject(s)

Alzheimer Disease/genetics , Gene Expression Regulation , Genome-Wide Association Study , Temporal Lobe , Autopsy , Genetic Predisposition to Disease , Genotype , Humans , Polymorphism, Single Nucleotide , RNA/genetics , Temporal Lobe/metabolism

9.

Deep Sequence Analysis of Non-Small Cell Lung Cancer: Integrated Analysis of Gene Expression, Alternative Splicing, and Single Nucleotide Variations in Lung Adenocarcinomas with and without Oncogenic KRAS Mutations.

Kalari, Krishna R; Rossell, David; Necela, Brian M; Asmann, Yan W; Nair, Asha; Baheti, Saurabh; Kachergus, Jennifer M; Younkin, Curtis S; Baker, Tiffany; Carr, Jennifer M; Tang, Xiaojia; Walsh, Michael P; Chai, High-Seng; Sun, Zhifu; Hart, Steven N; Leontovich, Alexey A; Hossain, Asif; Kocher, Jean-Pierre; Perez, Edith A; Reisman, David N; Fields, Alan P; Thompson, E Aubrey.

Front Oncol ; 2: 12, 2012.

Article in English | MEDLINE | ID: mdl-22655260

ABSTRACT

KRAS mutations are highly prevalent in non-small cell lung cancer (NSCLC), and tumors harboring these mutations tend to be aggressive and resistant to chemotherapy. We used next-generation sequencing technology to identify pathways that are specifically altered in lung tumors harboring a KRAS mutation. Paired-end RNA-sequencing of 15 primary lung adenocarcinoma tumors (8 harboring mutant KRAS and 7 with wild-type KRAS) was performed. Sequences were mapped to the human genome, and genomic features, including differentially expressed genes, alternate splicing isoforms and single nucleotide variants, were determined for tumors with and without KRAS mutation using a variety of computational methods. Network analysis was carried out on genes showing differential expression (374 genes), alternate splicing (259 genes), and SNV-related changes (65 genes) in NSCLC tumors harboring a KRAS mutation. Genes exhibiting two or more connections from the lung adenocarcinoma network were used to carry out integrated pathway analysis. The most significant signaling pathways identified through this analysis were the NFκB, ERK1/2, and AKT pathways. A 27 gene mutant KRAS-specific sub network was extracted based on gene-gene connections from the integrated network, and interrogated for druggable targets. Our results confirm previous evidence that mutant KRAS tumors exhibit activated NFκB, ERK1/2, and AKT pathways and may be preferentially sensitive to target therapeutics toward these pathways. In addition, our analysis indicates novel, previously unappreciated links between mutant KRAS and the TNFR and PPARγ signaling pathways, suggesting that targeted PPARγ antagonists and TNFR inhibitors may be useful therapeutic strategies for treatment of mutant KRAS lung tumors. Our study is the first to integrate genomic features from RNA-Seq data from NSCLC and to define a first draft genomic landscape model that is unique to tumors with oncogenic KRAS mutations.

10.

Novel late-onset Alzheimer disease loci variants associate with brain gene expression.

Allen, Mariet; Zou, Fanggeng; Chai, High Seng; Younkin, Curtis S; Crook, Julia; Pankratz, V Shane; Carrasquillo, Minerva M; Rowley, Christopher N; Nair, Asha A; Middha, Sumit; Maharjan, Sooraj; Nguyen, Thuy; Ma, Li; Malphrus, Kimberly G; Palusak, Ryan; Lincoln, Sarah; Bisceglio, Gina; Georgescu, Constantin; Schultz, Debra; Rakhshan, Fariborz; Kolbert, Christopher P; Jen, Jin; Haines, Jonathan L; Mayeux, Richard; Pericak-Vance, Margaret A; Farrer, Lindsay A; Schellenberg, Gerard D; Petersen, Ronald C; Graff-Radford, Neill R; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer; Apostolova, Liana G; Arnold, Steven E; Baldwin, Clinton T; Barber, Robert; Barmada, Michael M; Beach, Thomas; Beecham, Gary W; Beekly, Duane; Bennett, David A; Bigio, Eileen H; Bird, Thomas D; Blacker, Deborah; Boeve, Bradley F; Bowen, James D; Boxer, Adam; Burke, James R; Buros, Jacqueline; Buxbaum, Joseph D.

Neurology ; 79(3): 221-8, 2012 Jul 17.

Article in English | MEDLINE | ID: mdl-22722634

ABSTRACT

OBJECTIVE: Recent genome-wide association studies (GWAS) of late-onset Alzheimer disease (LOAD) identified 9 novel risk loci. Discovery of functional variants within genes at these loci is required to confirm their role in Alzheimer disease (AD). Single nucleotide polymorphisms that influence gene expression (eSNPs) constitute an important class of functional variants. We therefore investigated the influence of the novel LOAD risk loci on human brain gene expression. METHODS: We measured gene expression levels in the cerebellum and temporal cortex of autopsied AD subjects and those with other brain pathologies (∼400 total subjects). To determine whether any of the novel LOAD risk variants are eSNPs, we tested their cis-association with expression of 6 nearby LOAD candidate genes detectable in human brain (ABCA7, BIN1, CLU, MS4A4A, MS4A6A, PICALM) and an additional 13 genes ±100 kb of these SNPs. To identify additional eSNPs that influence brain gene expression levels of the novel candidate LOAD genes, we identified SNPs ±100 kb of their location and tested for cis-associations. RESULTS: CLU rs11136000 (p = 7.81 × 10(-4)) and MS4A4A rs2304933/rs2304935 (p = 1.48 × 10(-4)-1.86 × 10(-4)) significantly influence temporal cortex expression levels of these genes. The LOAD-protective CLU and risky MS4A4A locus alleles associate with higher brain levels of these genes. There are other cis-variants that significantly influence brain expression of CLU and ABCA7 (p = 4.01 × 10(-5)-9.09 × 10(-9)), some of which also associate with AD risk (p = 2.64 × 10(-2)-6.25 × 10(-5)). CONCLUSIONS: CLU and MS4A4A eSNPs may at least partly explain the LOAD risk association at these loci. CLU and ABCA7 may harbor additional strong eSNPs. These results have implications in the search for functional variants at the novel LOAD risk loci.

Subject(s)

Alzheimer Disease/genetics , Brain Chemistry/genetics , Gene Expression/physiology , Aged , Alleles , Apolipoprotein E4/genetics , Autopsy , Female , Gene Dosage , Genetic Predisposition to Disease , Genotype , Humans , Linear Models , Male , Polymorphism, Single Nucleotide , RNA/genetics , RNA/isolation & purification , Risk Factors , Temporal Lobe/metabolism

11.

Genomic association analysis identifies multiple loci influencing antihypertensive response to an angiotensin II receptor blocker.

Turner, Stephen T; Bailey, Kent R; Schwartz, Gary L; Chapman, Arlene B; Chai, High Seng; Boerwinkle, Eric.

Hypertension ; 59(6): 1204-11, 2012 Jun.

Article in English | MEDLINE | ID: mdl-22566498

ABSTRACT

To identify genes influencing blood pressure response to an angiotensin II receptor blocker, single nucleotide polymorphisms identified by genome-wide association analysis of the response to candesartan were validated by opposite direction associations with the response to a thiazide diuretic, hydrochlorothiazide. We sampled 198 white and 193 blacks with primary hypertension from opposite tertiles of the race-sex-specific distributions of age-adjusted diastolic blood pressure response to candesartan. There were 285 polymorphisms associated with the response to candesartan at P<10(-4) in whites. A total of 273 of the 285 polymorphisms, which were available for analysis in a separate sample of 196 whites, validated for opposite direction associations with the response to hydrochlorothiazide (Fisher χ(2) 1-sided P=0.02). Among the 273 polymorphisms, those in the chromosome 11q21 region were the most significantly associated with response to candesartan in whites (eg, rs11020821 near FUT4, P=8.98 × 10(-7)), had the strongest opposite direction associations with response to hydrochlorothiazide (eg, rs3758785 in GPR83, P=7.10 × 10(-3)), and had the same direction associations with response to candesartan in the 193 blacks (eg, rs16924603 near FUT4, P=1.52 × 10(-2)). Also notable among the 273 polymorphisms was rs11649420 on chromosome 16 in the amiloride-sensitive sodium channel subunit SCNN1G involved in mediating renal sodium reabsorption and maintaining blood pressure when the renin-angiotensin system is inhibited by candesartan. These results support the use of genomewide association analyses to identify novel genes predictive of opposite direction associations with blood pressure responses to inhibitors of the renin-angiotensin and renal sodium transport systems.

Subject(s)

Angiotensin II Type 1 Receptor Blockers/therapeutic use , Genome-Wide Association Study/methods , Hypertension/drug therapy , Hypertension/genetics , Polymorphism, Single Nucleotide , Adult , Black or African American/genetics , Benzimidazoles/therapeutic use , Biphenyl Compounds , Blood Pressure/drug effects , Blood Pressure/genetics , Case-Control Studies , Chi-Square Distribution , Chromosomes, Human, Pair 11/genetics , Chromosomes, Human, Pair 16/genetics , Diuretics/therapeutic use , Drug Administration Schedule , Epithelial Sodium Channels/genetics , Female , Genetic Predisposition to Disease/genetics , Genotype , Humans , Hydrochlorothiazide/therapeutic use , Hypertension/ethnology , Male , Middle Aged , Receptors, G-Protein-Coupled/genetics , Renin-Angiotensin System/genetics , Tetrazoles/therapeutic use , White People/genetics

12.

Glutathione S-transferase omega genes in Alzheimer and Parkinson disease risk, age-at-diagnosis and brain gene expression: an association study with mechanistic implications.

Allen, Mariet; Zou, Fanggeng; Chai, High Seng; Younkin, Curtis S; Miles, Richard; Nair, Asha A; Crook, Julia E; Pankratz, V Shane; Carrasquillo, Minerva M; Rowley, Christopher N; Nguyen, Thuy; Ma, Li; Malphrus, Kimberly G; Bisceglio, Gina; Ortolaza, Alexandra I; Palusak, Ryan; Middha, Sumit; Maharjan, Sooraj; Georgescu, Constantin; Schultz, Debra; Rakhshan, Fariborz; Kolbert, Christopher P; Jen, Jin; Sando, Sigrid B; Aasly, Jan O; Barcikowska, Maria; Uitti, Ryan J; Wszolek, Zbigniew K; Ross, Owen A; Petersen, Ronald C; Graff-Radford, Neill R; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer.

Mol Neurodegener ; 7: 13, 2012 Apr 11.

Article in English | MEDLINE | ID: mdl-22494505

ABSTRACT

BACKGROUND: Glutathione S-transferase omega-1 and 2 genes (GSTO1, GSTO2), residing within an Alzheimer and Parkinson disease (AD and PD) linkage region, have diverse functions including mitigation of oxidative stress and may underlie the pathophysiology of both diseases. GSTO polymorphisms were previously reported to associate with risk and age-at-onset of these diseases, although inconsistent follow-up study designs make interpretation of results difficult. We assessed two previously reported SNPs, GSTO1 rs4925 and GSTO2 rs156697, in AD (3,493 ADs vs. 4,617 controls) and PD (678 PDs vs. 712 controls) for association with disease risk (case-controls), age-at-diagnosis (cases) and brain gene expression levels (autopsied subjects). RESULTS: We found that rs156697 minor allele associates with significantly increased risk (odds ratio = 1.14, p = 0.038) in the older ADs with age-at-diagnosis > 80 years. The minor allele of GSTO1 rs4925 associates with decreased risk in familial PD (odds ratio = 0.78, p = 0.034). There was no other association with disease risk or age-at-diagnosis. The minor alleles of both GSTO SNPs associate with lower brain levels of GSTO2 (p = 4.7 × 10(-11)-1.9 × 10(-27)), but not GSTO1. Pathway analysis of significant genes in our brain expression GWAS, identified significant enrichment for glutathione metabolism genes (p = 0.003). CONCLUSION: These results suggest that GSTO locus variants may lower brain GSTO2 levels and consequently confer AD risk in older age. Other glutathione metabolism genes should be assessed for their effects on AD and other chronic, neurologic diseases.

Subject(s)

Alzheimer Disease/genetics , Genetic Predisposition to Disease , Glutathione Transferase/genetics , Parkinson Disease/genetics , Polymorphism, Single Nucleotide/genetics , Age of Onset , Aged , Aged, 80 and over , Alleles , Alzheimer Disease/enzymology , Follow-Up Studies , Gene Expression , Humans , Middle Aged , Parkinson Disease/enzymology , Risk Factors

13.

Concordance of changes in metabolic pathways based on plasma metabolomics and skeletal muscle transcriptomics in type 1 diabetes.

Dutta, Tumpa; Chai, High Seng; Ward, Lawrence E; Ghosh, Aditya; Persson, Xuan-Mai T; Ford, G Charles; Kudva, Yogish C; Sun, Zhifu; Asmann, Yan W; Kocher, Jean-Pierre A; Nair, K Sreekumaran.

Diabetes ; 61(5): 1004-16, 2012 May.

Article in English | MEDLINE | ID: mdl-22415876

ABSTRACT

Insulin regulates many cellular processes, but the full impact of insulin deficiency on cellular functions remains to be defined. Applying a mass spectrometry-based nontargeted metabolomics approach, we report here alterations of 330 plasma metabolites representing 33 metabolic pathways during an 8-h insulin deprivation in type 1 diabetic individuals. These pathways included those known to be affected by insulin such as glucose, amino acid and lipid metabolism, Krebs cycle, and immune responses and those hitherto unknown to be altered including prostaglandin, arachidonic acid, leukotrienes, neurotransmitters, nucleotides, and anti-inflammatory responses. A significant concordance of metabolome and skeletal muscle transcriptome-based pathways supports an assumption that plasma metabolites are chemical fingerprints of cellular events. Although insulin treatment normalized plasma glucose and many other metabolites, there were 71 metabolites and 24 pathways that differed between nondiabetes and insulin-treated type 1 diabetes. Confirmation of many known pathways altered by insulin using a single blood test offers confidence in the current approach. Future research needs to be focused on newly discovered pathways affected by insulin deficiency and systemic insulin treatment to determine whether they contribute to the high morbidity and mortality in T1D despite insulin treatment.

Subject(s)

Diabetes Mellitus, Type 1/metabolism , Gene Expression Regulation/physiology , Insulin/therapeutic use , Muscle, Skeletal/metabolism , 3-Hydroxybutyric Acid/blood , Adult , Amino Acids/blood , Bicarbonates/blood , Blood Glucose/metabolism , Case-Control Studies , Diabetes Mellitus, Type 1/blood , Diabetes Mellitus, Type 1/drug therapy , Female , Gene Expression Profiling , Glucagon/blood , Glycated Hemoglobin/metabolism , Humans , Insulin/deficiency , Insulin/metabolism , Lipids/blood , Male , Metabolomics , Protein Array Analysis , Signal Transduction , Transcriptome

14.

Impact of data fragmentation across healthcare centers on the accuracy of a high-throughput clinical phenotyping algorithm for specifying subjects with type 2 diabetes mellitus.

Wei, Wei-Qi; Leibson, Cynthia L; Ransom, Jeanine E; Kho, Abel N; Caraballo, Pedro J; Chai, High Seng; Yawn, Barbara P; Pacheco, Jennifer A; Chute, Christopher G.

J Am Med Inform Assoc ; 19(2): 219-24, 2012.

Article in English | MEDLINE | ID: mdl-22249968

ABSTRACT

OBJECTIVE: To evaluate data fragmentation across healthcare centers with regard to the accuracy of a high-throughput clinical phenotyping (HTCP) algorithm developed to differentiate (1) patients with type 2 diabetes mellitus (T2DM) and (2) patients with no diabetes. MATERIALS AND METHODS: This population-based study identified all Olmsted County, Minnesota residents in 2007. We used provider-linked electronic medical record data from the two healthcare centers that provide >95% of all care to County residents (ie, Olmsted Medical Center and Mayo Clinic in Rochester, Minnesota, USA). Subjects were limited to residents with one or more encounter January 1, 2006 through December 31, 2007 at both healthcare centers. DM-relevant data on diagnoses, laboratory results, and medication from both centers were obtained during this period. The algorithm was first executed using data from both centers (ie, the gold standard) and then from Mayo Clinic alone. Positive predictive values and false-negative rates were calculated, and the McNemar test was used to compare categorization when data from the Mayo Clinic alone were used with the gold standard. Age and sex were compared between true-positive and false-negative subjects with T2DM. Statistical significance was accepted as p<0.05. RESULTS: With data from both medical centers, 765 subjects with T2DM (4256 non-DM subjects) were identified. When single-center data were used, 252 T2DM subjects (1573 non-DM subjects) were missed; an additional false-positive 27 T2DM subjects (215 non-DM subjects) were identified. The positive predictive values and false-negative rates were 95.0% (513/540) and 32.9% (252/765), respectively, for T2DM subjects and 92.6% (2683/2898) and 37.0% (1573/4256), respectively, for non-DM subjects. Age and sex distribution differed between true-positive (mean age 62.1; 45% female) and false-negative (mean age 65.0; 56.0% female) T2DM subjects. CONCLUSION: The findings show that application of an HTCP algorithm using data from a single medical center contributes to misclassification. These findings should be considered carefully by researchers when developing and executing HTCP algorithms.
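
The percentages reported in this abstract follow directly from the counts it gives. The short Python sketch below reproduces them as a check on the arithmetic; the function names are illustrative and are not taken from the paper.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of subjects flagged by the single-center run that the gold standard confirms."""
    return true_pos / (true_pos + false_pos)


def false_negative_rate(missed: int, gold_standard_total: int) -> float:
    """Share of gold-standard subjects that the single-center run failed to identify."""
    return missed / gold_standard_total


# Counts from the abstract: gold-standard totals, subjects missed, and false positives.
t2dm_total, t2dm_missed, t2dm_false_pos = 765, 252, 27
non_dm_total, non_dm_missed, non_dm_false_pos = 4256, 1573, 215

# True positives are the gold-standard subjects that were not missed.
t2dm_true_pos = t2dm_total - t2dm_missed        # 513
non_dm_true_pos = non_dm_total - non_dm_missed  # 2683

t2dm_ppv = positive_predictive_value(t2dm_true_pos, t2dm_false_pos)
t2dm_fnr = false_negative_rate(t2dm_missed, t2dm_total)
non_dm_ppv = positive_predictive_value(non_dm_true_pos, non_dm_false_pos)
non_dm_fnr = false_negative_rate(non_dm_missed, non_dm_total)

print(f"T2DM:   PPV {t2dm_ppv:.1%}, FNR {t2dm_fnr:.1%}")
print(f"non-DM: PPV {non_dm_ppv:.1%}, FNR {non_dm_fnr:.1%}")
```

The denominators 540 and 2898 quoted in the abstract are the true positives plus the false positives for each group.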

Subject(s)

Algorithms , Diabetes Mellitus, Type 2/diagnosis , Phenotype , Age Distribution , Ambulatory Care Facilities , Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/genetics , Diagnostic Errors , Female , Humans , Male , Minnesota , Predictive Value of Tests , Sex Distribution

15.

TREAT: a bioinformatics tool for variant annotations and visualizations in targeted and exome sequencing data.

Asmann, Yan W; Middha, Sumit; Hossain, Asif; Baheti, Saurabh; Li, Ying; Chai, High-Seng; Sun, Zhifu; Duffy, Patrick H; Hadad, Ahmed A; Nair, Asha; Liu, Xiaoyu; Zhang, Yuji; Klee, Eric W; Kalari, Krishna R; Kocher, Jean-Pierre A.

Bioinformatics ; 28(2): 277-8, 2012 Jan 15.

Article in English | MEDLINE | ID: mdl-22088845

ABSTRACT

UNLABELLED: TREAT (Targeted RE-sequencing Annotation Tool) is a tool for facile navigation and mining of the variants from both targeted resequencing and whole exome sequencing. It provides a rich integration of publicly available as well as in-house developed annotations and visualizations for variants, variant-hosting genes and host-gene pathways. AVAILABILITY AND IMPLEMENTATION: TREAT is freely available to non-commercial users as either a stand-alone annotation and visualization tool, or as a comprehensive workflow integrating sequencing alignment and variant calling. The executables, instructions and the Amazon Cloud Images of TREAT can be downloaded at the website: http://ndc.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm.

Subject(s)

Computational Biology/methods , Exome , Molecular Sequence Annotation , Software , Humans , Sequence Alignment

16.

Batch effect correction for genome-wide methylation data with Illumina Infinium platform.

Sun, Zhifu; Chai, High Seng; Wu, Yanhong; White, Wendy M; Donkena, Krishna V; Klein, Christopher J; Garovic, Vesna D; Therneau, Terry M; Kocher, Jean-Pierre A.

BMC Med Genomics ; 4: 84, 2011 Dec 16.

Article in English | MEDLINE | ID: mdl-22171553

ABSTRACT

BACKGROUND: Genome-wide methylation profiling has led to more comprehensive insights into gene regulation mechanisms and potential therapeutic targets. Illumina Human Methylation BeadChip is one of the most commonly used genome-wide methylation platforms. Similar to other microarray experiments, methylation data is susceptible to various technical artifacts, particularly batch effects. To date, little attention has been given to issues related to normalization and batch effect correction for this kind of data. METHODS: We evaluated three common normalization approaches and investigated their performance in batch effect removal using three datasets with different degrees of batch effects generated from HumanMethylation27 platform: quantile normalization at average β value (QNβ); two step quantile normalization at probe signals implemented in "lumi" package of R (lumi); and quantile normalization of A and B signal separately (ABnorm). Subsequent Empirical Bayes (EB) batch adjustment was also evaluated. RESULTS: Each normalization could remove a portion of batch effects and their effectiveness differed depending on the severity of batch effects in a dataset. For the dataset with minor batch effects (Dataset 1), normalization alone appeared adequate and "lumi" showed the best performance. However, all methods left substantial batch effects intact in the datasets with obvious batch effects and further correction was necessary. Without any correction, 50 and 66 percent of CpGs were associated with batch effects in Dataset 2 and 3, respectively. After QNβ, lumi or ABnorm, the number of CpGs associated with batch effects were reduced to 24, 32, and 26 percent for Dataset 2; and 37, 46, and 35 percent for Dataset 3, respectively. Additional EB correction effectively removed such remaining non-biological effects. More importantly, the two-step procedure almost tripled the numbers of CpGs associated with the outcome of interest for the two datasets. CONCLUSION: Genome-wide methylation data from Infinium Methylation BeadChip can be susceptible to batch effects with profound impacts on downstream analyses and conclusions. Normalization can reduce part but not all batch effects. EB correction along with normalization is recommended for effective batch effect removal.
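
For readers unfamiliar with the quantile normalization mentioned in this abstract, the sketch below shows the generic textbook form of the technique in Python. It is an illustration only: it is not the "lumi" implementation or the authors' QNβ procedure, it does not cover the Empirical Bayes step, and the function name and toy values are hypothetical.

```python
def quantile_normalize(samples):
    """Generic quantile normalization.

    `samples` is a list of equal-length lists, one list of values per sample.
    Each value is replaced by the mean, across samples, of the values that
    share its rank, so every sample ends up with the same distribution.
    Ties are broken by original position (a simplification).
    """
    n_values = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]

    # Mean of the k-th smallest value across all samples.
    rank_means = [
        sum(sample[k] for sample in sorted_samples) / len(samples)
        for k in range(n_values)
    ]

    normalized = []
    for sample in samples:
        # Indices of this sample's values, ordered from smallest to largest.
        order = sorted(range(n_values), key=lambda i: sample[i])
        out = [0.0] * n_values
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy example with two samples and three probes each (made-up numbers).
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(toy)
print(result)
```

After normalization each sample contains the same set of values, and only their assignment to probes differs.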

Subject(s)

DNA Methylation/genetics , Databases, Genetic , Genome, Human/genetics , Oligonucleotide Array Sequence Analysis/methods , Adult , CpG Islands/genetics , Humans , Male , Reproducibility of Results

17.

Homozygosity mapping and exome sequencing reveal GATAD1 mutation in autosomal recessive dilated cardiomyopathy.

Theis, Jeanne L; Sharpe, Katharine M; Matsumoto, Martha E; Chai, High Seng; Nair, Asha A; Theis, Jason D; de Andrade, Mariza; Wieben, Eric D; Michels, Virginia V; Olson, Timothy M.

Circ Cardiovasc Genet ; 4(6): 585-94, 2011 Dec.

Article in English | MEDLINE | ID: mdl-21965549

ABSTRACT

BACKGROUND: Dilated cardiomyopathy (DCM) is a heritable, genetically heterogeneous disorder that typically exhibits autosomal dominant inheritance. Genomic strategies enable discovery of novel, unsuspected molecular underpinnings of familial DCM. We performed genome-wide mapping and exome sequencing in a unique family wherein DCM segregated as an autosomal recessive (AR) trait. METHODS AND RESULTS: Echocardiography in 17 adult descendants of first cousins revealed DCM in 2 female siblings and idiopathic left ventricular enlargement in their brother. Genotyping and linkage analysis mapped an AR DCM locus to chromosome arm 7q21, which was validated and refined by high-density homozygosity mapping. Exome sequencing of the affected sisters was then used as a complementary strategy for mutation discovery. An iterative bioinformatics process was used to filter >40,000 genetic variants, revealing a single shared homozygous missense mutation localized to the 7q21 critical region. The mutation, absent in HapMap, 1000 Genomes, and 474 ethnically matched controls, altered a conserved residue of GATAD1, encoding GATA zinc finger domain-containing protein 1. Thirteen relatives were heterozygous mutation carriers with no evidence of myocardial disease, even at advanced ages. Immunohistochemistry demonstrated nuclear localization of GATAD1 in left ventricular myocytes, yet subcellular expression and nuclear morphology were aberrant in the proband. CONCLUSIONS: Linkage analysis and exome sequencing were used as synergistic genomic strategies to identify GATAD1 as a gene for AR DCM. GATAD1 binds to a histone modification site that regulates gene expression. Consistent with murine DCM caused by genetic disruption of histone deacetylases, the data implicate an inherited basis for epigenetic dysregulation in human heart failure.

Subject(s)

Cardiomyopathy, Dilated/genetics , Exome , Eye Proteins/genetics , Genes, Recessive , Mutation, Missense , Adolescent , Adult , Aged , Aged, 80 and over , Cardiomyopathy, Dilated/metabolism , Chromosome Mapping , Eye Proteins/metabolism , Female , Genetic Linkage , Homozygote , Humans , Male , Middle Aged , Pedigree , White People , Young Adult

18.

Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies.

Denny, Joshua C; Crawford, Dana C; Ritchie, Marylyn D; Bielinski, Suzette J; Basford, Melissa A; Bradford, Yuki; Chai, High Seng; Bastarache, Lisa; Zuvich, Rebecca; Peissig, Peggy; Carrell, David; Ramirez, Andrea H; Pathak, Jyotishman; Wilke, Russell A; Rasmussen, Luke; Wang, Xiaoming; Pacheco, Jennifer A; Kho, Abel N; Hayes, M Geoffrey; Weston, Noah; Matsumoto, Martha; Kopp, Peter A; Newton, Katherine M; Jarvik, Gail P; Li, Rongling; Manolio, Teri A; Kullo, Iftikhar J; Chute, Christopher G; Chisholm, Rex L; Larson, Eric B; McCarty, Catherine A; Masys, Daniel R; Roden, Dan M; de Andrade, Mariza.

Am J Hum Genet ; 89(4): 529-42, 2011 Oct 07.

Article in English | MEDLINE | ID: mdl-21981779

ABSTRACT

We repurposed existing genotypes in DNA biobanks across the Electronic Medical Records and Genomics network to perform a genome-wide association study for primary hypothyroidism, the most common thyroid disease. Electronic selection algorithms incorporating billing codes, laboratory values, text queries, and medication records identified 1317 cases and 5053 controls of European ancestry within five electronic medical records (EMRs); the algorithms' positive predictive values were 92.4% and 98.5% for cases and controls, respectively. Four single-nucleotide polymorphisms (SNPs) in linkage disequilibrium at 9q22 near FOXE1 were associated with hypothyroidism at genome-wide significance, the strongest being rs7850258 (odds ratio [OR] 0.74, p = 3.96 × 10⁻⁹). This association was replicated in a set of 263 cases and 1616 controls (OR = 0.60, p = 5.7 × 10⁻⁶). A phenome-wide association study (PheWAS) that was performed on this locus with 13,617 individuals and more than 200,000 patient-years of billing data identified associations with additional phenotypes: thyroiditis (OR = 0.58, p = 1.4 × 10⁻⁵), nodular (OR = 0.76, p = 3.1 × 10⁻⁵) and multinodular (OR = 0.69, p = 3.9 × 10⁻⁵) goiters, and thyrotoxicosis (OR = 0.76, p = 1.5 × 10⁻³), but not Graves disease (OR = 1.03, p = 0.82). Thyroid cancer, previously associated with this locus, was not significantly associated in the PheWAS (OR = 1.29, p = 0.09). The strongest association in the PheWAS was hypothyroidism (OR = 0.76, p = 2.7 × 10⁻¹³), which had an odds ratio that was nearly identical to that of the curated case-control population in the primary analysis, providing further validation of the PheWAS method. Our findings indicate that EMR-linked genomic data could allow discovery of genes associated with many diseases without additional genotyping cost.

Subject(s)

Forkhead Transcription Factors/genetics , Hypothyroidism/genetics , Aged , Algorithms , Female , Genetic Markers , Genetic Variation , Genome , Genome-Wide Association Study , Genotype , Humans , Male , Medical Records Systems, Computerized , Middle Aged , Phenotype , Predictive Value of Tests

19.

Mayo Genome Consortia: a genotype-phenotype resource for genome-wide association studies with an application to the analysis of circulating bilirubin levels.

Bielinski, Suzette J; Chai, High Seng; Pathak, Jyotishman; Talwalkar, Jayant A; Limburg, Paul J; Gullerud, Rachel E; Sicotte, Hugues; Klee, Eric W; Ross, Jason L; Kocher, Jean-Pierre A; Kullo, Iftikhar J; Heit, John A; Petersen, Gloria M; de Andrade, Mariza; Chute, Christopher G.

Mayo Clin Proc ; 86(7): 606-14, 2011 Jul.

Article in English | MEDLINE | ID: mdl-21646302

ABSTRACT

OBJECTIVE: To create a cohort for cost-effective genetic research, the Mayo Genome Consortia (MayoGC) has been assembled with participants from research studies across Mayo Clinic with high-throughput genetic data and electronic medical record (EMR) data for phenotype extraction. PARTICIPANTS AND METHODS: Eligible participants include those who gave general research consent in the contributing studies to share high-throughput genotyping data with other investigators. Herein, we describe the design of the MayoGC, including the current participating cohorts, expansion efforts, data processing, and study management and organization. A genome-wide association study to identify genetic variants associated with total bilirubin levels was conducted to test the genetic research capability of the MayoGC. RESULTS: Genome-wide significant results were observed on 2q37 (top single nucleotide polymorphism, rs4148325; P=5.0 × 10⁻⁶²) and 12p12 (top single nucleotide polymorphism, rs4363657; P=5.1 × 10⁻⁸) corresponding to a gene cluster of uridine 5'-diphospho-glucuronosyltransferases (the UGT1A cluster) and solute carrier organic anion transporter family, member 1B1 (SLCO1B1), respectively. CONCLUSION: Genome-wide association studies have identified genetic variants associated with numerous phenotypes but have been historically limited by inadequate sample size due to costly genotyping and phenotyping. Large consortia with harmonized genotype data have been assembled to attain sufficient statistical power, but phenotyping remains a rate-limiting factor in gene discovery research efforts. The EMR consists of an abundance of phenotype data that can be extracted in a relatively quick and systematic manner. The MayoGC provides a model of a unique collaborative effort in the environment of a common EMR for the investigation of genetic determinants of diseases.

Subject(s)

Bilirubin/blood , Genome-Wide Association Study , Glucuronosyltransferase/genetics , Organic Anion Transporters/genetics , Polymorphism, Genetic/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Bilirubin/genetics , Cohort Studies , Cost-Benefit Analysis , Electronic Health Records , Female , Genome-Wide Association Study/economics , Humans , Liver-Specific Organic Anion Transporter 1 , Male , Middle Aged , Phenotype , Young Adult

20.

A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines.

Asmann, Yan W; Hossain, Asif; Necela, Brian M; Middha, Sumit; Kalari, Krishna R; Sun, Zhifu; Chai, High-Seng; Williamson, David W; Radisky, Derek; Schroth, Gary P; Kocher, Jean-Pierre A; Perez, Edith A; Thompson, E Aubrey.

Nucleic Acids Res ; 39(15): e100, 2011 Aug.

Article in English | MEDLINE | ID: mdl-21622959

ABSTRACT

SnowShoes-FTD, developed for fusion transcript detection in paired-end mRNA-Seq data, employs multiple steps of false positive filtering to nominate fusion transcripts with near 100% confidence. Unique features include: (i) identification of multiple fusion isoforms from two gene partners; (ii) prediction of genomic rearrangements; (iii) identification of exon fusion boundaries; (iv) generation of a 5'-3' fusion spanning sequence for PCR validation; and (v) prediction of the protein sequences, including frame shift and amino acid insertions. We applied SnowShoes-FTD to identify 50 fusion candidates in 22 breast cancer and 9 non-transformed cell lines. Five additional fusion candidates with two isoforms were confirmed. In all, 30 of 55 fusion candidates had in-frame protein products. No fusion transcripts were detected in non-transformed cells. Consideration of the possible functions of a subset of predicted fusion proteins suggests several potentially important functions in transformation, including a possible new mechanism for overexpression of ERBB2 in a HER-positive cell line. The source code of SnowShoes-FTD is provided in two formats: one configured to run on the Sun Grid Engine for parallelization, and the other formatted to run on a single LINUX node. Executables in PERL are available for download from our web site: http://mayoresearch.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm.

Subject(s)

Breast Neoplasms/genetics , Gene Fusion , Mutant Chimeric Proteins/genetics , RNA, Messenger/chemistry , Software , Breast Neoplasms/metabolism , Cell Line , Cell Line, Tumor , Computational Biology/methods , Female , Humans , Mutant Chimeric Proteins/metabolism , Mutation , Promoter Regions, Genetic , RNA, Messenger/analysis , Receptor, ErbB-2/genetics , Receptor, ErbB-2/metabolism , Sequence Alignment , Sequence Analysis, RNA
