Results 1 - 20 of 53
1.
PLoS One ; 18(5): e0283553, 2023.
Article in English | MEDLINE | ID: mdl-37196047

ABSTRACT

OBJECTIVE: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique. MATERIALS AND METHODS: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs. We performed genome-wide association studies (GWAS) of DD in European, African and multi-ancestry participants, followed by phenome-wide association studies (PheWAS) of the risk variants to identify their potential comorbid/pleiotropic effects in clinical phenotypes. RESULTS: Our algorithm significantly improved patient classification performance for DD analysis (algorithm PPVs ≥ 0.94) and identified up to 3.5-fold more patients than the traditional method. Ancestry-stratified analyses of diverticulosis and diverticulitis in the identified subjects replicated the well-established associations between ARHGAP15 loci and DD, with overall stronger GWAS signals in diverticulitis patients than in diverticulosis patients. Our PheWAS analyses identified significant associations between the DD GWAS variants and circulatory system, genitourinary, and neoplastic EHR phenotypes. DISCUSSION: As the first multi-ancestry GWAS-PheWAS study, we showcased that heterogeneous EHR data can be mapped through an integrative analytical pipeline and reveal significant genotype-phenotype associations with clinical interpretation. 
CONCLUSION: A systematic framework to process unstructured EHR data with NLP could advance deep and scalable phenotyping for better patient identification and facilitate etiological investigation of a disease with multilayered data.


Subject(s)
Diverticular Diseases , Diverticulitis , Diverticulum , Humans , Electronic Health Records , Genome-Wide Association Study/methods , Natural Language Processing , Phenotype , Algorithms , Polymorphism, Single Nucleotide
2.
Methods Inf Med ; 61(1-02): 11-18, 2022 05.
Article in English | MEDLINE | ID: mdl-34991173

ABSTRACT

OBJECTIVE: Natural language processing (NLP) systems convert unstructured text into analyzable data. Here, we describe the performance measures of NLP to capture granular details on nodules from thyroid ultrasound (US) reports and reveal critical issues with reporting language. METHODS: We iteratively developed NLP tools using clinical Text Analysis and Knowledge Extraction System (cTAKES) and thyroid US reports from 2007 to 2013. We incorporated nine nodule features for NLP extraction. Next, we evaluated the precision, recall, and accuracy of our NLP tools using a separate set of US reports from an academic medical center (A) and a regional health care system (B) during the same period. Two physicians manually annotated each test-set report. A third physician then adjudicated discrepancies. The adjudicated "gold standard" was then used to evaluate NLP performance on the test set. RESULTS: A total of 243 thyroid US reports contained 6,405 data elements. Inter-annotator agreement for all elements was 91.3%. Compared with the gold standard, overall recall of the NLP tool was 90%. NLP recall for thyroid lobe or isthmus characteristics was: laterality 96% and size 95%. NLP accuracy for nodule characteristics was: laterality 92%, size 92%, calcifications 76%, vascularity 65%, echogenicity 62%, contents 76%, and borders 40%. NLP recall for presence or absence of lymphadenopathy was 61%. Reporting style accounted for 18% of errors. For example, the word "heterogeneous" interchangeably referred to nodule contents or echogenicity. While nodule dimensions and laterality were often described, US reports only described contents, echogenicity, vascularity, calcifications, borders, and lymphadenopathy 46, 41, 17, 15, 9, and 41% of the time, respectively. Most nodule characteristics were equally likely to be described at hospital A compared with hospital B. CONCLUSIONS: NLP can automate extraction of critical information from thyroid US reports. 
However, ambiguous and incomplete reporting language hinders performance of NLP systems regardless of institutional setting. Standardized or synoptic thyroid US reports could improve NLP performance.
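
The evaluation described above, scoring NLP output against an adjudicated gold standard, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `score_extractions`, the dictionary layout, and the example values are hypothetical, and counting a wrong value as both a false positive and a false negative is an assumed convention.

```python
def score_extractions(gold, predicted):
    """Score extracted data elements against gold-standard annotations.

    Both arguments map a (report_id, element_name) key to a value.
    A missing key or a value of None means "not described / not extracted".
    Returns (precision, recall, accuracy).
    """
    tp = fp = fn = tn = 0
    keys = set(gold) | set(predicted)
    for key in keys:
        g, p = gold.get(key), predicted.get(key)
        if g is None and p is None:
            tn += 1            # correctly left empty
        elif g is None:
            fp += 1            # extracted something that is not in the report
        elif p is None:
            fn += 1            # missed an element that is in the report
        elif p == g:
            tp += 1            # extracted the right value
        else:
            fp += 1            # wrong value: spurious extraction...
            fn += 1            # ...and the true value was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(keys) if keys else 0.0
    return precision, recall, accuracy


# Hypothetical annotations for two reports (values invented for illustration).
gold = {
    ("r1", "laterality"): "left",
    ("r1", "size"): "12 mm",
    ("r1", "borders"): "irregular",
    ("r2", "laterality"): "right",
    ("r2", "calcifications"): None,
}
predicted = {
    ("r1", "laterality"): "left",
    ("r1", "size"): "12 mm",
    ("r1", "borders"): None,
    ("r2", "laterality"): "left",
    ("r2", "calcifications"): None,
}
precision, recall, accuracy = score_extractions(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

Scoring per element, as the abstract does, makes it visible when one feature (for example borders) performs far worse than the overall figure suggests.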


Subject(s)
Lymphadenopathy , Natural Language Processing , Humans , Thyroid Gland/diagnostic imaging
3.
BMC Med Inform Decis Mak ; 22(1): 23, 2022 01 28.
Article in English | MEDLINE | ID: mdl-35090449

ABSTRACT

INTRODUCTION: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. METHODS: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. RESULTS: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. DISCUSSION AND CONCLUSION: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.


Subject(s)
Algorithms , Electronic Health Records , Genomics , Humans , Knowledge Bases , Phenotype
4.
NPJ Digit Med ; 4(1): 70, 2021 Apr 13.
Article in English | MEDLINE | ID: mdl-33850243

ABSTRACT

Chronic Kidney Disease (CKD) represents a slowly progressive disorder that is typically silent until late stages, but early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate ("A-by-G" grid). We manually validated the algorithm by 451 chart reviews across three medical systems, demonstrating overall positive predictive value of 95% for CKD cases and 97% for healthy controls. Independent case-control validation using 2350 patient records demonstrated diagnostic specificity of 97% and sensitivity of 87%. Application of the phenotype to 1.3 million patients demonstrated that over 80% of CKD cases are undetected using ICD codes alone. We also demonstrated several large-scale applications of the phenotype, including identifying stage-specific kidney disease comorbidities, in silico estimation of kidney trait heritability in thousands of pedigrees reconstructed from medical records, and biobank-based multicenter genome-wide and phenome-wide association studies.
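
The rule-based part of such a staging step might look like the sketch below, which places one pair of measurements on the A-by-G grid. The cut points are the standard KDIGO categories; that the published algorithm takes exactly these inputs is an assumption, and the function names are hypothetical. The sketch ignores chronicity requirements and the machine-learning component mentioned in the abstract.

```python
def gfr_category(egfr):
    """KDIGO G category from eGFR in mL/min/1.73 m^2."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(uacr):
    """KDIGO A category from urine albumin-to-creatinine ratio in mg/g."""
    if uacr < 30:
        return "A1"
    if uacr <= 300:
        return "A2"
    return "A3"


def place_on_grid(egfr, uacr):
    """Return the (G, A) cell for a single pair of measurements."""
    return gfr_category(egfr), albuminuria_category(uacr)


# Hypothetical patient: eGFR 52 mL/min/1.73 m^2, UACR 120 mg/g.
print(place_on_grid(52, 120))
```

A real phenotype would also need to confirm that the abnormality persists over time before labeling a patient as a CKD case.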

5.
AMIA Jt Summits Transl Sci Proc ; 2019: 572-581, 2019.
Article in English | MEDLINE | ID: mdl-31259012

ABSTRACT

Epidemiological studies identifying biological markers of disease state are valuable, but can be time-consuming, expensive, and require extensive intuition and expertise. Furthermore, not all hypothesized markers will be borne out in a study, suggesting that higher quality initial hypotheses are crucial. In this work, we propose a high-throughput pipeline to produce a ranked list of high-quality hypothesized marker laboratory tests for diagnoses. Our pipeline generates a large number of candidate lab-diagnosis hypotheses derived from machine learning models, filters and ranks them according to their potential novelty using text mining, and corroborates final hypotheses with logistic regression analysis. We test our approach on a large electronic health record dataset and the PubMed corpus, and find several promising candidate hypotheses.

6.
J Biomed Inform ; 96: 103253, 2019 08.
Article in English | MEDLINE | ID: mdl-31325501

ABSTRACT

BACKGROUND: Implementing clinical phenotypes across a network is labor intensive and potentially error prone. Use of a common data model may facilitate the process. METHODS: Electronic Medical Records and Genomics (eMERGE) sites implemented the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) Common Data Model across their electronic health record (EHR)-linked DNA biobanks. Two previously implemented eMERGE phenotypes were converted to OMOP and implemented across the network. RESULTS: It was feasible to implement the common data model across sites, with laboratory data producing the greatest challenge due to local encoding. Sites were then able to execute the OMOP phenotype in less than one day, as opposed to weeks of effort to manually implement an eMERGE phenotype in their bespoke research EHR databases. Of the sites that could compare the current OMOP phenotype implementation with the original eMERGE phenotype implementation, specific agreement ranged from 100% to 43%, with disagreements due to the original phenotype, the OMOP phenotype, changes in data, and issues in the databases. Using the OMOP query as a standard comparison revealed differences in the original implementations despite starting from the same definitions, code lists, flowcharts, and pseudocode. CONCLUSION: Using a common data model can dramatically speed phenotype implementation at the cost of having to populate that data model, though this will produce a net benefit as the number of phenotype implementations increases. Inconsistencies among the implementations of the original queries point to a potential benefit of using a common data model so that actual phenotype code and logic can be shared, mitigating human error in reinterpretation of a narrative phenotype definition.


Subject(s)
Attention Deficit Disorder with Hyperactivity/diagnosis , Databases, Factual , Diabetes Mellitus, Type 2/diagnosis , Electronic Health Records , Data Collection , Humans , Medical Informatics , National Human Genome Research Institute (U.S.) , Observational Studies as Topic , Outcome Assessment, Health Care , Phenotype , Research Design , Software , United States
7.
Front Genet ; 10: 511, 2019.
Article in English | MEDLINE | ID: mdl-31249589

ABSTRACT

Uterine fibroids affect up to 77% of women by menopause and account for up to $34 billion in healthcare costs each year. Although fibroid risk is heritable, genetic risk for fibroids is not well understood. We conducted a two-stage case-control meta-analysis of genetic variants in European and African ancestry women with and without fibroids classified by a previously published algorithm requiring pelvic imaging or confirmed diagnosis. Women from seven electronic Medical Records and Genomics (eMERGE) network sites (3,704 imaging-confirmed cases and 5,591 imaging-confirmed controls) and women of African and European ancestry from UK Biobank (UKB, 5,772 cases and 61,457 controls) were included in the discovery genome-wide association study (GWAS) meta-analysis. Variants showing evidence of association in Stage I GWAS (P < 1 × 10⁻⁵) were targeted in an independent replication sample of African and European ancestry individuals from the UKB (Stage II) (12,358 cases and 138,477 controls). Logistic regression models were fit with genetic markers imputed to a 1000 Genomes reference and adjusted for principal components for each race- and site-specific dataset, followed by fixed-effects meta-analysis. Final analysis with 21,804 cases and 205,525 controls identified 326 genome-wide significant variants in 11 loci, with three novel loci at chromosome 1q24 (sentinel-SNP rs14361789; P = 4.7 × 10⁻⁸), chromosome 16q12.1 (sentinel-SNP rs4785384; P = 1.5 × 10⁻⁹) and chromosome 20q13.1 (sentinel-SNP rs6094982; P = 2.6 × 10⁻⁸). Our statistically significant findings further support previously reported loci including SNPs near WT1, TNRC6B, SYNE1, BET1L, and CDC42/WNT4. We report evidence of ancestry-specific findings for sentinel-SNP rs10917151 in the CDC42/WNT4 locus (P = 1.76 × 10⁻²⁴). Ancestry-specific effect-estimates for rs10917151 were in opposite directions (P-Het-between-groups = 0.04) for predominantly African (OR = 0.84) and predominantly European women (OR = 1.16). 
Genetically-predicted gene expression of several genes including LUZP1 in vagina (P = 4.6 × 10⁻⁸), OBFC1 in esophageal mucosa (P = 8.7 × 10⁻⁸), NUDT13 in multiple tissues including subcutaneous adipose tissue (P = 3.3 × 10⁻⁶), and HEATR3 in skeletal muscle tissue (P = 5.8 × 10⁻⁶) were associated with fibroids. The finding for HEATR3 was supported by SNP-based summary Mendelian randomization analysis. Our study suggests that fibroid risk variants act through regulatory mechanisms affecting gene expression and are comprised of alleles that are both ancestry-specific and shared across continental ancestries.
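
The fixed-effects meta-analysis step mentioned above is commonly done by inverse-variance weighting of per-dataset log odds ratios. The sketch below shows that generic technique under stated assumptions; it is not the study's pipeline, the function name `fixed_effects_meta` is hypothetical, and the input numbers are invented for illustration.

```python
import math


def fixed_effects_meta(estimates):
    """Inverse-variance fixed-effects pooling.

    estimates: list of (beta, se) pairs, where beta is a log odds ratio
    from one dataset and se is its standard error.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Invented per-site results for one variant: (log OR, standard error).
site_results = [(0.15, 0.05), (0.10, 0.08), (0.20, 0.10)]
beta, se, z, p = fixed_effects_meta(site_results)
print(f"pooled OR={math.exp(beta):.3f}, SE={se:.4f}, z={z:.2f}, p={p:.2e}")
```

Because each dataset is weighted by the inverse of its variance, the largest and most precise cohorts dominate the pooled estimate, which is one reason heterogeneity between ancestry groups is tested separately.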

8.
Sci Rep ; 9(1): 6077, 2019 04 15.
Article in English | MEDLINE | ID: mdl-30988330

ABSTRACT

Benign prostatic hyperplasia (BPH) results in a significant public health burden due to the morbidity caused by the disease and many of the available remedies. As much as 70% of men over 70 will develop BPH. Few studies have been conducted to discover the genetic determinants of BPH risk. Understanding the biological basis for this condition may provide necessary insight for development of novel pharmaceutical therapies or risk prediction. We have evaluated SNP-based heritability of BPH in two cohorts and conducted a genome-wide association study (GWAS) of BPH risk using 2,656 cases and 7,763 controls identified from the Electronic Medical Records and Genomics (eMERGE) network. SNP-based heritability estimates suggest that roughly 60% of the phenotypic variation in BPH is accounted for by genetic factors. We used logistic regression to model BPH risk as a function of principal components of ancestry, age, and imputed genotype data, with meta-analysis performed using METAL. The top result was on chromosome 22 in SYN3 at rs2710383 (p-value = 4.6 × 10⁻⁷; Odds Ratio = 0.69, 95% confidence interval = 0.55-0.83). Other suggestive signals were near genes GLGC, UNCA13, SORCS1 and between BTBD3 and SPTLC3. We also evaluated genetically-predicted gene expression in prostate tissue. The most significant result was with increasing predicted expression of ETV4 (chr17; p-value = 0.0015). Overexpression of this gene has been associated with poor prognosis in prostate cancer. In conclusion, although there were no genome-wide significant variants identified for BPH susceptibility, we present evidence supporting the heritability of this phenotype, have identified suggestive signals, and evaluated the association between BPH and genetically-predicted gene expression in prostate.


Subject(s)
Genetic Predisposition to Disease , Inheritance Patterns , Prostatic Hyperplasia/genetics , Aged , Aged, 80 and over , Biomarkers/metabolism , Case-Control Studies , Electronic Health Records/statistics & numerical data , Gene Expression Profiling , Genome-Wide Association Study , Genotyping Techniques , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , Prostate/pathology , Prostatic Hyperplasia/epidemiology , Prostatic Hyperplasia/pathology
10.
Genes Immun ; 20(7): 555-565, 2019 09.
Article in English | MEDLINE | ID: mdl-30459343

ABSTRACT

Resting-state white blood cell (WBC) count is a marker of inflammation and immune system health. There is evidence that WBC count is not fixed over time and there is heterogeneity in WBC trajectory that is associated with morbidity and mortality. Latent class mixed modeling (LCMM) is a method that can identify unobserved heterogeneity in longitudinal data and attempts to classify individuals into groups based on a linear model of repeated measurements. We applied LCMM to repeated WBC count measures derived from electronic medical records of participants of the National Human Genome Research Institute (NHGRI) electronic MEdical Record and GEnomics (eMERGE) network study, revealing two WBC count trajectory phenotypes. Advancing these phenotypes to GWAS, we found genetic associations between trajectory class membership and regions on chromosome 1p34.3 and chromosome 11q13.4. The chromosome 1 region contains CSF3R, which encodes the granulocyte colony-stimulating factor receptor. This protein is a major factor in neutrophil stimulation and proliferation. The associated region on chromosome 11 contains the genes RNF169 and XRRA1, both involved in the regulation of double-strand break DNA repair.


Subject(s)
Leukocyte Count/methods , Leukocytes/classification , Adult , Aged , Databases, Genetic , Electronic Health Records , Female , Genome-Wide Association Study , Humans , Latent Class Analysis , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide/genetics , Proteins/genetics , Receptors, Colony-Stimulating Factor/genetics , Ubiquitin-Protein Ligases/genetics
11.
J Am Med Inform Assoc ; 26(2): 143-148, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30590574

ABSTRACT

To better understand the real-world effects of pharmacogenomic (PGx) alerts, this study aimed to characterize alert design within the eMERGE Network, and to establish a method for sharing PGx alert response data for aggregate analysis. Seven eMERGE sites submitted design details and established an alert logging data dictionary. Six sites participated in a pilot study, sharing alert response data from their electronic health record systems. PGx alert design varied, with some consensus around the use of active, post-test alerts to convey Clinical Pharmacogenetics Implementation Consortium recommendations. Sites successfully shared response data, with wide variation in acceptance and follow rates. Results reflect the lack of standardization in PGx alert design. Standards and/or larger studies will be necessary to fully understand PGx impact. This study demonstrated a method for sharing PGx alert response data and established that variation in system design is a significant barrier for multi-site analyses.


Subject(s)
Data Aggregation , Decision Support Systems, Clinical , Drug Prescriptions , Electronic Health Records , Medical Order Entry Systems , Pharmacogenetics , Feasibility Studies , Humans , Pilot Projects , Precision Medicine
12.
Circulation ; 138(22): 2469-2481, 2018 11 27.
Article in English | MEDLINE | ID: mdl-30571344

ABSTRACT

BACKGROUND: Proteomic approaches allow measurement of thousands of proteins in a single specimen, which can accelerate biomarker discovery. However, applying these technologies to massive biobanks is not currently feasible because of the practical barriers and costs of implementing such assays at scale. To overcome these challenges, we used a "virtual proteomic" approach, linking genetically predicted protein levels to clinical diagnoses in >40 000 individuals. METHODS: We used genome-wide association data from the Framingham Heart Study (n=759) to construct genetic predictors for 1129 plasma protein levels. We validated the genetic predictors for 268 proteins and used them to compute predicted protein levels in 41 288 genotyped individuals in the Electronic Medical Records and Genomics (eMERGE) cohort. We tested associations for each predicted protein with 1128 clinical phenotypes. Lead associations were validated with directly measured protein levels and either low-density lipoprotein cholesterol or subclinical atherosclerosis in the MDCS (Malmö Diet and Cancer Study; n=651). RESULTS: In the virtual proteomic analysis in eMERGE, 55 proteins were associated with 89 distinct diagnoses at a false discovery rate q<0.1. Among these, 13 associations involved lipid (n=7) or atherosclerosis (n=6) phenotypes. We tested each association for validation in MDCS using directly measured protein levels. At Bonferroni-adjusted significance thresholds, levels of apolipoprotein E isoforms were associated with hyperlipidemia, and circulating C-type lectin domain family 1 member B and platelet-derived growth factor receptor-β predicted subclinical atherosclerosis. Odds ratios for carotid atherosclerosis were 1.31 (95% CI, 1.08-1.58; P=0.006) per 1-SD increment in C-type lectin domain family 1 member B and 0.79 (0.66-0.94; P=0.008) per 1-SD increment in platelet-derived growth factor receptor-β. 
CONCLUSIONS: We demonstrate a biomarker discovery paradigm to identify candidate biomarkers of cardiovascular and other diseases.


Subject(s)
Biomarkers/blood , Carotid Artery Diseases/diagnosis , Genome-Wide Association Study , Proteome/analysis , Adult , Aged , Aged, 80 and over , Carotid Artery Diseases/genetics , Female , Genotype , Humans , Lectins, C-Type/analysis , Male , Middle Aged , Odds Ratio , Phenotype , Polymorphism, Single Nucleotide , Proteomics , Receptor, Platelet-Derived Growth Factor beta/blood
14.
Nat Commun ; 9(1): 3522, 2018 08 30.
Article in English | MEDLINE | ID: mdl-30166544

ABSTRACT

Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with electronic health record (EHR) populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study. We use the SNP weightings to compute predicted biomarker values in an EHR population and test associations with 1139 diagnoses. Here we report 116 associations meeting a Bonferroni level of significance. A false discovery rate (FDR)-based significance threshold reveals more known and undescribed associations across a broad range of biomarkers, including biometric measures, plasma proteins and metabolites, functional assays, and behaviors. We confirm an inverse association between LDL-cholesterol level and septicemia risk in an independent epidemiological cohort. This approach efficiently discovers biomarker-disease associations.
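
The contrast drawn above between a Bonferroni threshold and an FDR-based threshold can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure as the FDR method, which the abstract does not specify; the function names and p-values are hypothetical.

```python
def bonferroni(pvalues, alpha=0.05):
    """Flag tests with p <= alpha / m, controlling the family-wise error rate."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, q=0.05):
    """Flag tests rejected by the Benjamini-Hochberg step-up procedure.

    Sort p-values ascending, find the largest rank k with
    p_(k) <= k * q / m, and reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Invented p-values for five biomarker-diagnosis tests.
pvals = [0.001, 0.012, 0.02, 0.035, 0.5]
print("Bonferroni:", bonferroni(pvals))
print("BH (FDR):  ", benjamini_hochberg(pvals))
```

With these invented values the FDR procedure flags more tests than Bonferroni does, which mirrors why an FDR threshold reveals additional associations in a screen of over a thousand diagnoses.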


Subject(s)
Biomarkers/analysis , Electronic Health Records , Genome-Wide Association Study/methods , Bayes Theorem , Biomarkers/blood , Cholesterol, LDL/blood , Humans , Prospective Studies , Risk Factors
15.
J Am Med Inform Assoc ; 25(11): 1540-1546, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30124903

ABSTRACT

Electronic health record (EHR) algorithms for defining patient cohorts are commonly shared as free-text descriptions that require human intervention both to interpret and implement. We developed the Phenotype Execution and Modeling Architecture (PhEMA, http://projectphema.org) to author and execute standardized computable phenotype algorithms. With PhEMA, we converted an algorithm for benign prostatic hyperplasia, developed for the electronic Medical Records and Genomics network (eMERGE), into a standards-based computable format. Eight sites (7 within eMERGE) received the computable algorithm, and 6 successfully executed it against local data warehouses and/or i2b2 instances. Blinded random chart review of cases selected by the computable algorithm shows PPV ≥90%, and 3 out of 5 sites had >90% overlap of selected cases when comparing the computable algorithm to their original eMERGE implementation. This case study demonstrates potential use of PhEMA computable representations to automate phenotyping across different EHR systems, but also highlights some ongoing challenges.


Subject(s)
Algorithms , Electronic Health Records , Phenotype , Prostatic Hyperplasia/diagnosis , Data Warehousing , Databases, Factual , Genomics , Humans , Male , Organizational Case Studies , Prostatic Hyperplasia/genetics
16.
AMIA Jt Summits Transl Sci Proc ; 2017: 139-146, 2018.
Article in English | MEDLINE | ID: mdl-29888059

ABSTRACT

Calciphylaxis is a disorder that results in necrotic cutaneous lesions with a high rate of mortality. Due to its rarity and complexity, the risk factors for and the disease mechanism of calciphylaxis are not fully understood. This work focuses on the use of machine learning to both predict disease risk and model the contributing factors learned from an electronic health record data set. We present the results of four modeling approaches on several subpopulations of patients with chronic kidney disease (CKD). We find that modeling calciphylaxis risk with random forests learned from binary feature data produces strong models, and in the case of predicting calciphylaxis development among stage 4 CKD patients, we achieve an AUC-ROC of 0.8718. This ability to successfully predict calciphylaxis may provide an excellent opportunity for clinical translation of the predictive models presented in this paper.

17.
Circulation ; 138(17): 1839-1849, 2018 10 23.
Article in English | MEDLINE | ID: mdl-29703846

ABSTRACT

BACKGROUND: Coronary heart disease (CHD) is a leading cause of death globally. Although therapy with statins decreases circulating levels of low-density lipoprotein cholesterol and the incidence of CHD, additional events occur despite statin therapy in some individuals. The genetic determinants of this residual cardiovascular risk remain unknown. METHODS: We performed a 2-stage genome-wide association study of CHD events during statin therapy. We first identified 3099 cases who experienced CHD events (defined as acute myocardial infarction or the need for coronary revascularization) during statin therapy and 7681 controls without CHD events during comparable intensity and duration of statin therapy from 4 sites in the Electronic Medical Records and Genomics Network. We then sought replication of candidate variants in another 160 cases and 1112 controls from a fifth Electronic Medical Records and Genomics site, which joined the network after the initial genome-wide association study. Finally, we performed a phenome-wide association study for other traits linked to the most significant locus. RESULTS: The meta-analysis identified 7 single nucleotide polymorphisms at a genome-wide level of significance within the LPA/PLG locus associated with CHD events on statin treatment. The most significant association was for an intronic single nucleotide polymorphism within LPA/PLG (rs10455872; minor allele frequency, 0.069; odds ratio, 1.58; 95% confidence interval, 1.35-1.86; P=2.6×10⁻¹⁰). In the replication cohort, rs10455872 was also associated with CHD events (odds ratio, 1.71; 95% confidence interval, 1.14-2.57; P=0.009). 
The association of this single nucleotide polymorphism with CHD events was independent of statin-induced change in low-density lipoprotein cholesterol (odds ratio, 1.62; 95% confidence interval, 1.17-2.24; P=0.004) and persisted in individuals with low-density lipoprotein cholesterol ≤70 mg/dL (odds ratio, 2.43; 95% confidence interval, 1.18-4.75; P=0.015). A phenome-wide association study supported the effect of this region on coronary heart disease and did not identify noncardiovascular phenotypes. CONCLUSIONS: Genetic variations at the LPA locus are associated with CHD events during statin therapy independently of the extent of low-density lipoprotein cholesterol lowering. This finding provides support for exploring strategies targeting circulating concentrations of lipoprotein(a) to reduce CHD events in patients receiving statins.


Subject(s)
Coronary Disease/genetics , Coronary Disease/prevention & control , Dyslipidemias/drug therapy , Dyslipidemias/genetics , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Lipoprotein(a)/genetics , Polymorphism, Single Nucleotide , Case-Control Studies , Coronary Disease/blood , Coronary Disease/diagnosis , Databases, Genetic , Dyslipidemias/blood , Dyslipidemias/diagnosis , Electronic Health Records , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects , Phenotype , Risk Assessment , Risk Factors , Time Factors , Treatment Outcome
18.
Drug Saf ; 41(4): 363-376, 2018 04.
Article in English | MEDLINE | ID: mdl-29196989

ABSTRACT

INTRODUCTION: Several different types of drugs acting on the central nervous system (CNS) have previously been associated with an increased risk of suicide and suicidal ideation (broadly referred to as suicide). However, a differential association between brand and generic CNS drugs and suicide has not been reported. OBJECTIVES: This study compares suicide adverse event rates for brand versus generic CNS drugs using multiple sources of data. METHODS: Selected examples of CNS drugs (sertraline, gabapentin, zolpidem, and methylphenidate) were evaluated via the US FDA Adverse Event Reporting System (FAERS) for a hypothesis-generating study, and then via administrative claims and electronic health record (EHR) data for a more rigorous retrospective cohort study. Disproportionality analyses with reporting odds ratios and 95% confidence intervals (CIs) were used in the FAERS analyses to quantify the association between each drug and reported suicide. For the cohort studies, Cox proportional hazards models were used, controlling for demographic and clinical characteristics as well as the background risk of suicide in the insured population. RESULTS: The FAERS analyses found significantly lower suicide reporting rates for brands compared with generics for all four studied products (Breslow-Day P < 0.05). In the claims- and EHR-based cohort study, the adjusted hazard ratio (HR) was statistically significant only for sertraline (HR 0.58; 95% CI 0.38-0.88). CONCLUSION: Suicide reporting rates were disproportionately larger for generic than for brand CNS drugs in FAERS and adjusted retrospective cohort analyses remained significant only for sertraline. However, even for sertraline, temporal confounding related to the close proximity of black box warnings and generic availability is possible. Additional analyses in larger data sources with additional drugs are needed.


Subject(s)
Central Nervous System Agents/adverse effects , Drug-Related Side Effects and Adverse Reactions/etiology , Drugs, Generic/adverse effects , Adolescent , Adult , Adverse Drug Reaction Reporting Systems , Aged , Electronic Health Records , Female , Humans , Male , Middle Aged , Odds Ratio , Proportional Hazards Models , Retrospective Studies , Suicidal Ideation , Suicide , United States , United States Food and Drug Administration , Young Adult
19.
Nat Commun ; 8(1): 1167, 2017 10 27.
Article in English | MEDLINE | ID: mdl-29079728

ABSTRACT

Genome-wide, imputed, sequence, and structural data are now available for exceedingly large sample sizes. The needs for data management, handling population structure and related samples, and performing associations have largely been met. However, the infrastructure to support analyses involving complexity beyond genome-wide association studies is not standardized or centralized. We provide the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples to explore complexity using genetic interactions, environment-wide association studies and gene-environment interactions, phenome-wide association studies, as well as copy number and rare variant analyses. Using the data from the Marshfield Personalized Medicine Research Project, a site in the electronic Medical Records and Genomics Network, we apply each feature of PLATO to type 2 diabetes and demonstrate how PLATO can be used to uncover the complex etiology of common traits.


Subject(s)
Computational Biology , Genome, Human , Genome-Wide Association Study , Alcohol Drinking , Alleles , Databases, Genetic , Diabetes Mellitus, Type 2/genetics , Diet , Epistasis, Genetic , Gene Deletion , Gene Dosage , Gene-Environment Interaction , Genomics , Genotype , Glutamate Decarboxylase/genetics , Humans , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Programming Languages , Recurrence , Sequence Analysis, DNA , Software , Surveys and Questionnaires
20.
Clin Drug Investig ; 37(12): 1143-1152, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28933038

ABSTRACT

BACKGROUND: The US Food and Drug Administration Adverse Event Reporting System (FAERS), a post-marketing safety database, can be used to differentiate brand versus generic safety signals. OBJECTIVE: To explore the methods for identifying and analyzing brand versus generic adverse event (AE) reports. METHODS: Public release FAERS data from January 2004 to March 2015 were analyzed using alendronate and carbamazepine as examples. Reports were classified as brand, generic, and authorized generic (AG). Disproportionality analyses compared reporting odds ratios (RORs) of selected known labeled serious adverse events stratifying by brand, generic, and AG. The homogeneity of these RORs was compared using the Breslow-Day test. The AG versus generic was the primary focus since the AG is identical to brand but marketed as a generic, therefore minimizing generic perception bias. Sensitivity analyses explored how methodological approach influenced results. RESULTS: Based on 17,521 US event reports involving alendronate and 3733 US event reports involving carbamazepine (immediate and extended release), no consistently significant differences were observed across RORs for the AGs versus generics. Similar results were obtained when comparing reporting patterns over all time and just after generic entry. The most restrictive approach for classifying AE reports yielded smaller report counts but similar results. CONCLUSION: Differentiation of FAERS reports as brand versus generic requires careful attention to risk of product misclassification, but the relative stability of findings across varying assumptions supports the utility of these approaches for potential signal detection.
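
The disproportionality measure used above, the reporting odds ratio (ROR) with a 95% confidence interval, can be computed from a 2×2 table of report counts. The sketch below uses the usual log-scale (Woolf) interval; the function name and the counts are hypothetical, and it does not reproduce the study's classification of reports or the Breslow-Day homogeneity test.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Reporting odds ratio with a log-scale confidence interval.

    a: reports with the product of interest and the event
    b: reports with the product of interest, without the event
    c: reports with comparator products and the event
    d: reports with comparator products, without the event
    Returns (ror, lower, upper).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Invented counts for illustration only.
ror, lower, upper = reporting_odds_ratio(20, 80, 10, 90)
print(f"ROR={ror:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Computing one such ROR per stratum (brand, generic, authorized generic) gives the set of estimates whose homogeneity a test such as Breslow-Day would then compare.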


Subject(s)
Adverse Drug Reaction Reporting Systems , Drug-Related Side Effects and Adverse Reactions/epidemiology , Drugs, Generic/adverse effects , Databases, Factual , Humans , Odds Ratio , Retrospective Studies , United States , United States Food and Drug Administration