Search | BVS Regional Portal

Development and Validation of an Electronic Medical Record Algorithm to Identify Phenotypes of Rotator Cuff Tear.

Gao, Chan; Fan, Run; Ayers, Gregory D; Giri, Ayush; Harris, Kindred; Atreya, Ravi; Teixeira, Pedro L; Jain, Nitin B.

PM R ; 12(11): 1099-1105, 2020 Nov.

Article in English | MEDLINE | ID: mdl-32198840

ABSTRACT

BACKGROUND: A lack of studies with large sample sizes of patients with rotator cuff tears is a barrier to performing clinical and genomic research. OBJECTIVE: To develop and validate an electronic medical record (EMR)-based algorithm to identify individuals with and without rotator cuff tear. DESIGN: We used a deidentified version of the EMR of more than 2 million subjects. A screening algorithm was applied to classify subjects into likely rotator cuff tear and likely normal rotator cuff groups. From these subjects, 500 likely rotator cuff tear and 500 likely normal rotator cuff were randomly chosen for algorithm development. Chart review of all 1000 subjects confirmed the true phenotype of rotator cuff tear or normal rotator cuff based on magnetic resonance imaging and operative report. An algorithm was then developed based on logistic regression and validation of the algorithm was performed. RESULTS: The variables significantly predicting rotator cuff tear included the number of times a Current Procedural Terminology code related to rotator cuff procedures was used (odds ratio [OR] = 3.3; 95% confidence interval [CI]: 1.6-6.8 for ≥3 vs 0), the number of times a term related to rotator cuff lesions occurred in radiology reports (OR = 2.2; 95% CI: 1.2-4.1 for ≥1 vs 0), and the number of times a term related to rotator cuff lesions occurred in physician notes (OR = 4.5; 95% CI: 2.2-9.1 for 1 or 2 times vs 0). This phenotyping algorithm had a specificity of 0.89 (95% CI: 0.79-0.95) for rotator cuff tear, area under the curve (AUC) of 0.842, and diagnostic likelihood ratios (DLRs), DLR+ and DLR- of 5.94 (95% CI: 3.07-11.48) and 0.363 (95% CI: 0.291-0.453). CONCLUSION: Our informatics algorithm enables identification of cohorts of individuals with and without rotator cuff tear from an EMR-based data set with moderate accuracy.
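The diagnostic likelihood ratios quoted above follow from sensitivity and specificity by the standard definitions DLR+ = sensitivity / (1 - specificity) and DLR- = (1 - sensitivity) / specificity. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and because the abstract does not report sensitivity, the sensitivity used in the usage line is an assumed input chosen only for illustration.

```python
def diagnostic_likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (DLR+, DLR-) from sensitivity and specificity.

    DLR+ = sensitivity / (1 - specificity)
    DLR- = (1 - sensitivity) / specificity
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    dlr_pos = sensitivity / (1.0 - specificity)
    dlr_neg = (1.0 - sensitivity) / specificity
    return dlr_pos, dlr_neg


# Specificity 0.89 is taken from the abstract; sensitivity 0.65 is an
# assumption for illustration, not a value reported by the authors.
dlr_pos, dlr_neg = diagnostic_likelihood_ratios(sensitivity=0.65, specificity=0.89)
print(f"DLR+ = {dlr_pos:.2f}, DLR- = {dlr_neg:.3f}")
```

A DLR+ well above 1 means a positive classification raises the odds of a true tear, while a DLR- below 1 means a negative classification lowers them.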

Subjects

Rotator Cuff Injuries; Rotator Cuff; Algorithms; Electronic Health Records; Female; Humans; Magnetic Resonance Imaging; Male; Phenotype; Rotator Cuff Injuries/diagnosis

Genetic Interactions with Age, Sex, Body Mass Index, and Hypertension in Relation to Atrial Fibrillation: The AFGen Consortium.

Weng, Lu-Chen; Lunetta, Kathryn L; Müller-Nurasyid, Martina; Smith, Albert Vernon; Thériault, Sébastien; Weeke, Peter E; Barnard, John; Bis, Joshua C; Lyytikäinen, Leo-Pekka; Kleber, Marcus E; Martinsson, Andreas; Lin, Henry J; Rienstra, Michiel; Trompet, Stella; Krijthe, Bouwe P; Dörr, Marcus; Klarin, Derek; Chasman, Daniel I; Sinner, Moritz F; Waldenberger, Melanie; Launer, Lenore J; Harris, Tamara B; Soliman, Elsayed Z; Alonso, Alvaro; Paré, Guillaume; Teixeira, Pedro L; Denny, Joshua C; Shoemaker, M Benjamin; Van Wagoner, David R; Smith, Jonathan D; Psaty, Bruce M; Sotoodehnia, Nona; Taylor, Kent D; Kähönen, Mika; Nikus, Kjell; Delgado, Graciela E; Melander, Olle; Engström, Gunnar; Yao, Jie; Guo, Xiuqing; Christophersen, Ingrid E; Ellinor, Patrick T; Geelhoed, Bastiaan; Verweij, Niek; Macfarlane, Peter; Ford, Ian; Heeringa, Jan; Franco, Oscar H; Uitterlinden, André G; Völker, Uwe.

Sci Rep ; 7(1): 11303, 2017 Sep 12.

Article in English | MEDLINE | ID: mdl-28900195

ABSTRACT

It is unclear whether genetic markers interact with risk factors to influence atrial fibrillation (AF) risk. We performed genome-wide interaction analyses between genetic variants and age, sex, hypertension, and body mass index in the AFGen Consortium. Study-specific results were combined using meta-analysis (88,383 individuals of European descent, including 7,292 with AF). Variants with nominal interaction associations in the discovery analysis were tested for association in four independent studies (131,441 individuals, including 5,722 with AF). In the discovery analysis, the AF risk associated with the minor rs6817105 allele (at the PITX2 locus) was greater among subjects ≤ 65 years of age than among those > 65 years (interaction p-value = 4.0 × 10⁻⁵). The interaction p-value exceeded genome-wide significance in combined discovery and replication analyses (interaction p-value = 1.7 × 10⁻⁸). We observed one genome-wide significant interaction with body mass index and several suggestive interactions with age, sex, and body mass index in the discovery analysis. However, none was replicated in the independent sample. Our findings suggest that the pathogenesis of AF may differ according to age in individuals of European descent, but we did not observe evidence of statistically significant genetic interactions with sex, body mass index, or hypertension on AF risk.

Subjects

Atrial Fibrillation/genetics; Body Mass Index; Epistasis, Genetic; Genetic Predisposition to Disease; Hypertension/genetics; Sex Characteristics; Age Factors; Aged; Chromosomes, Human, Pair 4/genetics; Female; Genetic Loci; Genome-Wide Association Study; Humans; Male; Middle Aged; Odds Ratio; Polymorphism, Single Nucleotide/genetics; Reproducibility of Results; Risk Factors

Antibacterial photosensitization through activation of coproporphyrinogen oxidase.

Surdel, Matthew C; Horvath, Dennis J; Lojek, Lisa J; Fullen, Audra R; Simpson, Jocelyn; Dutter, Brendan F; Salleng, Kenneth J; Ford, Jeremy B; Jenkins, J Logan; Nagarajan, Raju; Teixeira, Pedro L; Albertolle, Matthew; Georgiev, Ivelin S; Jansen, E Duco; Sulikowski, Gary A; Lacy, D Borden; Dailey, Harry A; Skaar, Eric P.

Proc Natl Acad Sci U S A ; 114(32): E6652-E6659, 2017 Aug 08.

Article in English | MEDLINE | ID: mdl-28739897

ABSTRACT

Gram-positive bacteria cause the majority of skin and soft tissue infections (SSTIs), resulting in the most common reason for clinic visits in the United States. Recently, it was discovered that Gram-positive pathogens use a unique heme biosynthesis pathway, which implicates this pathway as a target for development of antibacterial therapies. We report here the identification of a small-molecule activator of coproporphyrinogen oxidase (CgoX) from Gram-positive bacteria, an enzyme essential for heme biosynthesis. Activation of CgoX induces accumulation of coproporphyrin III and leads to photosensitization of Gram-positive pathogens. In combination with light, CgoX activation reduces bacterial burden in murine models of SSTI. Thus, small-molecule activation of CgoX represents an effective strategy for the development of light-based antimicrobial therapies.

Subjects

Bacterial Proteins/metabolism; Coproporphyrinogen Oxidase/metabolism; Coproporphyrins/biosynthesis; Photosensitizing Agents/metabolism; Phototherapy; Staphylococcal Skin Infections/enzymology; Staphylococcal Skin Infections/therapy; Staphylococcus aureus/metabolism; Animals; Bacterial Proteins/genetics; Coproporphyrinogen Oxidase/genetics; Coproporphyrins/genetics; Disease Models, Animal; Mice; Staphylococcus aureus/genetics

Membrane protein contact and structure prediction using co-evolution in conjunction with machine learning.

Teixeira, Pedro L; Mendenhall, Jeff L; Heinze, Sten; Weiner, Brian; Skwark, Marcin J; Meiler, Jens.

PLoS One ; 12(5): e0177866, 2017.

Article in English | MEDLINE | ID: mdl-28542325

ABSTRACT

De novo membrane protein structure prediction is limited to small proteins because the conformational search space expands quickly with length. Long-range contacts (24+ amino acid separation), that is, residue positions distant in sequence but in close proximity in the structure, are arguably the most effective way to restrict this conformational space. Inverse methods for co-evolutionary analysis predict a global set of position-pair couplings that best explain the observed amino acid co-occurrences, thus distinguishing between evolutionarily explained co-variances and those arising from spurious transitive effects. Here, we show that applying machine learning approaches and custom descriptors improves evolutionary contact prediction accuracy, resulting in improvement of average precision by 6 percentage points for the top 1L non-local contacts. Further, we demonstrate that predicted contacts improve protein folding with BCL::Fold. The mean RMSD100 metric for the top 10 models folded was reduced by an average of 2 Å for a benchmark of 25 membrane proteins.
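Precision over the top L predicted contacts (L being the sequence length) is a common way to score contact predictors. The sketch below illustrates that metric in plain Python; it is not the authors' evaluation code. The function name, the input layout, and the use of the 24-residue long-range threshold as the filter are assumptions made for the example, since the abstract does not define "non-local".

```python
def top_l_precision(scored_pairs, true_contacts, seq_len, min_sep=24):
    """Fraction of the top-`seq_len` predicted pairs that are true contacts.

    scored_pairs  : iterable of (i, j, score) residue-index pairs
    true_contacts : set of (i, j) tuples with i < j
    seq_len       : sequence length L (so the top 1L pairs are kept)
    min_sep       : minimum sequence separation; 24 is an assumed filter
    """
    # Keep only pairs that are far apart in sequence.
    candidates = [(i, j, s) for i, j, s in scored_pairs if abs(i - j) >= min_sep]
    # Rank by predicted score, highest first, and keep the top L.
    candidates.sort(key=lambda t: t[2], reverse=True)
    top = candidates[:seq_len]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(top)


# Toy data, invented for illustration only.
pairs = [(1, 30, 0.9), (2, 40, 0.8), (5, 10, 0.95), (3, 50, 0.1)]
truth = {(1, 30), (3, 50)}
print(top_l_precision(pairs, truth, seq_len=2))
```

In the toy data the pair (5, 10) has the highest score but is dropped by the separation filter, so only the two best long-range pairs are evaluated.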

Subjects

Machine Learning; Membrane Proteins/metabolism; Models, Molecular; Protein Folding; Protein Structure, Secondary/physiology; Algorithms; Amino Acid Sequence; Humans

Evaluating electronic health record data sources and algorithmic approaches to identify hypertensive individuals.

Teixeira, Pedro L; Wei, Wei-Qi; Cronin, Robert M; Mo, Huan; VanHouten, Jacob P; Carroll, Robert J; LaRose, Eric; Bastarache, Lisa A; Rosenbloom, S Trent; Edwards, Todd L; Roden, Dan M; Lasko, Thomas A; Dart, Richard A; Nikolai, Anne M; Peissig, Peggy L; Denny, Joshua C.

J Am Med Inform Assoc ; 24(1): 162-171, 2017 Jan.

Article in English | MEDLINE | ID: mdl-27497800

ABSTRACT

OBJECTIVE: Phenotyping algorithms applied to electronic health record (EHR) data enable investigators to identify large cohorts for clinical and genomic research. Algorithm development is often iterative, depends on fallible investigator intuition, and is time- and labor-intensive. We developed and evaluated 4 types of phenotyping algorithms and categories of EHR information to identify hypertensive individuals and controls and provide a portable module for implementation at other sites. MATERIALS AND METHODS: We reviewed the EHRs of 631 individuals followed at Vanderbilt for hypertension status. We developed features and phenotyping algorithms of increasing complexity. Input categories included International Classification of Diseases, Ninth Revision (ICD9) codes, medications, vital signs, narrative-text search results, and Unified Medical Language System (UMLS) concepts extracted using natural language processing (NLP). We developed a module and tested portability by replicating 10 of the best-performing algorithms at the Marshfield Clinic. RESULTS: Random forests using billing codes, medications, vitals, and concepts had the best performance with a median area under the receiver operator characteristic curve (AUC) of 0.976. Normalized sums of all 4 categories also performed well (0.959 AUC). The best non-NLP algorithm combined normalized ICD9 codes, medications, and blood pressure readings with a median AUC of 0.948. Blood pressure cutoffs or ICD9 code counts alone had AUCs of 0.854 and 0.908, respectively. Marshfield Clinic results were similar. CONCLUSION: This work shows that billing codes or blood pressure readings alone yield good hypertension classification performance. However, even simple combinations of input categories improve performance. The most complex algorithms classified hypertension with excellent recall and precision.
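The AUC values reported above can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control. A minimal Python sketch of that pairwise definition follows, with ties counted as one half. The function name and the toy scores are hypothetical; the study's own evaluation code is not described in the abstract.

```python
def auc_pairwise(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : predicted scores, higher meaning more likely a case
    labels : 1 for cases, 0 for controls
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy scores, invented for illustration only.
print(auc_pairwise([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The double loop is quadratic in the number of subjects, which is acceptable for a reviewed set of a few hundred charts; a rank-based formulation would be preferable for much larger cohorts.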

Subjects

Algorithms; Electronic Health Records; Hypertension/diagnosis; Machine Learning; Aged; Blood Pressure Determination; Clinical Coding; Female; Humans; Information Storage and Retrieval/methods; Male; Middle Aged; Natural Language Processing; Phenotype; ROC Curve

Genetic Risk Prediction of Atrial Fibrillation.

Lubitz, Steven A; Yin, Xiaoyan; Lin, Henry J; Kolek, Matthew; Smith, J Gustav; Trompet, Stella; Rienstra, Michiel; Rost, Natalia S; Teixeira, Pedro L; Almgren, Peter; Anderson, Christopher D; Chen, Lin Y; Engström, Gunnar; Ford, Ian; Furie, Karen L; Guo, Xiuqing; Larson, Martin G; Lunetta, Kathryn L; Macfarlane, Peter W; Psaty, Bruce M; Soliman, Elsayed Z; Sotoodehnia, Nona; Stott, David J; Taylor, Kent D; Weng, Lu-Chen; Yao, Jie; Geelhoed, Bastiaan; Verweij, Niek; Siland, Joylene E; Kathiresan, Sekar; Roselli, Carolina; Roden, Dan M; van der Harst, Pim; Darbar, Dawood; Jukema, J Wouter; Melander, Olle; Rosand, Jonathan; Rotter, Jerome I; Heckbert, Susan R; Ellinor, Patrick T; Alonso, Alvaro; Benjamin, Emelia J.

Circulation ; 135(14): 1311-1320, 2017 Apr 04.

Article in English | MEDLINE | ID: mdl-27793994

ABSTRACT

BACKGROUND: Atrial fibrillation (AF) has a substantial genetic basis. Identification of individuals at greatest AF risk could minimize the incidence of cardioembolic stroke. METHODS: To determine whether genetic data can stratify risk for development of AF, we examined associations between AF genetic risk scores and incident AF in 5 prospective studies comprising 18 919 individuals of European ancestry. We examined associations between AF genetic risk scores and ischemic stroke in a separate study of 509 ischemic stroke cases (202 cardioembolic [40%]) and 3028 referents. Scores were based on 11 to 719 common variants (≥5%) associated with AF at P values ranging from <1×10⁻³ to <1×10⁻⁸ in a prior independent genetic association study. RESULTS: Incident AF occurred in 1032 individuals (5.5%). AF genetic risk scores were associated with new-onset AF after adjustment for clinical risk factors. The pooled hazard ratio for incident AF for the highest versus lowest quartile of genetic risk scores ranged from 1.28 (719 variants; 95% confidence interval, 1.13-1.46; P=1.5×10⁻⁴) to 1.67 (25 variants; 95% confidence interval, 1.47-1.90; P=9.3×10⁻¹⁵). Discrimination of combined clinical and genetic risk scores varied across studies and scores (maximum C statistic, 0.629-0.811; maximum ΔC statistic from clinical score alone, 0.009-0.017). AF genetic risk was associated with stroke in age- and sex-adjusted models. For example, individuals in the highest versus lowest quartile of a 127-variant score had a 2.49-fold increased odds of cardioembolic stroke (95% confidence interval, 1.39-4.58; P=2.7×10⁻³). The effect persisted after the exclusion of individuals (n=70) with known AF (odds ratio, 2.25; 95% confidence interval, 1.20-4.40; P=0.01). CONCLUSIONS: Comprehensive AF genetic risk scores were associated with incident AF beyond associations for clinical AF risk factors but offered small improvements in discrimination. AF genetic risk was also associated with cardioembolic stroke in age- and sex-adjusted analyses. Efforts are warranted to determine whether AF genetic risk may improve identification of subclinical AF or help distinguish between stroke mechanisms.

Subjects

Atrial Fibrillation/genetics; Aged; Female; Humans; Incidence; Male; Middle Aged; Risk Factors

Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance.

Wei, Wei-Qi; Teixeira, Pedro L; Mo, Huan; Cronin, Robert M; Warner, Jeremy L; Denny, Joshua C.

J Am Med Inform Assoc ; 23(e1): e20-7, 2016 Apr.

Article in English | MEDLINE | ID: mdl-26338219

ABSTRACT

OBJECTIVE: To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Disease (ICD) diagnosis codes, primary notes, and specific medications. MATERIALS AND METHODS: We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer's disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson's disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (a total of 175 patients for each disease, 1750 patients for all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination. RESULTS: The PPVs of single components were inconsistent and inadequate for accurate phenotyping (0.06-0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed a more stable and higher accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall performance of using ICD codes (0.67 ± 0.14) was only slightly lower than using two or more components, its PPV (0.71 ± 0.13) was substantially worse than that of two or more components (0.91 ± 0.08). CONCLUSION: Multiple EHR components provide a more consistent and higher performance than a single one for the selected phenotypes. We suggest considering multiple EHR components for future phenotyping design in order to obtain an ideal result.
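The F-score referred to above is conventionally the harmonic mean of PPV (precision) and sensitivity (recall). A short Python sketch of that formula follows, assuming the balanced F1 variant, which the abstract does not state explicitly. The function name is hypothetical. Note that applying the formula to the reported mean PPV and mean sensitivity need not reproduce the reported mean F-score, because an average of per-disease F-scores generally differs from the F-score of the averages.

```python
def f_score(ppv: float, sensitivity: float) -> float:
    """Balanced F-score: harmonic mean of PPV and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Mean PPV and mean sensitivity for "two or more components" from the abstract.
print(round(f_score(0.91, 0.59), 3))
```

The harmonic mean is pulled toward the smaller of the two inputs, so a high PPV cannot compensate for a low sensitivity.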

Subjects

Algorithms; Electronic Health Records; International Classification of Diseases; Phenotype; Diagnosis; Humans; Medical Records, Problem-Oriented; Predictive Value of Tests

Benchmarking ligand-based virtual High-Throughput Screening with the PubChem database.

Butkiewicz, Mariusz; Lowe, Edward W; Mueller, Ralf; Mendenhall, Jeffrey L; Teixeira, Pedro L; Weaver, C David; Meiler, Jens.

Molecules ; 18(1): 735-56, 2013 Jan 08.

Article in English | MEDLINE | ID: mdl-23299552

ABSTRACT

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is public domain through PubChem and carefully collated through confirmation screens validating active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 for a TPR cutoff of 25% are observed.

Subjects

Databases, Chemical/standards; High-Throughput Screening Assays/standards; Quantitative Structure-Activity Relationship; Algorithms; Animals; Area Under Curve; Computer Simulation; Decision Trees; Drug Discovery/standards; Humans; Inhibitory Concentration 50; Ligands; Models, Chemical; Neural Networks, Computer; Quality Improvement; ROC Curve; Support Vector Machine
