1.
Stud Health Technol Inform ; 310: 649-653, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269889

ABSTRACT

Several studies have shown that about 80% of the medical information in an electronic health record is only available through unstructured data. Resources such as medical terminologies are limited in languages other than English, which constrains NLP tools. We propose to leverage English-based resources in other languages using a combination of translation, word alignment, entity extraction and term normalization (TAXN). We implement this extraction pipeline in an open-source library called "medkit". We demonstrate the value of this approach through a specific use case: enriching a phenotypic dictionary for post-acute sequelae of COVID-19 (PASC). Using a corpus of 70 articles in French, TAXN proved effective at proposing new synonyms of UMLS terms, with 356 terms enriched with at least one validated new synonym. This study was based on freely available deep-learning models.


Subjects
Multilingualism , Humans , Language , Disease Progression , Electronic Health Records
2.
Intensive Care Med ; 49(1): 26-36, 2023 01.
Article in English | MEDLINE | ID: mdl-36446854

ABSTRACT

PURPOSE: Compliance with the Surviving Sepsis Campaign (SSC) guidelines is limited. This is known to be associated with increased mortality. The aim of this retrospective cohort study was to identify among the SSC guidelines the optimal bundle of recommendations that minimizes 28-day mortality. METHODS: We used a training cohort to identify this bundle with a least absolute shrinkage and selection operator penalized machine learning model. Patients with sepsis/septic shock admitted to the intensive care unit (ICU) were extracted from two US databases, the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database (training and internal validation cohorts) and the eICU Collaborative Research Database (eICU-CRD) (external validation cohort). In the validation cohorts, we defined a bundle group that includes patients who were treated with at least all the recommendations selected in our bundle and a no-bundle group that includes patients in whom at least one recommendation from our bundle was omitted. RESULTS: All-cause 28-day mortality was the primary outcome measure. A total of 42,735 patients were included. Six recommendations (antimicrobials, balanced crystalloid, insulin therapy, corticosteroids, vasopressin, and bicarbonate therapy) were identified from the training cohort to be included in our bundle. In the propensity score (PS)-matched internal validation cohort, the bundle group was associated with a lower mortality (OR 0.41 [0.33-0.53]; p < 0.001) compared to the no-bundle group. This was confirmed in the PS-matched external validation cohort (OR 0.75 [0.60-0.94]; p = 0.02). CONCLUSION: Our bundle of six recommendations is associated with a dramatic reduction in mortality in sepsis and septic shock. This bundle needs to be evaluated prospectively.


Subjects
Sepsis , Septic Shock , Humans , Septic Shock/therapy , Retrospective Studies , Length of Stay , Guideline Adherence , Sepsis/therapy , Intensive Care Units , Hospital Mortality
3.
Arterioscler Thromb Vasc Biol ; 42(12): 1471-1481, 2022 12.
Article in English | MEDLINE | ID: mdl-36325900

ABSTRACT

BACKGROUND: We aimed to examine the association of ultrasensitive cTnI (cardiac troponin I) with incident cardiovascular disease events (CVDs) in the primary prevention setting. METHODS: cTnI was analyzed in the baseline plasma (2008-2012) of CVD-free volunteers from the Paris Prospective Study III using a novel ultrasensitive immunoassay (Simoa Troponin-I 2.0 Kit, Quanterix, Lexington) with a limit of detection of 0.013 pg/mL. Incident CVD hospitalizations (coronary heart disease, stroke, cardiac arrhythmias, deep venous thrombosis or pulmonary embolism, heart failure, or arterial aneurysm) were validated by critical review of the hospital records. Hazard ratios were estimated per log-transformed SD increase of cTnI in Cox models using age as the time scale. RESULTS: The study population included 9503 participants (40% women) aged 59.6 (6.3) years. cTnI was detected in 99.6% of the participants (median value=0.63 pg/mL, interquartile range, 0.39-1.09). After a median follow-up of 8.34 years (interquartile range, 8.0-10.07), 516 participants suffered 612 events. In fully adjusted analysis, higher cTnI (per 1 SD increase of log cTnI) was significantly associated with CVD events combined (hazard ratio, 1.18 [1.08-1.30]). Among all single risk factors, cTnI had the highest discrimination capacity for incident CVD events (C index=0.6349). Adding log cTnI to the SCORE 2 (Systematic Coronary Risk Evaluation) risk moderately improved discriminatory capacity (C index 0.698 versus 0.685; bootstrapped C index difference: 0.0135 [95% CI, 0.0131-0.0138]) and reclassification of the participants (categorical net reclassification index, 0.0628 [95% CI, 0.023-0.102]). Findings were consistent using the US pooled cohort risk equation. CONCLUSIONS: Ultrasensitive cTnI is an independent marker of CVD events in the primary prevention setting.
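
The C index reported in this abstract measures how well a risk score ranks participants by time to event. The following is a minimal Python sketch of Harrell's concordance index for right-censored data. The function name and the toy data are hypothetical, and the sketch assumes the plain pairwise definition; it does not reproduce the study's own estimation (age as time scale, bootstrapped differences).

```python
def harrell_c_index(times, events, risk_scores):
    """Pairwise concordance for right-censored follow-up data.

    times: follow-up time per participant
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored participant cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when i had an event strictly before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four participants, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why an increase from 0.685 to 0.698 is described as a moderate improvement.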


Subjects
Cardiovascular Diseases , Troponin I , Female , Humans , Male , Biomarkers , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Prognosis , Prospective Studies , Risk Factors , Middle Aged
4.
Stud Health Technol Inform ; 290: 282-286, 2022 Jun 06.
Article in English | MEDLINE | ID: mdl-35673018

ABSTRACT

With the development of clinical databases and the ubiquity of EHRs, physicians and researchers alike have access to an unprecedented amount of data. Complexity of the available data has also increased since clinical reports are also included and require frameworks with natural language processing capabilities in order to process them and extract information not found in other types of documents. In the following work we implement a data processing pipeline performing phenotyping, disambiguation, negation and subject prediction on such reports. We compare it to an existing solution routinely used in a children's hospital with special focus on genetic diseases. We show that by replacing components based on rules and pattern matching with components leveraging deep learning models and fine-tuned word embeddings we obtain performance improvements of 7%, 10% and 27% in terms of F1 measure for each task. The solution we devised will help build more reliable decision support systems.


Subjects
Deep Learning , Child , Databases, Factual , Humans , Natural Language Processing
5.
J Am Coll Cardiol ; 79(18): 1818-1827, 2022 05 10.
Article in English | MEDLINE | ID: mdl-35512862

ABSTRACT

BACKGROUND: Although sudden cardiac death (SCD) is recognized as a high-priority public health topic, reliable estimates of the incidence of SCD or, more broadly, out-of-hospital cardiac arrest (OHCA), in the population are scarce, especially in the European Union. OBJECTIVES: The study objective was to determine the incidence of SCD and OHCA in the European Union. METHODS: The study examined 4 large (ie, >2 million inhabitants) European population-based prospective registries collecting emergency medical services (EMS)-attended (ie, with attempted resuscitation) OHCA and SCD (OHCA without obvious extracardiac causes) for >5 consecutive years from January 2012 to December 2017 in the Paris region (France), the North Holland region (the Netherlands), the Stockholm region (Sweden), and in all of Denmark. RESULTS: The average annual incidence of SCD in the 4 registries ranged from 36.8 per 100,000 (95% CI: 23.5-50.1 per 100,000) to 39.7 per 100,000 (95% CI: 32.6-46.8 per 100,000). When extrapolating to each European country and accounting for age and sex, this yields 249,538 SCD cases per year (95% CI: 155,377-343,719 SCD cases per year). The average annual incidence of OHCA in the 4 registries ranged from 47.8 per 100,000 (95% CI: 21.2-74.4 per 100,000) to 57.9 per 100,000 (95% CI: 19.6-96.3 per 100,000), corresponding to 343,496 OHCA cases per year (95% CI: 216,472-464,922 OHCA cases per year) in the European Union. Incidence rates of SCD and OHCA increased with age and were systematically higher in men compared with women. CONCLUSIONS: By combining data from 4 large, population-based registries with at least 5 years of data collection, this study provided an estimate of the incidence of SCD and OHCA in the European Union.


Subjects
Cardiopulmonary Resuscitation , Emergency Medical Services , Out-of-Hospital Cardiac Arrest , Death, Sudden, Cardiac/epidemiology , European Union , Female , Humans , Incidence , Male , Out-of-Hospital Cardiac Arrest/epidemiology , Out-of-Hospital Cardiac Arrest/therapy , Prospective Studies , Registries
7.
JMIR Med Inform ; 10(3): e35190, 2022 Mar 30.
Article in English | MEDLINE | ID: mdl-35275837

ABSTRACT

BACKGROUND: Patients hospitalized for a given condition may be receiving other treatments for other contemporary conditions or comorbidities. The use of such observational clinical data for pharmacological hypothesis generation is appealing in the context of an emerging disease but particularly challenging due to the presence of drug indication bias. OBJECTIVE: With this study, our main objective was the development and validation of a fully data-driven pipeline that would address this challenge. Our secondary objective was to generate pharmacological hypotheses in patients with COVID-19 and demonstrate the clinical relevance of the pipeline. METHODS: We developed a pharmacopeia-wide association study (PharmWAS) pipeline inspired by the PheWAS methodology, which systematically screens for associations between the whole pharmacopeia and a clinical phenotype. First, a fully data-driven procedure based on adaptive least absolute shrinkage and selection operator (LASSO) determined drug-specific adjustment sets. Second, we computed several measures of association, including robust methods based on propensity scores (PSs) to control indication bias. Finally, we applied the Benjamini-Hochberg procedure to control the false discovery rate (FDR). We applied this method in a multicenter retrospective cohort study using electronic medical records from 16 university hospitals of the Greater Paris area. We included all adult patients between 18 and 95 years old hospitalized in conventional wards for COVID-19 between February 1, 2020, and June 15, 2021. We investigated the association between drug prescription within 48 hours from admission and 28-day mortality. We validated our data-driven pipeline against a knowledge-based pipeline on 3 reference treatments, for which experts agreed on the expected association with mortality. We then demonstrated its clinical relevance by screening all drugs prescribed to more than 100 patients to generate pharmacological hypotheses. RESULTS: A total of 5783 patients were included in the analysis. The median age at admission was 69.2 (IQR 56.7-81.1) years, and 3390 (58.62%) of the patients were male. The performance of our automated pipeline was comparable or better for controlling bias than the knowledge-based adjustment set for 3 reference drugs: dexamethasone, phloroglucinol, and paracetamol. After correction for multiple testing, 4 drugs were associated with increased in-hospital mortality. Among these, diazepam and tramadol were the only ones not discarded by automated diagnostics, with adjusted odds ratios of 2.51 (95% CI 1.52-4.16, Q=.1) and 1.94 (95% CI 1.32-2.85, Q=.02), respectively. CONCLUSIONS: Our innovative approach proved useful in generating pharmacological hypotheses in an outbreak setting, without requiring a priori knowledge of the disease. Our systematic analysis of early prescribed treatments from patients hospitalized for COVID-19 showed that diazepam and tramadol are associated with increased 28-day mortality. Whether these drugs could worsen COVID-19 needs to be further assessed.
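
The final step of the pipeline described in this abstract, false discovery rate control across many drug-outcome tests, can be illustrated with a short Python sketch of the Benjamini-Hochberg adjustment. The function name, the p-values, and the 0.05 threshold are hypothetical choices for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values for five screened drugs.
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(example_p)
flagged = [qi <= 0.05 for qi in q]  # assumed FDR threshold of 5%
print(q, flagged)
```

In this toy example the third and fourth tests have raw p-values below 0.05 but adjusted values just above it, which is the kind of finding that a multiple-testing correction is designed to filter out.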

8.
JMIR Med Inform ; 9(3): e17934, 2021 Mar 16.
Article in English | MEDLINE | ID: mdl-33724196

ABSTRACT

BACKGROUND: Information related to patient medication is crucial for health care; however, up to 80% of the information resides solely in unstructured text. Manual extraction is difficult and time-consuming, and little research exists on natural language processing for extracting medical information from unstructured French text. OBJECTIVE: We aimed to develop a system to extract medication-related information from clinical text written in French. METHODS: We developed a hybrid system combining an expert rule-based system, contextual word embedding (embedding for language model) trained on clinical notes, and a deep recurrent neural network (bidirectional long short term memory-conditional random field). The task consisted of extracting drug mentions and their related information (eg, dosage, frequency, duration, route, condition). We manually annotated 320 clinical notes from a French clinical data warehouse to train and evaluate the model. We compared the performance of our approach to those of standard approaches: rule-based or machine learning only and classic word embeddings. We evaluated the models using token-level recall, precision, and F-measure. RESULTS: The overall F-measure was 89.9% (precision 90.8; recall 89.2) when combining expert rules and contextualized embeddings, compared to 88.1% (precision 89.5; recall 87.2) without expert rules or contextualized embeddings. The F-measures for each category were 95.3% for medication name, 64.4% for drug class mentions, 95.3% for dosage, 92.2% for frequency, 78.8% for duration, and 62.2% for condition of the intake. CONCLUSIONS: Associating expert rules, deep contextualized embedding, and deep neural networks improved medication information extraction. Our results revealed a synergy when associating expert knowledge and latent knowledge.
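
Token-level precision, recall, and F-measure, the evaluation metrics named in this abstract, can be sketched in a few lines of Python. The function name, the label set, and the toy sequences are hypothetical, and the sketch assumes micro-averaging over all non-outside tokens; the study's exact aggregation is not specified here, so its reported figures should not be expected to follow from this formula applied to rounded values.

```python
def token_level_scores(gold, predicted, outside="O"):
    """Micro-averaged token-level precision, recall, and F-measure."""
    pairs = list(zip(gold, predicted))
    # True positive: a non-outside prediction that matches the gold label.
    tp = sum(1 for g, p in pairs if p != outside and g == p)
    # False positive: a non-outside prediction that does not match.
    fp = sum(1 for g, p in pairs if p != outside and g != p)
    # False negative: a non-outside gold label that was not recovered.
    fn = sum(1 for g, p in pairs if g != outside and g != p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical five-token example with one missed and one spurious label.
gold_labels = ["DRUG", "DOSE", "O", "FREQ", "O"]
pred_labels = ["DRUG", "O", "O", "FREQ", "DOSE"]
precision, recall, f_measure = token_level_scores(gold_labels, pred_labels)
print(precision, recall, f_measure)
```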

10.
J Med Internet Res ; 22(8): e20773, 2020 Aug 14.
Article in English | MEDLINE | ID: mdl-32759101

ABSTRACT

BACKGROUND: A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases do not have preexisting knowledge models. In an emergent epidemic, language processing can enable rapid conversion of unstructured text to a novel knowledge model. However, although this idea has often been suggested, no opportunity has arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity. OBJECTIVE: The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP). METHODS: We explored the effects of long-term treatment by calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays using two sources of information: data available strictly from structured electronic health records (EHRs) and data available through structured EHRs and text mining. RESULTS: In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to change a negative result for an adjusted hazard ratio to a positive one. Compared to the baseline structured data, the number of patients available for inclusion in the study increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times. CONCLUSIONS: In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed sufficiently to extract useful information. When that information was used to supplement existing structured data, the sample size could be increased sufficiently to see treatment effects that were not previously statistically detectable.


Subjects
Betacoronavirus , Calcium Channel Blockers/therapeutic use , Coronavirus Infections/drug therapy , Hypertension/complications , Natural Language Processing , Pneumonia, Viral/drug therapy , COVID-19 , Coronavirus Infections/complications , Data Mining , Electronic Health Records , Humans , Pandemics , Pneumonia, Viral/complications , SARS-CoV-2 , Time Factors , COVID-19 Drug Treatment
11.
J Biomed Inform ; 102: 103356, 2020 02.
Article in English | MEDLINE | ID: mdl-31837473

ABSTRACT

OBJECTIVE: We aimed to enhance the performance of a supervised model for clinical named-entity recognition (NER) using medical terminologies. In order to evaluate our system in French, we built a corpus for 5 types of clinical entities. METHODS: We used a terminology-based system as baseline, built upon UMLS and SNOMED. Then, we evaluated a biGRU-CRF, and a hybrid system using the prediction of the terminology-based system as a feature for the biGRU-CRF. In French, we built APcNER, a corpus of 147 documents annotated for 5 entities (Drug names, Signs or symptoms, Diseases or disorders, Diagnostic procedures or lab tests and Therapeutic procedures). We evaluated each NER system using exact and partial match definitions of the F-measure for NER. The APcNER contains 4,837 entities, which took 28 h to annotate. The inter-annotator agreement as measured by Cohen's Kappa was substantial for non-exact match (Κ = 0.61) and moderate considering exact match (Κ = 0.42). In English, we evaluated the NER systems on the i2b2-2009 Medication Challenge for Drug name recognition, which contained 8,573 entities for 268 documents, and on i2b2-small, a version reduced to match the number of entities in APcNER. RESULTS: For drug name recognition on both i2b2-2009 and APcNER, the biGRU-CRF performed better than the terminology-based system, with an exact-match F-measure of 91.1% versus 73% and 81.9% versus 75% respectively. For i2b2-small and APcNER, the hybrid system outperformed the biGRU-CRF, with an exact-match F-measure of 87.8% versus 85.6% and 86.4% versus 81.9% respectively. On the APcNER corpus, the micro-average F-measure of the hybrid system on the 5 entities was 69.5% in exact match and 84.1% in non-exact match. CONCLUSION: APcNER is a French corpus for clinical NER of five types of entities which covers a large variety of document types. The extension of the supervised model with terminology has allowed an easy increase in performance, especially for rare entities, and established near state-of-the-art results on the i2b2-2009 corpus.
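
Inter-annotator agreement in this abstract is reported as Cohen's Kappa, which corrects the observed agreement between two annotators for the agreement expected by chance. A minimal Python sketch follows; the function name and the toy annotations are hypothetical, and the sketch assumes both annotators labeled the same aligned units, which may differ from how the corpus authors aligned entity spans.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of units given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of six tokens by two annotators.
annotator_1 = ["Drug", "Drug", "Disease", "O", "O", "O"]
annotator_2 = ["Drug", "O", "Disease", "O", "O", "Drug"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on 4 of 6 tokens, yet kappa is noticeably lower than the raw agreement because some of that agreement would be expected by chance.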


Subjects
Natural Language Processing , Neural Networks, Computer , Terminology as Topic , Language
12.
J Clin Epidemiol ; 108: 86-94, 2019 04.
Article in English | MEDLINE | ID: mdl-30528791

ABSTRACT

OBJECTIVES: We aimed to develop and evaluate an algorithm for automatically screening citations when updating living network meta-analysis (NMA). STUDY DESIGN AND SETTING: Our algorithm learns from the initial screening of citations conducted when creating an NMA to automatically identify eligible citations (i.e., needing full-text consideration) when updating the NMA. We evaluated our algorithm on four NMAs from different medical domains. For each NMA we constructed sets of initially screened citations and citations to screen during an update that took place 2 years after the conduct of the NMA. We encoded free text of citations (title and abstract) using word embeddings. On top of this vectorized representation, we fitted a logistic regression model to the set of initially screened citations to predict the eligibility of citations screened during an update. RESULTS: Our algorithm achieved 100% sensitivity on two NMAs (100% [95% confidence interval 93-100] and 100% [40-100] sensitivity), and 94% (81-99) and 97% (86-100) on the remaining two. Across all NMAs, our algorithm would have spared the manual screening of 1,345 of 2,530 citations, decreasing the workload by 53% (51-55), while missing 3 of 124 eligible citations (2% [1-7]), none of which were finally included in the NMAs after full-text consideration. CONCLUSION: For updating an NMA after 2 years, our algorithm considerably diminished the workload required for screening, and the number of missed eligible citations remained low.
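
The two headline percentages in this abstract follow directly from the reported counts. The short Python sketch below recomputes them; the function and key names are hypothetical, and confidence intervals are left out because the interval method used by the authors is not stated here.

```python
def screening_summary(n_citations, n_spared, n_eligible, n_missed):
    """Summarize the effect of automated screening on an update."""
    return {
        # Share of citations that no longer need manual screening.
        "workload_reduction": n_spared / n_citations,
        # Share of eligible citations the algorithm failed to flag.
        "missed_fraction": n_missed / n_eligible,
    }


# Counts as reported in the abstract, pooled over the four NMAs.
summary = screening_summary(
    n_citations=2530, n_spared=1345, n_eligible=124, n_missed=3
)
print(f"workload reduction: {summary['workload_reduction']:.0%}")
print(f"missed eligible citations: {summary['missed_fraction']:.0%}")
```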


Subjects
Algorithms , Information Storage and Retrieval/methods , Network Meta-Analysis , Confidence Intervals , Evidence-Based Medicine/methods , Humans , Randomized Controlled Trials as Topic , Support Vector Machine , Workload