An NLP tool for data extraction from electronic health records: COVID-19 mortalities and comorbidities.
BuHamra, Sana S; Almutairi, Abdullah N; Buhamrah, Abdullah K; Almadani, Sabah H; Alibrahim, Yusuf A.
  • BuHamra SS; Department of Information Science, Kuwait University, Kuwait City, Kuwait.
  • Almutairi AN; Department of Information Science, Kuwait University, Kuwait City, Kuwait.
  • Buhamrah AK; Surgery Department, Al-Adan Hospital, Al Ahmadi, Kuwait.
  • Almadani SH; Department of Information Science, Kuwait University, Kuwait City, Kuwait.
  • Alibrahim YA; Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
Front Public Health ; 10: 1070870, 2022.
Article in English | MEDLINE | ID: covidwho-2199554
ABSTRACT

Background:

The high infection rate, severe symptoms, and evolving nature of the COVID-19 pandemic pose challenges for medical systems around the world. Natural Language Processing (NLP) is the primary approach for automatic information retrieval from unstructured text. This study addresses COVID-19 mortality data from the intensive care unit (ICU) in Kuwait during the first 18 months of the pandemic. A key goal is to extract and classify the primary and intermediate causes of death from electronic health records (EHRs) in a timely way. In addition, comorbid conditions or concurrent diseases were retrieved and analyzed in relation to the various causes of mortality.

Method:

An NLP system was built in the Python programming language to automate the extraction of primary and secondary causes of death, as well as comorbidities. The system can handle inaccurate and messy data, including inadequate formats, spelling mistakes, and mispositioned information. A decision tree machine learning method is used to classify the causes of death.
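
The abstract does not describe how the system tolerates spelling mistakes or mispositioned information, so the following is only a minimal sketch of one possible approach, not the authors' implementation. It assumes a small hypothetical vocabulary (`CAUSE_TERMS`), a hypothetical abbreviation table (`ABBREVIATIONS`), and hypothetical record field names; it uses fuzzy string matching from Python's standard `difflib` module and scans every field of a record rather than trusting field positions. The decision tree classification step is not shown.

```python
import difflib
import re

# Hypothetical controlled vocabulary; the study's actual term list is not given.
CAUSE_TERMS = [
    "septic shock",
    "multiorgan failure",
    "acute respiratory distress syndrome",
]

# Hypothetical abbreviation table, expanded before fuzzy matching.
ABBREVIATIONS = {"ards": "acute respiratory distress syndrome"}


def normalize_term(raw, vocabulary=CAUSE_TERMS, cutoff=0.8):
    """Map a possibly misspelled phrase to a vocabulary term, or return None."""
    # Lowercase, replace anything that is not a letter or whitespace, collapse spaces.
    cleaned = re.sub(r"[^a-z\s]", " ", raw.lower())
    cleaned = " ".join(cleaned.split())
    if not cleaned:
        return None
    cleaned = ABBREVIATIONS.get(cleaned, cleaned)
    # Keep the single closest vocabulary term whose similarity is at least `cutoff`.
    matches = difflib.get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def extract_causes(record):
    """Scan every field of a record, since information may be mispositioned."""
    found = []
    for value in record.values():
        # Split free text on common separators before matching each chunk.
        for chunk in re.split(r"[;,\n]", value):
            term = normalize_term(chunk)
            if term is not None and term not in found:
                found.append(term)
    return found


# Example record with an empty cause field and a misspelled entry in the notes.
record = {"primary_cause": "", "notes": "Septic shok; ARDS", "comorbidities": "diabetes"}
causes = extract_causes(record)
```

Here the misspelled "Septic shok" and the abbreviation "ARDS" are both recovered from the notes field, while "diabetes" is ignored because it is not close to any term in this illustrative vocabulary. The `cutoff` value of 0.8 is an arbitrary choice for the sketch and would need tuning against real records.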

Results:

Septic shock or sepsis-related multiorgan failure was the leading cause of mortality, accounting for 54.8% of the 1691 ICU patients studied. About three-quarters of patients died from acute respiratory distress syndrome (ARDS), a common intermediate cause of death. An arrhythmia disorder (AF) was found to be the strongest predictor of the intermediate cause of death, whether ARDS or another cause.

Conclusion:

We created an NLP system to automate the extraction of causes of death and comorbidities from EHRs. Our method processes messy and erroneous data and classifies the primary and intermediate causes of death of COVID-19 patients. We recommend structuring the EHR with well-defined sections and menu-driven options to reduce incorrectly completed forms.

Full text: Available Collection: International databases Database: MEDLINE Main subject: Respiratory Distress Syndrome / COVID-19 Type of study: Observational study / Prognostic study / Reviews Limits: Humans Language: English Journal: Front Public Health Year: 2022 Document Type: Article Affiliation country: Fpubh.2022.1070870
