Mining of EHR for interface terminology concepts for annotating EHRs of COVID patients.
Keloth, Vipina K; Zhou, Shuxin; Lindemann, Luke; Zheng, Ling; Elhanan, Gai; Einstein, Andrew J; Geller, James; Perl, Yehoshua.
  • Keloth VK; School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA. vk396@njit.edu.
  • Zhou S; Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.
  • Lindemann L; School of Medicine and Health Sciences, The George Washington University, Washington (D.C.), USA.
  • Zheng L; Computer Science and Software Engineering Department, Monmouth University, West Long Branch, NJ, USA.
  • Elhanan G; Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, USA.
  • Einstein AJ; Cardiology Division, Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA.
  • Geller J; Department of Radiology, Columbia University Irving Medical Center, New York, NY, USA.
  • Perl Y; Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.
BMC Med Inform Decis Mak ; 23(Suppl 1): 40, 2023 02 24.
Article in English | MEDLINE | ID: covidwho-2265954
ABSTRACT

BACKGROUND:

Two years into the COVID-19 pandemic and with more than five million deaths worldwide, the healthcare establishment continues to struggle with every new wave of the pandemic resulting from a new coronavirus variant. Research has demonstrated that symptoms, and even the order in which they present, vary among COVID-19 patients infected by different SARS-CoV-2 variants (e.g., Alpha and Omicron). Textual data in the form of admission notes and physician notes in Electronic Health Records (EHRs) is rich in information about symptoms and their order of presentation. Unstructured EHR data is often underutilized in research because it lacks the annotations that would enable automatic extraction of useful information from the extensive volumes of available text.

METHODS:

We present the design of a COVID Interface Terminology (CIT), which is not a generic COVID-19 terminology but one serving the specific purpose of enabling automatic annotation of EHRs of COVID-19 patients. CIT was constructed by integrating existing COVID-related ontologies and mining additional fine-granularity concepts from clinical notes. The iterative mining approach used the techniques of 'anchoring' and 'concatenation' to identify potential fine-granularity concepts to be added to the CIT. We also tested the generalizability of our approach on a hold-out dataset and compared its annotation coverage to the coverage obtained for the dataset used to build the CIT.
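The abstract does not define 'concatenation', so the following Python sketch is only one plausible reading, not the authors' implementation: it assumes that a note is first annotated by greedy longest-match lookup against a terminology, and that two annotated phrases sitting directly next to each other are joined into a longer candidate concept for review. The function names, the `max_len` parameter, and the toy terminology are all hypothetical.

```python
def annotate(tokens, terminology, max_len=4):
    """Greedy longest-match annotation (an assumed matching strategy).

    Returns (start, end) token spans whose text is a terminology entry.
    """
    spans = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]) in terminology:
                spans.append((i, i + n))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return spans


def concatenation_candidates(tokens, spans):
    """Join each pair of directly adjacent annotated spans into one phrase.

    Assumption: 'concatenation' merges neighbouring annotations into a
    finer-granularity candidate concept.
    """
    return [
        " ".join(tokens[a[0]:b[1]])
        for a, b in zip(spans, spans[1:])
        if a[1] == b[0]
    ]


# Toy data, invented for illustration only.
tokens = "patient reports severe chest pain and dry cough".split()
terminology = {"severe", "chest pain", "cough"}
spans = annotate(tokens, terminology)
candidates = concatenation_candidates(tokens, spans)
```

Here "severe" and "chest pain" are adjacent annotations, so the sketch proposes "severe chest pain" as a candidate, while "cough" stands alone and yields nothing.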

RESULTS:

Our experiments demonstrate that this approach results in higher annotation coverage compared to existing ontologies such as SNOMED CT and Coronavirus Infectious Disease Ontology (CIDO). The final version of CIT achieved about 20% more coverage than SNOMED CT and 50% more coverage than CIDO. In the future, the concepts mined and added into CIT could be used as training data for machine learning models for mining even more concepts into CIT and further increasing the annotation coverage.
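To make the coverage comparison concrete, here is a minimal Python sketch under a stated assumption: annotation coverage is taken to be the fraction of a note's tokens that fall inside a matched terminology phrase. The paper may define coverage differently (for example, by excluding stop words), and the function name, the greedy matching strategy, and both toy terminologies are hypothetical.

```python
def annotation_coverage(tokens, terminology, max_len=4):
    """Fraction of tokens covered by greedy longest-match annotation.

    This token-level definition is an assumption made for illustration.
    """
    if not tokens:
        return 0.0
    covered = 0
    i = 0
    while i < len(tokens):
        # Try the longest window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]) in terminology:
                covered += n
                i += n
                break
        else:
            i += 1  # token not covered by any entry
    return covered / len(tokens)


# Toy data, invented for illustration only.
tokens = "patient reports severe chest pain and dry cough".split()
reference = {"chest pain", "cough"}              # stands in for a reference ontology
interface = {"severe chest pain", "dry cough"}   # stands in for a finer-grained terminology

reference_cov = annotation_coverage(tokens, reference)
interface_cov = annotation_coverage(tokens, interface)
```

On this toy note the finer-grained entries cover 5 of 8 tokens against 3 of 8 for the reference set, which illustrates how adding fine-granularity concepts can raise coverage; the numbers say nothing about the results reported above.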

CONCLUSION:

In this paper, we demonstrated the construction of a COVID interface terminology that can be used to automatically annotate EHRs of COVID-19 patients. The techniques presented can identify frequently documented fine-granularity concepts that are missing from other ontologies, thereby increasing the annotation coverage.

Full text: Available Collection: International databases Database: MEDLINE Main subject: Electronic Health Records / COVID-19 Type of study: Prognostic study Topics: Variants Limits: Humans Language: English Journal: BMC Med Inform Decis Mak Journal subject: Medical Informatics Year: 2023 Document type: Article Affiliation country: S12911-023-02136-0
