Search | Global Index Medicus

Improving classification of low-resource COVID-19 literature by using Named Entity Recognition

Oscar LITHGOW-SERRANO; Joseph CORNELIUS; Vani KANJIRANGAT; Carlos-Francisco MÉNDEZ-CRUZ; Fabio RINALDI.

Genomics & Informatics ; : e22-2021.

Article in English | WPRIM | ID: wpr-914346

ABSTRACT

Automatic document classification for highly interrelated classes is a demanding task that becomes more challenging when there is little labeled data for training. Such is the case of the coronavirus disease 2019 (COVID-19) Clinical repository (a repository of classified and translated academic articles related to COVID-19 and relevant to clinical practice), where a 3-way classification scheme is being applied to COVID-19 literature. During the 7th Biomedical Linked Annotation Hackathon (BLAH7), we performed experiments to explore the use of named entity recognition (NER) to improve the classification. We processed the literature with OntoGene’s Biomedical Entity Recogniser (OGER) and used the identified named entities (NEs) and their links to major biological databases as extra input features for the classifier. We compared the results with a baseline model without the OGER-extracted features. In these proof-of-concept experiments, we observed a clear gain in COVID-19 literature classification. In particular, the origin of an NE was useful for classifying document types, and the type of an NE for classifying clinical specialties. Because of the limitations of the small dataset, we can only conclude that our results suggest that NER would benefit this classification task. Accurately estimating this benefit would require further experiments with a larger dataset.
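The idea of adding NE information as extra classifier input can be illustrated with a minimal sketch. It is not the authors' implementation: the entity dictionaries, their keys ("type", "origin"), the feature naming scheme, and the toy document are all assumptions made for illustration, and OGER's actual output format is not reproduced here.

```python
from collections import Counter


def build_features(tokens, entities):
    """Combine bag-of-words counts with NE type and NE origin counts.

    `entities` is a list of dicts with the hypothetical keys "type"
    (e.g. chemical, disease) and "origin" (the source database).
    """
    features = Counter()
    # Baseline features: lowercased word counts.
    for tok in tokens:
        features["word=" + tok.lower()] += 1
    # Extra features: one count per entity type and per entity origin.
    for ent in entities:
        features["ne_type=" + ent["type"]] += 1
        features["ne_origin=" + ent["origin"]] += 1
    return dict(features)


# Toy document and toy annotations, invented for illustration only.
doc_tokens = ["Patients", "received", "remdesivir"]
doc_entities = [{"text": "remdesivir", "type": "chemical", "origin": "CHEBI"}]
features = build_features(doc_tokens, doc_entities)
```

The resulting dictionary could be passed to any classifier that accepts sparse feature maps; dropping the two `ne_` feature groups would give the kind of baseline the abstract compares against.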

Introduction to BLAH5 special issue: recent progress on interoperability of biomedical text mining

Jin-Dong KIM; Kevin-Bretonnel COHEN; Nigel COLLIER; Zhiyong LU; Fabio RINALDI.

Genomics & Informatics ; : e12-2019.

Article in English | WPRIM | ID: wpr-763812

ABSTRACT

No abstract available.

Subject(s)

Data Mining

Improving spaCy dependency annotation and PoS tagging web service using independent NER services

Nico COLIC; Fabio RINALDI.

Genomics & Informatics ; : e21-2019.

Article in English | WPRIM | ID: wpr-763803

ABSTRACT

Dependency parsing is a common component of text analysis pipelines. However, its performance, especially in specialized domains, suffers from the presence of complex terminology. Our hypothesis is that including named entity annotations can improve the speed and quality of dependency parses. As part of BLAH5, we built a web service that delivers improved dependency parses by taking into account named entity annotations obtained from third-party services. Our evaluation shows improved results and better speed.
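The abstract does not say how the named entity annotations are taken into account. One plausible approach, shown here purely as an assumption, is to collapse each multi-word entity into a single token before parsing, so that the parser sees shorter sentences with less internal terminology. The function name, the span format (token indices, end exclusive), and the example sentence are hypothetical.

```python
def merge_entity_tokens(tokens, entity_spans):
    """Collapse each entity span into one token.

    `entity_spans` is a list of (start, end) token indices, end exclusive.
    Spans are assumed not to overlap; empty or inverted spans are ignored.
    """
    starts = {start: end for start, end in entity_spans if end > start}
    merged = []
    i = 0
    while i < len(tokens):
        if i in starts:
            end = starts[i]
            # Join the entity's tokens so the parser treats it as one unit.
            merged.append(" ".join(tokens[i:end]))
            i = end
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy sentence with one entity covering tokens 1 to 4.
tokens = ["The", "tumor", "necrosis", "factor", "alpha", "gene", "is", "expressed"]
merged = merge_entity_tokens(tokens, [(1, 5)])
```

The merged token list would then be handed to the dependency parser in place of the original tokens.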

Subject(s)

Natural Language Processing
