Phenonizer: A fine-grained phenotypic named entity recognizer for Chinese clinical texts
2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 ; : 3963-3970, 2021.
Article in English | Scopus | ID: covidwho-1722891
ABSTRACT
Biomedical named entity recognition from clinical texts is a fundamental task for clinical data analysis, because real-world clinical settings produce large volumes of electronic medical record data, most of it in free-text format. Clinical text contains significant phenotypic medical entities, which can be used to profile the clinical characteristics of patients with specific disease conditions. However, general approaches mostly rely on coarse-grained annotations (e.g. mentions of symptom terms) of phenotypic entities in benchmark text datasets. Because clinical texts contain numerous negated expressions of phenotypic entities (e.g. 'no fever', 'no cough' and 'no hypertension'), coarse-grained annotation cannot supply the subsequent data analysis process with well-prepared structured clinical data. We therefore constructed a fine-grained Chinese clinical corpus and proposed a phenotypic named entity recognizer (Phenonizer). Results on the test set show that Phenonizer outperforms methods based on Word2Vec, with an F1-score of 0.896. A comparison of character embeddings trained on different data shows that character embeddings trained on clinical corpora improve the F-score by 0.0103. Furthermore, the fine-grained dataset enables methods to distinguish negated symptoms from present symptoms and so avoids interference from negated symptoms. Finally, we tested the generalization performance of Phenonizer, achieving a superior F1-score of 0.8389. In summary, together with the fine-grained annotated benchmark dataset, Phenonizer offers a feasible approach for effectively extracting symptom information from Chinese clinical texts with acceptable performance. © 2021 IEEE.
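To illustrate why fine-grained labels matter for the reported F1-scores, the following is a minimal sketch of entity-level F1 over character-level BIO tags. The label names `SYM_POS` and `SYM_NEG`, the example sentence, and the helper functions are assumptions made for illustration; the abstract does not specify the corpus tag set or the exact scoring procedure used for Phenonizer.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect (start, end, label) spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
            # Only a "B-" tag opens a new span; stray "I-" tags are ignored.
            start, label = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1: a span counts only if boundaries and label agree."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)


# Hypothetical six-character sentence meaning "no fever, cough":
# one negation character, two characters for "fever", a comma, two for "cough".
gold = ["O", "B-SYM_NEG", "I-SYM_NEG", "O", "B-SYM_POS", "I-SYM_POS"]
# A prediction that finds both symptoms but misses the negation of "fever".
pred = ["O", "B-SYM_POS", "I-SYM_POS", "O", "B-SYM_POS", "I-SYM_POS"]

score = entity_f1(gold, pred)
```

Under a coarse-grained scheme with a single symptom label, this prediction would score as fully correct. With the negated and present labels separated, the missed negation counts as an error, which is the distinction the fine-grained corpus is meant to expose.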
Full text: Available Collection: Databases of international organizations Database: Scopus Type of study: Prognostic study Language: English Journal: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 Year: 2021 Document Type: Article
