Multi-label topic classification for COVID-19 literature annotation using an ensemble model based on PubMedBERT

Shubo Tian

This article is a preprint

Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior, and they should not be reported in the media as established information.

Preprints posted online let authors receive rapid feedback, and the entire scientific community can independently evaluate the work and respond accordingly. These comments are published alongside the preprints for anyone to read and serve as a post-publication review.

Affiliation

Shubo Tian; Florida State University

Preprint in English | bioRxiv | ID: ppbiorxiv-465946

ABSTRACT

BioCreative VII Track 5 calls for participants to tackle a multi-label classification task: automated topic annotation of COVID-19 literature. For our participation, we evaluated several deep learning models built on PubMedBERT, a pre-trained language model, using different strategies to address the challenges of the task. Specifically, multi-instance learning was used to handle the large variation in article length, and a focal loss function was used to address the imbalanced distribution of topics. The ensemble model performed best among all the models we tested. On the test set, our submission achieved an F1 score of 0.9247, considerably better than both the baseline model (F1 score 0.8678) and the mean of all submissions (F1 score 0.8931).
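The three strategies named in the abstract can be illustrated with a minimal Python sketch. This record does not give the implementation details, so everything here is an assumption for illustration: the function names are hypothetical, the focal loss is the standard binary form applied independently to each topic, chunk-level predictions are combined by max pooling, and the ensemble is a plain average of per-topic probabilities. The preprint's actual choices may differ.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, eps=1e-12):
    """Mean focal loss over the topics of one document.

    probs:   per-topic sigmoid probabilities in [0, 1]
    targets: per-topic gold labels (0 or 1)
    gamma:   focusing parameter; gamma=0 reduces to binary cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # p_t is the probability the model assigns to the correct class.
        p_t = p if y == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0)
        # Well-classified topics (p_t near 1) are down-weighted by (1 - p_t)^gamma.
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def aggregate_chunks(chunk_probs):
    """Multi-instance step (assumed max pooling): a long article is split into
    chunks, each chunk is scored, and the per-topic maximum is kept."""
    return [max(column) for column in zip(*chunk_probs)]


def ensemble_mean(model_probs):
    """Ensemble step (assumed simple averaging) over several models' outputs."""
    return [sum(column) / len(column) for column in zip(*model_probs)]


if __name__ == "__main__":
    # Two chunks of one article, two hypothetical topics.
    doc_probs = aggregate_chunks([[0.1, 0.8], [0.6, 0.3]])
    # Average the pooled scores with a second hypothetical model's scores.
    final_probs = ensemble_mean([doc_probs, [0.4, 0.9]])
    print(final_probs, binary_focal_loss(final_probs, [1, 1]))
```

With `gamma=0` the loss reduces to ordinary binary cross-entropy, which makes the effect of the focusing term easy to check: raising `gamma` shrinks the contribution of topics the model already predicts confidently, so rare, harder topics carry more of the gradient.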

License

CC BY-NC-ND

Full text: Available
Collections: Preprints
Database: bioRxiv
Study type: Experimental studies / Prognostic study
Language: English
Publication year: 2021
Document type: Preprint
