Addressing Issues of Cross-Linguality in Open-Retrieval Question Answering Systems For Emergent Domains
EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of System Demonstrations
Pages 1-10, 2023.
Article
in English
| Scopus | ID: covidwho-20232037
ABSTRACT
Open-retrieval question answering systems are generally trained and tested on large datasets in well-established domains. However, low-resource settings such as new and emerging domains would especially benefit from reliable question answering systems. Furthermore, multilingual and cross-lingual resources in emergent domains are scarce, leading to few or no such systems. In this paper, we demonstrate a cross-lingual open-retrieval question answering system for the emergent domain of COVID-19. Our system adopts a corpus of scientific articles to ensure that retrieved documents are reliable. To address the scarcity of cross-lingual training data in emergent domains, we present a method utilizing automatic translation, alignment, and filtering to produce English-to-all datasets. We show that a deep semantic retriever greatly benefits from training on our English-to-all data and significantly outperforms a BM25 baseline in the cross-lingual setting. We illustrate the capabilities of our system with examples and release all code necessary to train and deploy such a system. © 2023 Association for Computational Linguistics.
Artificial intelligence; Computational linguistics; COVID-19; Information retrieval; Semantics; Automatic filtering; Automatic translation; Cross-lingual; Emerging domains; Large datasets; Low-resource settings; Question answering systems; Retrieved documents; Scientific articles; Training data
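The abstract compares a deep semantic retriever against a BM25 baseline in the cross-lingual setting. As a rough illustration of why a purely lexical baseline can struggle there, the following is a minimal sketch of Okapi BM25 scoring over a toy English corpus. It is not the authors' implementation: the class name `BM25`, the whitespace tokenizer, the parameter defaults (`k1=1.5`, `b=0.75`), and the example documents and queries are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real systems
    # typically use language-aware analyzers).
    return text.lower().split()


class BM25:
    """Toy Okapi BM25 scorer over an in-memory list of documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.docs = [tokenize(d) for d in docs]
        self.doc_len = [len(d) for d in self.docs]
        self.avgdl = sum(self.doc_len) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: number of documents containing each term.
        df = Counter()
        for d in self.docs:
            df.update(set(d))
        n = len(self.docs)
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        self.idf = {
            t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()
        }

    def score(self, query, i):
        # Sum per-term BM25 contributions for document i.
        total = 0.0
        for t in tokenize(query):
            f = self.tf[i].get(t, 0)
            if f == 0:
                continue  # term absent from this document: no contribution
            norm = f + self.k1 * (
                1 - self.b + self.b * self.doc_len[i] / self.avgdl
            )
            total += self.idf[t] * f * (self.k1 + 1) / norm
        return total

    def rank(self, query):
        # Document indices sorted by descending BM25 score.
        scores = [self.score(query, i) for i in range(len(self.docs))]
        return sorted(range(len(self.docs)), key=lambda i: scores[i], reverse=True)


# Hypothetical English-only corpus.
corpus = [
    "covid-19 vaccine efficacy in adults",
    "influenza transmission in schools",
    "covid-19 transmission through aerosols",
]
bm25 = BM25(corpus)

english_ranking = bm25.rank("transmission aerosols")
# A German query shares no surface tokens with the English documents.
german_scores = [
    bm25.score("aerosolübertragung von viren", i) for i in range(len(corpus))
]
```

In this toy setup the English query ranks the aerosol document first, while the German query gets a score of zero for every document because no tokens overlap. That lexical gap is the kind of failure a retriever trained on translated English-to-all data is meant to close.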
Collection:
Databases of international organizations
Database:
Scopus
Type of study:
Randomized controlled trials
Language:
English
Journal:
EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of System Demonstrations
Year:
2023
Document Type:
Article