RoBERTa: language modelling in building Indonesian question-answering systems
TELKOMNIKA; 20(6):1248-1255, 2022.
Article in English | ProQuest Central | ID: covidwho-2080976
ABSTRACT
This research aimed to evaluate the performance of the A Lite BERT (ALBERT), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), and Robustly Optimized BERT Pretraining Approach (RoBERTa) models to support the development of an Indonesian question-answering system. The two problems above, namely sorting candidate documents and validating answers, have been handled by several methods, such as the long short-term memory recurrent neural network (LSTM-RNN) [12], the template convolutional recurrent neural network (T-CRNN) [13], CNN-BiLSTM [14], and dynamic co-attention networks (DCN) [15]. [...] Section 4 presents the conclusion of the paper. Following the proposed method in Figure 1, articles about coronavirus disease 2019 (COVID-19) news, obtained by crawling Indonesian Wikipedia, Jakarta News, Okezone, Antara, Kumparan, Tribune, and the Open Super-large Crawled ALMAnaCH coRpus (OSCAR), serve as the input data for our study; they are preprocessed and converted into a format usable as the knowledge base of the system.
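The abstract describes preprocessing crawled news articles into a knowledge base that the question-answering models can draw contexts from. A minimal sketch of such a step is shown below; the function names, the cleaning rules, and the fixed-size passage splitting are assumptions for illustration, not the authors' actual pipeline.

```python
import re

def clean_article(text: str) -> str:
    """Normalize a crawled article: drop simple HTML entities, collapse whitespace."""
    text = re.sub(r"&[a-z]+;", " ", text)      # e.g. &nbsp; left over from crawling
    text = re.sub(r"\s+", " ", text).strip()   # collapse newlines and repeated spaces
    return text

def build_knowledge_base(articles, max_words=100):
    """Split cleaned articles into fixed-size passages usable as QA contexts.

    Returns a list of dicts, each holding the source document id and one passage.
    (Hypothetical format; the paper does not specify its knowledge-base schema.)
    """
    passages = []
    for doc_id, raw in enumerate(articles):
        words = clean_article(raw).split()
        for start in range(0, len(words), max_words):
            passages.append({
                "doc_id": doc_id,
                "text": " ".join(words[start:start + max_words]),
            })
    return passages

# Toy Indonesian-language input resembling crawled COVID-19 news.
articles = [
    "Kasus COVID-19 di Jakarta&nbsp;meningkat.\n  Pemerintah  menambah kapasitas rumah sakit."
]
kb = build_knowledge_base(articles, max_words=5)
```

Each resulting passage could then be paired with a question and fed to an ALBERT, ELECTRA, or RoBERTa extractive-QA head; the passage size would in practice be tuned to the model's maximum sequence length.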
Full text: Available
Collection: Databases of international organizations
Database: ProQuest Central
Language: English
Journal: TELKOMNIKA
Year: 2022
Document Type: Article