1.
Data Brief ; 54: 110407, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38708312

ABSTRACT

Mathematical entity recognition is essential for machines to define and represent mathematical content accurately and to support mathematical operations and reasoning. Because mathematical entity recognition in the Bangla language is novel, to the best of our knowledge no such dataset is available in any repository. In this paper, we present a state-of-the-art Bangla mathematical entity dataset containing 13,717 observations. Each record has a mathematical statement, a mathematical type, and a mathematical entity. This dataset can be used for research on recognizing mathematical operators, well-known mathematical terms (such as complex numbers, real numbers, and prime numbers), and operands as numbers. The recognition tasks mentioned above, and combinations of them, are also feasible with a modest tweak to the dataset. Furthermore, we have structured this dataset in raw format as a CSV file with three columns: text, math entity, and label. As a result, researchers can handle the data easily, which facilitates a variety of deep learning and machine learning explorations.
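
The three-column layout described in this abstract could be read along the following lines. This is only a minimal sketch: the column names are taken from the abstract, but the exact header spelling in the published file, the label values, and the sample rows below are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real file's header spelling and label values may differ.
SAMPLE_CSV = """text,math entity,label
"5 + 3 = 8",+,operator
"7 is a prime number",prime number,term
"12 - 4 = 8",12,operand
"""


def load_records(handle):
    """Read rows into dicts keyed by the three columns named in the abstract."""
    reader = csv.DictReader(handle)
    return [
        {"text": row["text"], "entity": row["math entity"], "label": row["label"]}
        for row in reader
    ]


def label_counts(records):
    """Count how many records carry each label."""
    return Counter(record["label"] for record in records)


# An in-memory handle stands in for the real file here.
records = load_records(io.StringIO(SAMPLE_CSV))
counts = label_counts(records)
```

To read the actual dataset, an opened file handle (for example `open(path, newline="", encoding="utf-8")`) would replace the in-memory sample, after checking the real header names.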

2.
Heliyon ; 10(3): e25467, 2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38356580

ABSTRACT

Mathematical entity recognition is indispensable for machines to accurately explain and depict mathematical content and to enable adequate mathematical operations and reasoning. It expedites automated theorem proving, speeds up the analysis and retrieval of mathematical knowledge from documents, and improves e-learning and educational platforms. It also simplifies translation, scientific research, data analysis, interpretation, and the practical application of mathematical information. Mathematical entity recognition in the Bangla language is novel; to the best of our knowledge, no similar work has been done. Here, we identify mathematical operators, operands as numbers, and popular mathematical terms (complex numbers, real numbers, prime numbers, etc.). In this work, we perform Bangla Mathematical Entity Recognition (MER) using an ensemble architecture of deep neural networks based on Bidirectional Encoder Representations from Transformers (BERT). We prepare a novel dataset comprising 13,717 observations, each containing a mathematical statement, mathematical entity, and mathematical type. We evaluate our proposed architectures using accuracy, precision, recall, and F1-score as the performance metrics. The results show a satisfactory accuracy of 97.98% with BERT and 99.76% with ensemble BERT.
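
The four metrics named in this abstract can be computed from true and predicted labels as sketched below. The abstract does not say whether the authors averaged per-class scores, so macro averaging is an assumption here, and the label names and example predictions are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # A class with no predictions (or no true instances) scores 0.0 here.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Hypothetical labels and predictions, for illustration only.
y_true = ["operator", "operand", "term", "operand"]
y_pred = ["operator", "operand", "operand", "operand"]
metrics = classification_metrics(y_true, y_pred)
```

Macro averaging weights every class equally, which makes a missed rare class visible even when overall accuracy is high.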

3.
Data Brief ; 47: 108933, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36819905

ABSTRACT

The popularity of reading comprehension (RC) is increasing day by day in the Bangla Natural Language Processing (NLP) research area, in both machine learning and deep learning techniques. However, there is no original Bangla dataset drawn from varied sources; the only ones available are translated from foreign RC datasets and contain abnormalities and mismatched translations. In this paper, we present UDDIPOK, a novel, wide-ranging, open-domain Bangla reading comprehension dataset. This dataset contains 270 reading passages, 3636 questions, and answers from diverse origins, for instance, textbooks, exam questions from middle and high schools, newspapers, etc. Furthermore, this dataset is formatted as CSV with three columns: passages, questions, and answers. As a result, the data can be handled quickly and easily for any machine learning research.

4.
Heliyon ; 8(10): e11052, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36254291

ABSTRACT

A question answering (QA) system in any language is a collection of mechanisms for obtaining answers to user questions from various data compositions. Reading comprehension (RC) is one type of composition, and its popularity is increasing day by day in the Natural Language Processing (NLP) research area. Some work has been done in several languages, mainly in English. In the Bangla language, no dataset is available for RC, nor has any work been done in the past. In this research work, we develop a question-answering system from RC. To do this, we construct a dataset containing 3636 reading comprehensions along with questions and answers. We apply a transformer-based deep neural network model to obtain appropriate answers to questions based on reading comprehensions precisely and swiftly. We train several deep neural network architectures on our dataset: LSTM (Long Short-Term Memory), Bi-LSTM (Bidirectional LSTM) with attention, RNN (Recurrent Neural Network), ELECTRA, and BERT (Bidirectional Encoder Representations from Transformers). The transformer-based pre-trained language architectures BERT and ELECTRA outperform the other architectures. Finally, the trained BERT model achieves a satisfactory outcome with 87.78% testing accuracy and 99% training accuracy, and ELECTRA provides training and testing accuracy of 82.5% and 93%, respectively.

5.
PLoS One ; 16(8): e0253300, 2021.
Article in English | MEDLINE | ID: mdl-34370730

ABSTRACT

COVID-19 caused a significant public health crisis worldwide and triggered other issues such as economic crisis, job cuts, and mental anxiety. The pandemic spread across the world and affected many people not only through infection but also through agitation, stress, worry, fear, repugnance, and poignancy. During this time, social media involvement and interaction increased dynamically, with people sharing their viewpoints on the health crisis. From user-generated content on social media, we can analyze the public's thoughts and sentiments on health status, concerns, panic, and awareness related to COVID-19, which can ultimately assist in developing health intervention strategies and designing effective campaigns based on public perceptions. In this work, we scrutinize users' sentiment over different time intervals to help identify trending topics on Twitter, using a COVID-19 tweets dataset. We also identify sentiment clusters from the sentiment categories. With the help of comprehensive sentiment dynamics, we investigate experimental results that show the variety of social media engagement and communication during the pandemic period.


Subject(s)
COVID-19, Public Health, Social Media, COVID-19/epidemiology, Cluster Analysis, Humans, Pandemics
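
The time-interval analysis mentioned in the abstract of record 5 might be approached roughly as follows. This is a sketch under stated assumptions, not the authors' method: the daily interval, the sentiment label names, and the sample tweets are all hypothetical, and the abstract does not specify how sentiment labels were obtained.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical (ISO timestamp, sentiment label) pairs, for illustration only.
TWEETS = [
    ("2020-03-01T09:15:00", "fear"),
    ("2020-03-01T18:40:00", "fear"),
    ("2020-03-01T21:05:00", "neutral"),
    ("2020-03-02T08:30:00", "positive"),
    ("2020-03-02T12:00:00", "positive"),
    ("2020-03-02T20:10:00", "fear"),
]


def sentiment_by_day(tweets):
    """Group sentiment labels into one Counter per calendar day."""
    buckets = defaultdict(Counter)
    for timestamp, label in tweets:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        buckets[day][label] += 1
    return dict(buckets)


def dominant_sentiment(buckets):
    """Pick the most frequent label in each day's bucket."""
    return {day: counts.most_common(1)[0][0] for day, counts in buckets.items()}


daily = sentiment_by_day(TWEETS)
dominant = dominant_sentiment(daily)
```

Other interval widths (hourly, weekly) would only change how the bucket key is derived from the timestamp.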