Bangla news article dataset.
Saad, Asif Mohammed; Mahi, Umme Niraj; Salim, Md Shahidul; Hossain, Sk Imran.
Affiliation
  • Saad AM; Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
  • Mahi UN; Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
  • Salim MS; Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
  • Hossain SI; Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
Data Brief; 57: 110874, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39290422
ABSTRACT
We present an updated standard Bangla dataset built from Bangla news articles. In total, more than 1.9 million articles were gathered from nine Bangla news websites; selection was guided by a range of categories, including sports, economy, politics, local news, tech, tourism, entertainment, education, health, the arts, and many more. The attributes recorded vary by newspaper and include title, content, time, tags, meta, and category, among others. The dataset will enable data scientists to investigate and test hypotheses related to Bangla natural language processing. It is also likely to be used for domain-specific large language models in the context of Bangladesh, and it may be used to develop deep learning and machine learning models that categorize articles by subject.
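Because the available attributes differ from one newspaper to another, a consumer of the dataset would probably need to map records onto a common set of fields before training a classifier. The following is a minimal sketch of that step, not the authors' tooling: the field names are taken from the abstract, while the dictionary-per-article layout, the helper names `normalize` and `category_counts`, and the sample records are assumptions made purely for illustration.

```python
from collections import Counter

# Attribute names mentioned in the abstract. The actual file layout and
# field spellings in the published dataset are assumptions here.
COMMON_FIELDS = ("title", "content", "time", "tags", "meta", "category")


def normalize(record):
    """Return a dict with every common field, using None where a source lacks it."""
    return {field: record.get(field) for field in COMMON_FIELDS}


def category_counts(records):
    """Count articles per category, grouping records without one under 'unknown'."""
    return Counter((normalize(r)["category"] or "unknown") for r in records)


# Placeholder records, invented for illustration and not taken from the dataset.
sample = [
    {"title": "t1", "content": "c1", "category": "sports"},
    {"title": "t2", "content": "c2", "time": "2024-01-01", "tags": ["a"], "category": "sports"},
    {"title": "t3", "content": "c3", "meta": {"source": "x"}},
]

counts = category_counts(sample)
```

Filling missing attributes with `None` keeps records from different newspapers in a single uniform shape, which makes it easy to check the category balance before any subject-classification experiment.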
Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Data Brief / Data in brief Publication year: 2024 Document type: Article Affiliation country: Bangladesh Publication country: Netherlands
