Mar 5, 2026

arXiv:2603.05462

NCTB-QA: A Large-Scale Bangla Educational Question Answering Dataset and Benchmarking Performance

Abrar Eyasir, Tahsinah Ahmed, Tahsin Ahmed, Muhammad Ibrahim

AI Summary

The authors introduce NCTB-QA, a new large-scale Bangla question answering dataset with a balanced distribution of answerable and unanswerable questions extracted from 50 textbooks. The dataset includes adversarially designed instances with plausible distractors to challenge reading comprehension systems. Benchmarking experiments with BERT, RoBERTa, and ELECTRA show that fine-tuning on NCTB-QA leads to substantial improvements in F1 score and BERTScore, highlighting the importance of domain-specific fine-tuning in low-resource settings.

Key Contribution

A new Bangla QA dataset with a high proportion of unanswerable questions exposes the fragility of current models in low-resource settings.

Abstract

Reading comprehension systems for low-resource languages face significant challenges in handling unanswerable questions. These systems tend to produce unreliable responses when correct answers are absent from context. To solve this problem, we introduce NCTB-QA, a large-scale Bangla question answering dataset comprising 87,805 question-answer pairs extracted from 50 textbooks published by Bangladesh's National Curriculum and Textbook Board. Unlike existing Bangla datasets, NCTB-QA maintains a balanced distribution of answerable (57.25%) and unanswerable (42.75%) questions. NCTB-QA also includes adversarially designed instances containing plausible distractors. We benchmark three transformer-based models (BERT, RoBERTa, ELECTRA) and demonstrate substantial improvements through fine-tuning. BERT achieves 313% relative improvement in F1 score (0.150 to 0.620). Semantic answer quality measured by BERTScore also increases significantly across all models. Our results establish NCTB-QA as a challenging benchmark for Bangla educational question answering. This study demonstrates that domain-specific fine-tuning is critical for robust performance in low-resource settings.
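The relative-improvement figure quoted above can be checked with a few lines of arithmetic. The sketch below also shows one common way to score answerable and unanswerable questions together. It is an assumption that F1 is computed SQuAD 2.0-style as token overlap, with an empty string standing for "no answer"; the abstract does not state the exact definition. The function names and the whitespace tokenization are illustrative and not taken from the paper.

```python
from collections import Counter


def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 (assumed SQuAD 2.0-style scoring, not confirmed by the paper).

    An empty string represents "no answer". Whitespace tokenization is a
    simplifying assumption made for this sketch.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # If either side is empty, the score is 1.0 only when both are empty,
    # i.e. the model correctly abstained on an unanswerable question.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# BERT F1 before and after fine-tuning, as reported in the abstract.
print(round(relative_improvement(0.150, 0.620)))
```

Under this scoring convention, a model that always produces some span receives zero credit on every unanswerable question, so a dataset with a large unanswerable share penalizes such a model heavily.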

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 14

Year: 2026

Venue: N/A
