Apr 27, 2026 · arXiv:2604.24720

Sentiment and Emotion Classification of Indonesian E-Commerce Reviews via Multi-Task BiLSTM and AutoML Benchmarking

Hermawan Manurung, Ibrahim Al-Kahfi, Ahmad Rizqi, Martin Clinton Tosima Manullang

AI Summary

This paper addresses sentiment and emotion classification in Indonesian e-commerce reviews, which are rife with slang and non-standard language. The authors benchmark a multi-task BiLSTM architecture against AutoML techniques on the PRDECT-ID dataset, using a custom preprocessing pipeline with slang normalization. The BiLSTM model, particularly the "Improved" configuration, is reported to achieve performance competitive with the AutoML track, suggesting that deep learning is effective for nuanced Indonesian text analysis.

Key Contribution

A BiLSTM with a custom slang dictionary rivals AutoML in classifying the sentiment and emotion of messy, real-world Indonesian e-commerce reviews.
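
To make the slang-dictionary idea concrete, here is a minimal sketch of dictionary-based normalization. It is not the authors' pipeline: the six-entry `SLANG` table, the `normalize` function, and the choice to drop punctuation and emoji are all assumptions made for illustration, and the paper's 140-entry dictionary is not reproduced here.

```python
import re

# Hypothetical mini-dictionary for illustration only; the paper's
# 140-entry dictionary is not reproduced here.
SLANG = {
    "gk": "tidak",
    "ga": "tidak",
    "bgt": "banget",
    "brg": "barang",
    "yg": "yang",
    "tp": "tapi",
}


def normalize(text, slang=SLANG):
    """Lowercase, collapse elongated letters, and expand slang tokens."""
    text = text.lower()
    # Collapse runs of 3+ identical letters ("bagusss" -> "bagus").
    # Digits are left alone so that prices such as "1000" survive.
    text = re.sub(r"([a-z])\1{2,}", r"\1", text)
    # Keep alphanumeric tokens only; punctuation and emoji are dropped
    # (an assumption, the source does not say how emoji are handled).
    tokens = re.findall(r"[a-z0-9]+", text)
    return " ".join(slang.get(tok, tok) for tok in tokens)


print(normalize("Brg bagusss bgt, tp pengiriman gk cepat!"))
# prints: barang bagus banget tapi pengiriman tidak cepat
```

A lookup table like this is cheap to extend, which is presumably why a hand-assembled dictionary is practical for marketplace text; a real pipeline would also need a policy for emoji and numeric shorthands.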

Abstract

Indonesian marketplace reviews mix standard vocabulary with slang, regional loanwords, numeric shorthands, and emoji, making lexicon-based sentiment tools unreliable in practice. This paper describes a two-track classification pipeline applied to the PRDECT-ID dataset, which contains 5,400 product reviews from 29 Indonesian e-commerce categories, each labeled for binary sentiment (Positive/Negative) and five-class emotion (Happy, Sad, Fear, Love, Anger). The first track applies TF-IDF vectorization with a PyCaret AutoML sweep across standard classifiers. The second track is a PyTorch Bidirectional Long Short-Term Memory (BiLSTM) network with a shared encoder and two task-specific output heads. A preprocessing module applies 14 sequential cleaning steps, including a 140-entry slang dictionary assembled from marketplace corpora. Four configurations are benchmarked: BiLSTM Baseline, BiLSTM Improved, BiLSTM Large, and TextCNN. Training uses class-weighted cross-entropy loss, ReduceLROnPlateau scheduling, and early stopping. Both tracks are deployed as Gradio applications on Hugging Face Spaces. Source code is publicly available at https://github.com/ikii-sd/pba2026-crazyrichteam.
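
The class-weighted cross-entropy loss mentioned in the abstract can be illustrated with a small pure-Python sketch. Several details here are assumptions, since the abstract does not state them: the inverse-frequency weighting rule in `balanced_class_weights`, the equal-weight sum of the two task losses in `joint_loss`, and all function names. The normalization by the sum of weights mirrors how PyTorch's `CrossEntropyLoss` behaves with a `weight` argument and mean reduction.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed heuristic; the source does not say how weights are derived.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(batch_logits, batch_targets, weights):
    """Weighted mean of -log p(target), normalized by the sum of weights."""
    num = den = 0.0
    for logits, target in zip(batch_logits, batch_targets):
        w = weights[target]
        num += -w * math.log(softmax(logits)[target])
        den += w
    return num / den


def joint_loss(sent_logits, sent_y, sent_w, emo_logits, emo_y, emo_w):
    """Sum of the two head losses (equal task weighting is an assumption)."""
    return (weighted_cross_entropy(sent_logits, sent_y, sent_w)
            + weighted_cross_entropy(emo_logits, emo_y, emo_w))


# Toy labels: class 1 is rare, so it receives the larger weight.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights[1])  # prints: 2.0
```

With a shared encoder, the gradient of a summed loss like this flows into the encoder from both heads, which is the usual motivation for multi-task training; how the authors actually balance the two tasks is not stated in the abstract.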

Data Curation & Synthetic Data · Eval Frameworks & Benchmarks · Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 10

Year: 2026

Venue: N/A
