This paper explores customer satisfaction prediction in Indonesian e-commerce using YouTube comments, addressing the challenge of analyzing large volumes of unstructured text. The study employs XGBoost with TF-IDF vectorization on a dataset of YouTube comments from e-commerce review videos, preprocessed to generate numerical features. Results show that a PyCaret-optimized XGBoost model achieves strong classification performance, while feature importance analysis reveals the influence of socio-political terminology on sentiment polarity in e-commerce discourse.
Socio-political terminology appears prominently in Indonesian e-commerce YouTube comments and influences the sentiment polarity that customer satisfaction models predict.
The rapid growth of digital commerce in Indonesia has shifted consumer interactions toward video-centric social networks, particularly YouTube. The resulting volume of unstructured, multi-contextual comments makes manual sentiment tracking impractical. This study builds a predictive model for customer satisfaction using Extreme Gradient Boosting (XGBoost) combined with Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. The study uses a secondary dataset of YouTube comments retrieved from e-commerce review videos, and the raw text was preprocessed to produce normalized numerical features. The experimental results show that the PyCaret-optimized XGBoost model achieves strong classification performance. Beyond standard performance metrics, lexical evaluation and feature-importance analysis reveal a notable pattern: e-commerce discourse contains a substantial amount of socio-political terminology, which in turn influences the polarity of audience satisfaction.
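To illustrate how TF-IDF turns comments into numerical features, here is a minimal sketch in plain Python. It is not the study's pipeline: the whitespace tokenizer, the textbook weighting `tf * log(N / df)`, the L2 normalization, and the three sample comments are all assumptions made for illustration, and the paper does not state which TF-IDF variant or preprocessing steps were used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency times inverse document frequency.
        weights = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Scale each document vector to unit length when it is non-zero.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {term: w / norm for term, w in weights.items()}
        vectors.append(weights)
    return vectors


# Hypothetical comments, not taken from the study's dataset.
comments = [
    "pengiriman cepat barang bagus",
    "pengiriman lambat barang rusak",
    "harga murah pengiriman cepat",
]
vectors = tfidf(comments)
```

A term that occurs in every comment ("pengiriman" here) receives zero weight, while a term unique to one comment ("bagus") receives the largest weight in that comment. Vectors of this kind are what a gradient-boosted classifier such as XGBoost would consume as input features.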