This paper introduces an agentic data augmentation method for Aspect-Based Sentiment Analysis (ABSA) that iteratively generates and verifies synthetic training examples using LLMs. The agentic approach improves label preservation in augmented data compared to a prompting-based baseline, especially for tasks involving aspect term generation. Experiments across three ABSA subtasks, four SemEval datasets, and two encoder-decoder models (T5-Base and Tk-Instruct) demonstrate that agentic augmentation, when combined with real data, consistently outperforms prompting-based generation, particularly for T5-Base.
Forget prompt engineering: agent-based LLM data augmentation preserves labels better and boosts ABSA performance, especially for smaller models.
We propose an agentic data augmentation method for Aspect-Based Sentiment Analysis (ABSA) that uses iterative generation and verification to produce high-quality synthetic training examples. To isolate the effect of the agentic structure, we also develop a closely matched prompting-based baseline that uses the same model and instructions. Both methods are evaluated across three ABSA subtasks (Aspect Term Extraction (ATE), Aspect Term Sentiment Classification (ATSC), and Aspect Sentiment Pair Extraction (ASPE)), four SemEval datasets, and two encoder-decoder models: T5-Base and Tk-Instruct. Our results show that agentic augmentation outperforms raw prompting in label preservation of the augmented data, especially when the task requires aspect term generation. In addition, when combined with real data, agentic augmentation provides higher gains, consistently outperforming prompting-based generation. These benefits are most pronounced for T5-Base, while the more heavily pretrained Tk-Instruct exhibits smaller improvements. As a result, augmented data helps T5-Base achieve performance comparable to that of Tk-Instruct.
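The abstract describes the method only at a high level, so the following is a minimal sketch of what an iterative generate-and-verify loop could look like, not the authors' implementation. The names `augment`, `verify_labels`, and `toy_generate`, the example format, and the `max_rounds` limit are all assumptions made for illustration. In a real setup the generator would call an LLM; here a deterministic stub stands in for it so the loop can be run as written.

```python
from typing import Callable, Optional

# Hypothetical example format: {"text": str, "aspects": [(term, polarity), ...]}
Example = dict


def verify_labels(seed: Example, candidate: Example) -> Optional[str]:
    """Return None if the candidate preserves the seed's labels, else feedback text."""
    if candidate["aspects"] != seed["aspects"]:
        return "aspect labels changed"
    text = candidate["text"].lower()
    missing = [term for term, _ in seed["aspects"] if term.lower() not in text]
    if missing:
        return "missing aspect terms: " + ", ".join(missing)
    return None


def augment(
    seed: Example,
    generate: Callable[[Example, Optional[str]], Example],
    verify: Callable[[Example, Example], Optional[str]] = verify_labels,
    max_rounds: int = 3,
) -> Optional[Example]:
    """Generate a candidate, verify it, and retry with feedback until it passes."""
    feedback = None
    for _ in range(max_rounds):
        candidate = generate(seed, feedback)
        feedback = verify(seed, candidate)
        if feedback is None:
            return candidate  # labels preserved: accept the synthetic example
    return None  # no candidate passed verification: discard


def toy_generate(seed: Example, feedback: Optional[str]) -> Example:
    """Stub standing in for an LLM call: drops the aspect term on the first try."""
    if feedback is None:
        return {"text": "The food was wonderful.", "aspects": seed["aspects"]}
    return {"text": "The pasta was wonderful.", "aspects": seed["aspects"]}


seed = {"text": "The pasta was great.", "aspects": [("pasta", "positive")]}
result = augment(seed, toy_generate)
print(result)
```

In this toy run the first candidate is rejected because it drops the aspect term "pasta", and the second is accepted after the feedback is passed back to the generator. The verifier here is a simple string check; an LLM-based verifier could be swapped in through the `verify` parameter.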