USTC | Feb 26, 2026 | arXiv:2602.22632

Fine-grained Semantics Integration for Large Language Model-based Recommendation

Jiawen Feng, Xiaoyu Kong, Leheng Sheng, Bingzhe Wu, Chao Yi, Feifan Yang, Xiangrong Sheng, Han Zhu, Xiang Wang, Jiancan Wu, Xiangnan He

AI Summary

The paper introduces TS-Rec, a novel approach to enhance LLM-based recommendation systems by addressing the challenges of semantically meaningless SID initialization and coarse-grained alignment. TS-Rec incorporates Semantic-Aware embedding Initialization (SA-Init) using mean pooling of pretrained keyword embeddings and Token-level Semantic Alignment (TS-Align) to align SID tokens with item cluster semantics. Experiments on real-world datasets demonstrate that TS-Rec outperforms existing methods, highlighting the benefits of fine-grained semantic integration for LLM-based recommendation.

Key Contribution

LLM-based recommendation gets a boost: initializing item embeddings with semantic keywords and aligning tokens to item clusters significantly improves performance.

Abstract

Recent advances in Large Language Models (LLMs) have shifted recommendation systems from the discriminative paradigm to the LLM-based generative paradigm, where the recommender autoregressively generates sequences of semantic identifiers (SIDs) for target items conditioned on historical interactions. While prevalent LLM-based recommenders have demonstrated performance gains by aligning pretrained LLMs between the language space and the SID space, modeling the SID space still faces two fundamental challenges: (1) Semantically Meaningless Initialization: SID tokens are randomly initialized, severing the semantic linkage between the SID space and the pretrained language space from the outset; and (2) Coarse-grained Alignment: existing SFT-based alignment tasks primarily focus on item-level optimization while overlooking the semantics of individual tokens within SID sequences. To address these challenges, we propose TS-Rec, which integrates Token-level Semantics into LLM-based Recommenders. Specifically, TS-Rec comprises two key components: (1) Semantic-Aware embedding Initialization (SA-Init), which initializes SID token embeddings by applying mean pooling to the pretrained embeddings of keywords extracted by a teacher model; and (2) Token-level Semantic Alignment (TS-Align), which aligns individual tokens within the SID sequence with the shared semantics of the corresponding item clusters. Extensive experiments on two real-world benchmarks demonstrate that TS-Rec consistently outperforms traditional and generative baselines across all standard metrics. The results demonstrate that integrating fine-grained semantic information significantly enhances the performance of LLM-based generative recommenders.
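The mean-pooling step behind SA-Init can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`mean_pool`, `sa_init`), the SID token strings, the toy embedding table, and the hard-coded keywords are all hypothetical, and the sketch assumes pooling runs over every subword token of every keyword, a detail the abstract does not specify. In practice the keywords would come from a teacher model and the vectors from the pretrained LLM's input embedding matrix.

```python
from statistics import fmean


def mean_pool(vectors):
    """Return the element-wise mean of a list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector to pool")
    return [fmean(dim) for dim in zip(*vectors)]


def sa_init(sid_keywords, tokenize, embedding_table):
    """Build an initial embedding for each SID token (illustrative only).

    sid_keywords:    dict mapping a SID token to its list of keywords
                     (assumed to be produced by a teacher model).
    tokenize:        callable splitting a keyword into vocabulary tokens
                     (stand-in for the LLM tokenizer).
    embedding_table: dict mapping a vocabulary token to its pretrained vector.
    """
    init = {}
    for sid_token, keywords in sid_keywords.items():
        # Collect the pretrained vectors of every token of every keyword.
        vectors = [embedding_table[tok] for kw in keywords for tok in tokenize(kw)]
        init[sid_token] = mean_pool(vectors)
    return init


# Toy data, made up for illustration.
toy_embeddings = {
    "wireless": [1.0, 0.0],
    "headphones": [0.0, 1.0],
    "running": [2.0, 2.0],
    "shoes": [4.0, 0.0],
}
toy_keywords = {
    "<a_12>": ["wireless headphones"],
    "<a_47>": ["running", "shoes"],
}

sid_init = sa_init(toy_keywords, str.split, toy_embeddings)
print(sid_init)
```

Each SID token here starts from the centroid of its keyword vectors rather than from random noise, which is the intuition the abstract gives for keeping the SID space linked to the pretrained language space.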

Natural Language Processing, Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 43

Year: 2026

Venue: N/A
