Chongqing · University of Leicester · UQ · Apr 20, 2026 · arXiv:2604.18200

Multi-LLM Token Filtering and Routing for Sequential Recommendation

Wuhan Chen, Xin Xia, Zongwei Wang, Wentao Li, Shane Culpepper

AI Summary

This paper investigates whether LLM token embeddings can be used directly within sequential recommendation systems, without relying on external textual corpora. The authors find that directly injecting token embeddings from a single LLM leads to unstable or limited gains because of semantic misalignment, insufficient task adaptation, and the restricted coverage of any individual LLM. To address this, they propose MLTFR, a Multi-LLM Token Filtering and Routing framework that filters task-relevant tokens and integrates multiple LLM token spaces through a Mixture-of-Experts architecture. The reported experiments show it outperforming state-of-the-art sequential recommendation baselines and existing alignment methods.

Key Contribution

MLTFR performs sequential recommendation without external text corpora by filtering task-relevant token embeddings and routing them across multiple LLMs.

Abstract

Large language models (LLMs) have recently shown promise in recommendation by providing rich semantic knowledge. While most existing approaches rely on external textual corpora to align LLMs with recommender systems, we revisit a more fundamental yet underexplored question: Can recommendation benefit from LLM token embeddings alone without textual input? Through a systematic empirical study, we show that directly injecting token embeddings from a single LLM into sequential recommenders leads to unstable or limited gains, due to semantic misalignment, insufficient task adaptation, and the restricted coverage of individual LLMs. To address these challenges, we propose MLTFR, a Multi-LLM Token Filtering and Routing framework for corpus-free sequential recommendation. MLTFR follows an interaction-guided LLM knowledge integration paradigm, where task-relevant token embeddings are selected via user-guided token filtering to suppress noisy and irrelevant vocabulary signals. To overcome the limitations of single-LLM representations, MLTFR integrates multiple LLM token spaces through a Mixture-of-Experts architecture, with a Fisher-weighted semantic consensus expert to balance heterogeneous experts and prevent domination during training. By jointly filtering informative tokens and aggregating complementary semantic knowledge across multiple LLMs, MLTFR enables stable and effective utilization of LLM token embeddings without textual inputs or backbone modification. Extensive experiments demonstrate that MLTFR consistently outperforms state-of-the-art sequential recommendation baselines and existing alignment methods. Our code is available at: https://github.com/ccwwhhh/MLTFR.
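The two ideas named in the abstract, user-guided token filtering and Mixture-of-Experts routing over several LLM token spaces, can be illustrated with a toy sketch. This is not the authors' implementation (their code is at the repository linked above). Every name here (filter_tokens, route, the toy vectors) is hypothetical, the token tables are assumed to be already projected into the same dimension as the user vector, dot-product top-k selection and a plain softmax gate are stand-ins chosen for illustration, and the Fisher-weighted semantic consensus expert is omitted entirely.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def filter_tokens(user_vec, token_table, k):
    """Keep the k token embeddings most similar to the user vector
    (assumed scoring rule: dot product) and pool them with softmax weights."""
    scores = [dot(user_vec, t) for t in token_table]
    top = sorted(range(len(token_table)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    dim = len(user_vec)
    pooled = [sum(w * token_table[i][d] for w, i in zip(weights, top))
              for d in range(dim)]
    return top, pooled

def route(user_vec, expert_outputs, gate_weights):
    """Mix one pooled vector per LLM with a softmax gate
    (assumed gate: one hypothetical learned vector per expert)."""
    gate = softmax([dot(user_vec, g) for g in gate_weights])
    dim = len(expert_outputs[0])
    mixed = [sum(p * out[d] for p, out in zip(gate, expert_outputs))
             for d in range(dim)]
    return gate, mixed

# Toy data: a 2-d user representation and two tiny "LLM" token tables.
user = [1.0, 0.0]
llm_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
llm_b = [[-1.0, 0.0], [2.0, 0.0], [0.0, -1.0]]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical gate parameters

top_a, pooled_a = filter_tokens(user, llm_a, k=2)
top_b, pooled_b = filter_tokens(user, llm_b, k=2)
gate, mixed = route(user, [pooled_a, pooled_b], gate_weights)
```

In this sketch, filtering discards the tokens least aligned with the user representation before any mixing happens, so the gate only has to weigh one compact vector per LLM. The mixed vector would then be combined with the recommender's own item or sequence representation; how MLTFR does that is not specified in the abstract.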

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing · Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 44

Year: 2026

Venue: N/A
