RecGPT-Mobile introduces a framework for on-device next-query prediction in mobile e-commerce by deploying a lightweight LLM directly on user devices. This approach leverages LLMs for real-time user intent understanding to improve recommendation accuracy. Online experiments on Taobao demonstrate significant improvements in recommendation quality compared to cloud-based alternatives.
On-device LLMs can now drive real-time recommendation improvements, unlocking faster adaptation to evolving user intent without cloud reliance.
Predicting a user's next search query from recent interaction behaviors is a critical problem in modern e-commerce systems, particularly in scenarios where user intent evolves rapidly. Large Language Models (LLMs) offer strong semantic reasoning capabilities and have recently been adopted to enhance training data construction for next-query prediction. However, due to resource constraints on mobile devices, existing LLM-based applications are deployed on cloud servers, resulting in high inference costs. In this paper, we propose RecGPT-Mobile, a framework built around a lightweight LLM-based intent understanding agent that improves recommendation quality in mobile e-commerce scenarios. By deploying LLMs directly on mobile devices, our approach captures users' evolving interests more quickly and adjusts recommendation results in real time. Extensive offline analyses and online experiments demonstrate that our method significantly improves the accuracy of recommendation results, offering a practical path for deploying LLMs in production-scale recommendation systems on mobile devices and a scalable way to integrate LLMs into real-world next-query prediction systems.