This paper introduces HRL4PFG, a hierarchical reinforcement learning framework designed to proactively guide user preferences towards long-tail items in interactive recommendation systems. The framework uses a macro-level process to generate fairness-guided targets based on multi-step feedback and a micro-level process for real-time recommendation fine-tuning. Experiments demonstrate that HRL4PFG significantly improves cumulative interaction rewards and user interaction length compared to existing methods.
Solve the cold-start problem for long-tail items in recommender systems by proactively shaping user preferences, rather than just boosting item exposure.
Item-side fairness is crucial for ensuring the fair exposure of long-tail items in interactive recommender systems. Existing approaches promote the exposure of long-tail items by inserting them directly into the recommended results. This causes misalignment between user preferences and the recommended long-tail items, which hinders long-term user engagement and reduces the effectiveness of recommendations. We instead aim for a proactive fairness-guiding strategy, which actively guides user preferences toward long-tail items while preserving user satisfaction during the interactive recommendation process. To this end, we propose HRL4PFG, an interactive recommendation framework that leverages hierarchical reinforcement learning to guide user preferences toward long-tail items progressively. HRL4PFG operates through a macro-level process that generates fairness-guided targets based on multi-step feedback, and a micro-level process that fine-tunes recommendations in real time according to both these targets and evolving user preferences. Extensive experiments in interactive recommendation environments show that HRL4PFG achieves larger improvements in cumulative interaction rewards and maximum user interaction length than state-of-the-art methods.
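To make the two-level structure concrete, the following is a minimal toy sketch of a macro/micro loop, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`macro_target`, `micro_recommend`, `simulate`), the embedding-vector representation of users and items, the blending weights, and the preference-drift rule. Hand-written heuristics stand in for the learned high-level and low-level policies that a hierarchical reinforcement learning method would train.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def macro_target(user_pref, tail_centroid, avg_reward, step_size=0.3):
    """Hypothetical macro-level step (stand-in for a learned high-level policy).

    Moves the guidance target from the user's current preference toward the
    long-tail centroid, further when recent multi-step feedback was good.
    """
    weight = min(1.0, step_size * max(avg_reward, 0.0))
    return [(1 - weight) * u + weight * t for u, t in zip(user_pref, tail_centroid)]


def micro_recommend(user_pref, target, items, alpha=0.5):
    """Hypothetical micro-level step (stand-in for a learned low-level policy).

    Returns the index of the item with the best blend of fit to the current
    preference and fit to the macro-level target.
    """
    scores = [(1 - alpha) * dot(user_pref, v) + alpha * dot(target, v) for v in items]
    return max(range(len(items)), key=scores.__getitem__)


def simulate(num_steps=12, macro_every=4, seed=0):
    """Run a toy interaction loop; returns a list of (item_index, reward)."""
    rng = random.Random(seed)
    items = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
    tail_ids = range(15, 20)  # assumption: the last five items are long-tail
    tail_centroid = [sum(items[i][d] for i in tail_ids) / len(tail_ids) for d in range(3)]
    user_pref = [rng.uniform(-1, 1) for _ in range(3)]
    target = list(user_pref)  # start with no guidance
    recent, history = [], []
    for step in range(num_steps):
        # Macro level: refresh the target every `macro_every` steps
        # using the average reward collected since the last refresh.
        if step % macro_every == 0 and recent:
            target = macro_target(user_pref, tail_centroid, sum(recent) / len(recent))
            recent = []
        # Micro level: choose an item at every step.
        choice = micro_recommend(user_pref, target, items)
        reward = dot(user_pref, items[choice])
        recent.append(reward)
        # Toy assumption: preference drifts slightly toward the consumed item.
        user_pref = [0.9 * u + 0.1 * v for u, v in zip(user_pref, items[choice])]
        history.append((choice, reward))
    return history
```

Calling `simulate()` returns the sequence of recommended item indices and their toy rewards. The design point the sketch tries to capture is the difference in time scale: the target changes only every few steps based on accumulated feedback, while the item choice reacts to the current preference at every step.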