This paper introduces DiffCold, a novel diffusion-based generative model designed to tackle the cold-start item recommendation problem by effectively bridging the gap between warm and cold item representations. The authors identify the "seesaw dilemma" caused by distributional disparities between warm items, which have rich interaction histories, and cold items, which rely solely on content features. Through innovative techniques such as a Retrieval-enhanced Aggregator and a Simulation-based Representation Alignment module, DiffCold achieves superior performance across multiple benchmarks, resolving the seesaw dilemma and enhancing recommendation accuracy for both cold and warm items.
DiffCold resolves the seesaw dilemma in item recommendation, enabling accurate cold-start predictions without sacrificing warm-item performance.
Cold-start item recommendation remains a persistent challenge in real-world systems due to the absence of interaction histories. While prior models attempt to bridge this gap using item content features, they universally suffer from the \textbf{seesaw dilemma}: enhancing performance for cold items inevitably degrades performance for warm items, and vice versa. We identify that this dilemma stems from a fundamental \textbf{distributional disparity}: warm item embeddings occupy a complex ``behavioral manifold'' shaped by rich interaction signals, whereas cold item embeddings are constrained to a ``semantic manifold'' derived solely from auxiliary content. Existing methods often force a rigid mapping between these inconsistent spaces, causing the model to sacrifice the precision of warm representations to accommodate cold ones. To address this, we propose \textbf{DiffCold}, a diffusion-based generative model that unifies warm and cold representations. Unlike GANs or VAEs, DiffCold leverages conditional diffusion to reconstruct warm item embeddings from content, preserving the underlying manifold structure without degradation. We further tailor this paradigm with two specific designs: a \textbf{Retrieval-enhanced Aggregator} that initializes generation using semantically similar warm items to bypass inefficient noise, and a \textbf{Simulation-based Representation Alignment} module that enforces distribution consistency between generated and real embeddings via contrastive learning. Experiments on three benchmarks confirm that DiffCold resolves the seesaw dilemma, consistently outperforming state-of-the-art methods across all metrics.
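The retrieval-based initialization described in the abstract can be illustrated with a small sketch. The abstract does not specify how the Retrieval-enhanced Aggregator retrieves or combines neighbors, so everything below is an assumption made for illustration: cosine similarity over content features, top-k selection, and softmax-weighted averaging of the retrieved warm embeddings. The names `retrieval_init`, `k`, and `temperature` are hypothetical and do not come from the paper. The result would serve as the starting point for a conditional denoising process in place of pure Gaussian noise; the diffusion model itself is not shown.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def retrieval_init(cold_content, warm_content, warm_emb, k=2, temperature=0.1):
    """Hypothetical initializer: one starting embedding per cold item.

    cold_content: (n_cold, c) content features of cold items
    warm_content: (n_warm, c) content features of warm items
    warm_emb:     (n_warm, d) learned collaborative embeddings of warm items
    Returns an (n_cold, d) array.
    """
    # Cosine similarity between every cold item and every warm item.
    sims = l2_normalize(cold_content) @ l2_normalize(warm_content).T

    # Indices and similarities of the k most similar warm items per cold item.
    top_idx = np.argsort(-sims, axis=1)[:, :k]
    top_sims = np.take_along_axis(sims, top_idx, axis=1)

    # Softmax over the retrieved neighbors (max-shifted for numerical stability).
    logits = top_sims / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted average of the neighbors' warm embeddings.
    return np.einsum("ck,ckd->cd", weights, warm_emb[top_idx])


# Toy data: three warm items with one-hot embeddings, one cold item.
warm_content = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
warm_emb = np.eye(3)
cold_content = np.array([[1.0, 0.0]])

init = retrieval_init(cold_content, warm_content, warm_emb, k=2)
print(init.shape)
```

Because the warm embeddings in the toy data are one-hot, each coordinate of `init` is the weight assigned to the corresponding warm item, which makes it easy to see that the cold item leans most heavily on the warm item with identical content.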