This paper addresses the problem of missing modalities in multimodal recommender systems (RSs) by formalizing it as a graph feature interpolation problem on the item-item co-purchase graph. The authors propose four training-free graph-based imputation methods that propagate available multimodal features to impute missing ones, avoiding the common practice of dropping items with missing modalities. Experiments on multimodal recommendation datasets show that these methods outperform traditional imputation techniques and preserve or widen the performance gap between multimodal and traditional RSs.
Ditch the data drop: training-free graph imputation lets multimodal recommender systems keep items with missing modalities, improving on traditional imputation without training a separate imputation model.
Multimodal recommender systems (RSs) represent items in the catalog through multimodal data (e.g., product images and descriptions) that, in some cases, might be noisy or (even worse) missing. In those scenarios, the common practice is to drop items with missing modalities and train the multimodal RSs on a subsample of the original dataset. To date, the problem of missing modalities in multimodal recommendation has received limited attention in the literature and lacks a precise formalisation of the kind that exists for missing information in traditional machine learning. In this work, we first provide a problem formalisation for missing modalities in multimodal recommendation. Second, by leveraging the user-item graph structure, we re-cast the problem of missing multimodal information as a problem of graph feature interpolation on the item-item co-purchase graph. On this basis, we propose four training-free approaches that propagate the available multimodal features throughout the item-item graph to impute the missing features. Extensive experiments on popular multimodal recommendation datasets demonstrate that our solutions can be seamlessly plugged into any existing multimodal RS and benchmarking framework while preserving (or even widening) the performance gap between multimodal and traditional RSs. Moreover, we show that our graph-based techniques can perform better than traditional imputation techniques from machine learning under different missing-modality settings. Finally, we analyse (for the first time in multimodal RSs) how feature homophily calculated on the item-item graph can influence our graph-based imputations.
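To make the idea of training-free propagation on the item-item graph concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the four approaches, so this example assumes one simple scheme: an item's missing features are repeatedly replaced by the weighted mean of its neighbours' features, while known features are held fixed. The function names (`co_purchase_graph`, `propagate_features`), the `n_iters` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def co_purchase_graph(interactions: np.ndarray) -> np.ndarray:
    """Build item-item co-purchase counts from a binary user-item matrix.

    `interactions` has shape (n_users, n_items). Entry (i, j) of the result
    counts the users who interacted with both item i and item j.
    """
    adj = interactions.T @ interactions
    np.fill_diagonal(adj, 0)  # drop self-loops
    return adj


def propagate_features(adj, features, missing, n_iters=10):
    """Impute missing rows of `features` by neighbour averaging (assumed scheme).

    adj:      (n_items, n_items) non-negative item-item weights
    features: (n_items, dim) feature matrix; rows flagged in `missing` are ignored
    missing:  (n_items,) boolean mask, True where the modality is missing
    """
    adj = adj.astype(float)
    deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise so each propagation step is a weighted mean over neighbours.
    norm_adj = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)

    x = features.astype(float).copy()
    x[missing] = 0.0  # neutral starting point for unknown rows
    known = ~missing
    for _ in range(n_iters):
        x_new = norm_adj @ x
        x_new[known] = features[known]  # known features are never overwritten
        x = x_new
    return x


# Toy example: 3 users, 3 items; item 1 has no features.
interactions = np.array([[1, 1, 0],
                         [0, 1, 1],
                         [1, 1, 0]])
features = np.array([[1.0, 0.0],
                     [np.nan, np.nan],
                     [0.0, 1.0]])
missing = np.array([False, True, False])

adj = co_purchase_graph(interactions)
imputed = propagate_features(adj, features, missing)
```

In this toy case item 1 is co-purchased twice with item 0 and once with item 2, so its imputed vector is the 2:1 weighted mean of their features. No parameters are learned, which is what makes a scheme of this kind training-free; the imputed matrix can then be passed to a multimodal RS in place of the incomplete one.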