UniMixer is introduced as a unified architecture for recommendation systems that bridges attention-based, TokenMixer-based, and factorization-machine-based methods to improve scaling efficiency. The core innovation involves transforming rule-based TokenMixers into parameterized structures, enabling optimization of token mixing patterns during training and removing constraints on the number of heads. Experiments demonstrate UniMixer's superior scaling abilities, with a lightweight variant, UniMixing-Lite, further enhancing performance while reducing parameters and computational cost.
UniMixer achieves state-of-the-art scaling in recommendation systems by unifying disparate architectures into a single framework that learns optimal token mixing patterns.
In recent years, the scaling laws of recommendation models, which govern the relationship between a recommender's performance and its parameters/FLOPs, have attracted increasing attention. Currently, there are three mainstream architectures for achieving scaling in recommendation models, namely attention-based, TokenMixer-based, and factorization-machine-based methods, which differ fundamentally in both design philosophy and architectural structure. In this paper, we propose a unified scaling architecture for recommendation systems, namely UniMixer, to improve scaling efficiency and establish a theoretical framework that unifies the mainstream scaling blocks. By transforming the rule-based TokenMixer into an equivalent parameterized structure, we construct a generalized parameterized feature mixing module that allows the token mixing patterns to be optimized and learned during model training. Meanwhile, the generalized parameterized token mixing removes the constraint in TokenMixer that requires the number of heads to equal the number of tokens. Furthermore, we establish a unified scaling module design framework for recommender systems, which bridges the connections among attention-based, TokenMixer-based, and factorization-machine-based methods. To further boost scaling ROI, we design a lightweight UniMixing module, UniMixing-Lite, which further compresses the model parameters and computational cost while significantly improving model performance. The scaling curves are shown in the following figure. Extensive offline and online experiments are conducted to verify the superior scaling abilities of UniMixer.
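The abstract does not spell out how a rule-based mixer becomes a parameterized one, so the following is only an illustrative sketch, not the paper's construction. It assumes the rule-based mixer splits each token into `num_heads` chunks and builds output token `h` by concatenating chunk `h` of every input token. Under that assumption, the fixed rule is the same as multiplying the chunks by a 0/1 permutation matrix, and such a matrix could in principle be replaced by trainable weights. All function names here are hypothetical.

```python
def rule_based_mix(tokens, num_heads):
    """Fixed rule (assumed): output token h = chunk h of every input token, concatenated."""
    chunk = len(tokens[0]) // num_heads
    return [
        [v for tok in tokens for v in tok[h * chunk:(h + 1) * chunk]]
        for h in range(num_heads)
    ]


def permutation_weights(num_tokens, num_heads):
    """0/1 matrix that reproduces the fixed rule when applied to the chunks."""
    n = num_tokens * num_heads
    weights = [[0.0] * n for _ in range(n)]
    for t in range(num_tokens):
        for h in range(num_heads):
            # Input chunk (token t, head h) lands in slot t of output token h.
            weights[h * num_tokens + t][t * num_heads + h] = 1.0
    return weights


def parameterized_mix(tokens, num_heads, weights):
    """Mix chunks with an explicit weight matrix instead of a hard-coded rule."""
    num_tokens = len(tokens)
    chunk = len(tokens[0]) // num_heads
    # Flatten the input into chunks in (token, head) order.
    chunks = [tok[h * chunk:(h + 1) * chunk] for tok in tokens for h in range(num_heads)]
    n = len(chunks)
    # Each output chunk is a weighted sum of all input chunks.
    mixed = [
        [sum(weights[o][i] * chunks[i][d] for i in range(n)) for d in range(chunk)]
        for o in range(n)
    ]
    # Regroup: output token h is the concatenation of num_tokens consecutive chunks.
    return [
        [v for c in mixed[h * num_tokens:(h + 1) * num_tokens] for v in c]
        for h in range(num_heads)
    ]


tokens = [[1, 2, 3, 4], [5, 6, 7, 8]]
weights = permutation_weights(num_tokens=2, num_heads=2)
# With permutation weights, the parameterized form matches the fixed rule.
assert parameterized_mix(tokens, 2, weights) == rule_based_mix(tokens, 2)
```

In this sketch the weights are plain nested lists initialized to the permutation. In a real model they would be trainable parameters, and how UniMixer actually shapes, constrains, or factorizes them is not stated in the abstract.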