School of Electrical Engineering and Computer Science, The University of Queensland
Forget external text corpora – this new method unlocks surprisingly effective sequential recommendations by cleverly routing and filtering token embeddings from multiple LLMs.
Seedance 2.0 leapfrogs existing models by unifying multi-modal inputs (text, image, audio, video) into a single architecture for generating high-quality, longer-duration audio-video content.
Large reasoning models (LRMs) already know when to stop reasoning, but current sampling methods are holding them back.
Stop overfitting your reward model: R2M leverages real-time policy feedback to dynamically align the reward model with the evolving policy distribution, reducing reward overoptimization in RLHF.