MoEs, despite their scaling advantages, suffer from a surprising "spectral plasticity loss" in continual RL, but a simple Parseval penalty can recover performance.
By actively pruning erroneous predictions from its memory graph, HGR achieves a 4.5x reduction in revisits to incorrect regions, demonstrating that retracting hypotheses is as important as generating them for long-horizon embodied navigation.
Forget hand-crafted reward functions: MVR uses multi-view video and a frozen VLM to automatically shape RL rewards, teaching agents complex motions without getting stuck on static poses.