NLPR & MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
Test-time RL's vulnerability to noisy pseudo-labels is amplified by group-relative advantage estimation, but can be mitigated with a surprisingly simple debiasing and denoising approach.
Current audio-language models are surprisingly bad at controlling and interpreting subtle vocal cues, failing in nearly half of situational dialogue scenarios.
EVT achieves 86.6% top-1 accuracy on ImageNet-1k without extra training data, a strong result for Vision Transformers.
Overconfident tokens, often missed by entropy-based methods, carry surprisingly dense corrective signals in on-policy distillation, allowing for near-baseline performance with <10% of tokens.
Robots can now learn contact-rich manipulation skills the way humans do, by feeling the forces involved, thanks to a new multimodal interface that captures synchronized visual, tactile, and force data.
A principled framework for General World Models reveals the limitations of current systems and the architectural requirements for future progress.
Overconfident errors in RLVR monopolize probability mass and suppress exploration, but a confidence-aware penalty fixes this and boosts mathematical reasoning performance.