LLMs that ace math and physics still struggle with general reasoning, achieving only 63% accuracy on a new K-12-level benchmark.
Surprisingly, general-purpose vision models already contain better action representations for robotic control than specialized embodied models trained explicitly for that purpose.
Differential privacy in language tasks is surprisingly cheap: approximate DP is free, and pure DP only reduces performance by a factor of $\min\{1,\varepsilon\}$.
LLMs can achieve massive performance gains on reasoning and knowledge-intensive tasks simply by iteratively refining their answers using pseudo-labels derived from unlabeled data.
Diffusion watermarks, thought to be robust, crumble under a simple stochastic resampling attack that breaks trajectory reconstruction.
LongCat-Next breaks from the language-centric paradigm by unifying text, vision, and audio in a single autoregressive model with minimal modality-specific design, reconciling understanding and generation in discrete vision modeling.
Fusing asynchronous LiDAR and camera data yields more robust 3D multi-object tracking, achieving state-of-the-art results on nuScenes.