Language-driven video generation in Qwen-RobotWorld achieves unprecedented accuracy in predicting robotic actions, outperforming existing models across key benchmarks.
FTP-1 not only excels on familiar tactile sensors but also achieves unprecedented success on unseen setups, redefining the potential for cross-sensor generalization in robotic manipulation.
MSRGC-Net achieves state-of-the-art clustering performance with drastically reduced computational overhead by eliminating the need for iterative training.
Efficient context handling in video tasks can elevate multimodal models to new heights of agency and reasoning capability.
Redefining teacher privilege in multimodal reasoning shows that visually grounded cues can lead to superior performance in on-policy distillation.
Future-L1 shows that preserving visual semantics in latent space can dramatically enhance video event prediction accuracy, outperforming previous models by substantial margins.
Rethinking few-step distillation reveals that the training pipeline's organization is as crucial as the distillation objectives themselves.
Existing text-to-image benchmarks miss the mark on real-world artistic creation, but Qwen-Image-Bench finally provides a creator-centric evaluation that reliably distinguishes state-of-the-art models.
Ditch the rigid grid: SP-MoMamba uses superpixels to let Mamba-based super-resolution models "see" images like humans do, boosting performance and efficiency.
Today's best AI agents can only solve 55% of real-world academic tasks that university students find challenging, revealing a significant gap between current AI capabilities and the demands of academic workflows.
Generalist robot policies can achieve 95% success rates on real-world manipulation tasks by continually learning from a fleet of robots, even in the face of distribution shifts and long-tail failures.
User-driven privacy ratings of mobile apps reveal significant discrepancies with expert assessments, suggesting a need for more inclusive and user-centric privacy evaluation mechanisms.