Search papers, labs, and topics across Lattice.
2
0
5
Current multimodal browsing agents are surprisingly bad at using visual information on webpages, with even top models scoring below 50% accuracy on a new visual-native search benchmark.
Diffusion models can now generate more realistic and semantically appropriate hand grasps by explicitly modeling affordances and interaction semantics, outperforming prior methods on grasp quality, semantic accuracy, and diversity.