Xidian University
Y-BotFrame transforms quadruped robots into intelligent assistants that can understand and execute natural language commands in real time.
AerialClaw enables UAVs to autonomously interpret missions and adapt to them in real time, moving aerial operations beyond static command sequences.
Ditching modular architectures unlocks surprisingly competitive vision-language performance, proving that end-to-end pixel-to-word models can rival traditional approaches at scale.
Micro-expressions that look identical can convey opposite emotions, and MEDN teases apart motion and emotion cues to spot the difference.
By explicitly disentangling language into global context, spatial relations, and object attributes, ProVG achieves state-of-the-art remote sensing visual grounding, suggesting that fine-grained linguistic cues are key to unlocking performance in this domain.