Shanghai Jiao Tong University
A single visual backbone can now handle perception, reconstruction, and action in streaming video, rivaling specialized models without task-specific fine-tuning.