Search papers, labs, and topics across Lattice.
2
0
4
Fine-tuned vision language models can decode pedestrian crossing intentions with unprecedented accuracy by leveraging contextual cues like ego motion and eye gaze.
Unlock "white-box" reasoning in vision-language models: SegCompass's sparse autoencoder creates an interpretable bridge between visual perception and chain-of-thought, outperforming black-box alignment methods.