Search papers, labs, and topics across Lattice.
5
0
7
LVLMs can achieve SOTA visual reasoning by learning to "see" in a way that optimizes for reasoning, even if it means deviating from strict geometric accuracy.
Achieve state-of-the-art aerial object detection by learning specialized representations across diverse datasets and within complex scenes, and even detect novel object categories without retraining.
Decoupling learning and memory lets models learn new tasks without catastrophic forgetting, outperforming standard regularization techniques.
Forget training wheels: DeepScan unlocks significant gains in LVLM visual reasoning *without* any additional training, achieving state-of-the-art results on V*.
User feedback can boost open-vocabulary object detection by nearly 8 AP on ambiguous examples, with only 30ms latency per interaction.