Achieve 75% input length reduction in LLMs with minimal performance loss by compressing token embeddings directly in the latent space.
Task-specific architectures still outperform large vision-language models at predicting where surgical instruments should interact with tissue.
Unlock realistic OR video synthesis with a diffusion model conditioned on geometric abstractions, enabling controlled generation of rare and safety-critical events.
By learning to intelligently "zoom in" on relevant image regions, TikArt significantly boosts MLLM performance on fine-grained visual reasoning tasks.