Forget expensive training: FlexMem unlocks SOTA long-video MLLM performance on a single GPU by cleverly mimicking human memory recall.
Forget finetuning a new LoRA for every character: EverTale introduces a single LoRA that adapts to *all* characters in a story, enabling continuous character customization with improved fidelity and efficiency.
Existing multimodal models struggle with multi-image reasoning: a new benchmark exposes these shortcomings, and an inference-time attention fix alleviates them.
Steer frozen MLLMs to reason about specific image regions at test time, without any training, by optimizing visual prompts that guide cross-modal attention.