Mags-RL lets multimodal LLMs see the forest *and* the trees, using reinforcement learning to guide a super-resolution agent that selectively enhances image regions for improved reasoning without extra annotations.
VLMs can achieve state-of-the-art adversarial robustness by iteratively refining visual and textual representations through a closed-loop prompting mechanism, even with frozen encoders.
Speaker drift, a subtle but pervasive flaw in modern TTS, can now be automatically detected by prompting LLMs with geometric representations of speaker embeddings.
Standard upsampling methods in XAI systematically corrupt attribution signals, but a novel semantic-aware redistribution approach provably preserves attribution mass and improves explanation faithfulness.