Targeted optimization in underperforming regions boosts document parsing accuracy to a record 96.33%, setting a new benchmark in the field.
Mobile GUI agents are surprisingly susceptible to prompt injection via realistic, attacker-controlled text embedded within ordinary user-generated content, even without modifying the agent, application, or OS.
Forget specialized architectures: StepAudio 2.5 proves a single audio-language foundation, shaped by RLHF, can dominate ASR, TTS, and real-time dialogue simultaneously.
RAG systems can dynamically defend against retrieval corruption with a graph-based energy minimization approach, achieving superior robustness and response quality while slashing storage overhead.
VideoLLMs leak training data: a novel black-box attack recovers membership with surprisingly high accuracy (AUC=0.68) by probing generation brittleness across temperatures.
Decomposing robotic manipulation into coarse and fine-grained actions isn't just conceptually cleaner: it unlocks a sweet spot where learning difficulty is balanced, boosting performance.
FedLLMs, thought to be safer due to data localization, are shockingly vulnerable: a new attack achieves nearly 100% membership inference accuracy, even with differential privacy.
LLMs signal their internal certainty during answer decoding through predictable attention patterns on their own reasoning traces.
Achieve real-time, globally consistent, and photorealistic SLAM in large-scale environments by directly performing loop closure on optimized Gaussian maps.
VLA models can ace the task but still trigger unsafe outcomes, exposing a critical gap between action execution and semantic understanding.
Achieve robust SLAM in dynamic environments without semantic labels or depth sensors by disentangling scene dynamics with a generalizable motion model.
Navigate to objects forever: OVAL enables robots to continuously explore and remember new environments, unlocking truly lifelong object goal navigation.
Even GPT-5.4 can't handle investment banking tasks, failing nearly half the criteria and producing zero client-ready outputs on a new benchmark designed with 500+ bankers.
Incomplete trajectory data got you down? This plug-and-play framework progressively aligns features from incomplete observations with complete ones, boosting prediction accuracy in autonomous driving scenarios.
VLMs that ace digital document parsing benchmarks still stumble badly when faced with real-world scanned, warped, or photographed documents, revealing a significant "reality gap."
You can cut MLLM hallucinations in remote sensing tasks without any training by cleverly exploiting the model's own attention mechanisms to focus on relevant image regions.
LLM-powered pentesting agents fail not because of model limitations, but because they can't estimate task difficulty, leading to wasted effort and premature context exhaustion.
Forget sub-task prediction: the secret to better robot policies is reasoning directly in the action space with a sequence of coarse action intents.