The new REO framework reveals that the true challenge in differential equation discovery lies not just in recovering equations, but in leveraging them to reshape scientific understanding.
Superficial reasoning in video temporal grounding can be transformed into high-quality, time-aware insights with the right optimization framework.
Text world models can transform LLM-based agents from reactive responders to proactive planners, fundamentally changing how they interact with complex environments.
Transforming the KV cache from a monolithic structure into a dynamic, head-aware system could revolutionize LLM serving efficiency and scalability.
MLLMs can be manipulated to produce harmful outputs from benign inputs, exposing a critical vulnerability in their safety mechanisms.
Achieving nearly 50% Recall@1 in video retrieval without any training marks a significant leap in efficiency and effectiveness for complex user queries.
MAAD not only automates architecture design but also enhances the quality of outputs through a collaborative agent framework and advanced LLM integration.
APEIRIA bridges the gap between interpretable neuro-symbolic reasoning and the flexibility of multi-modal language models, achieving superior performance in 3D spatial reasoning.
Finance LLM agents can now block unauthorized actions mid-trajectory without sacrificing performance, thanks to a novel inline safety harness that adaptively routes verification between lightweight and advanced LLM judges.
Disinformation detection gets a major upgrade with ExTax, a framework that doesn't just flag fake news, but explains *how* it manipulates you through persuasion, emotion, and narrative.
Counterintuitively, letting radar cardiac sensors learn to mimic ECGs first yields far better performance on downstream tasks like blood pressure regression and waveform segmentation than directly training on those tasks.
Solving Poisson equations just got faster and more stable: NPSolver trains neural operators without solution labels by iteratively refining predictions with preconditioned conjugate gradient steps.
Reconstructing high-fidelity 3D heart models from noisy radar data is now possible, thanks to a novel mesh deformation approach that leverages physics-informed learning.
Federated learning can overcome data silos, but struggles when clients have different label relationships; FedHarmony shows how to harmonize these differences, leading to better performance.
Code dataset watermarking gets a stealthy upgrade: PuzzleMark hides watermarks in variable names based on code complexity, making them nearly undetectable while guaranteeing perfect verification.
Today's best vision-language models are surprisingly bad at reading scientific figures, failing to match expert-level reasoning on a new benchmark of experimental images.
Forget fully connected relation graphs: CasLayout's sparse relation modeling unlocks enhanced controllability and realism in 3D indoor scene synthesis.
Simple, artist-friendly quad meshes can now be automatically generated on 3D shapes using a diffusion model trained on a continuous surface representation, sidestepping the complexity of discrete mesh optimization.
Today's best language models can barely make sense of your messy group chats and fragmented digital life, achieving only 19% accuracy on a new benchmark of real-world reasoning.
MLLMs are better at understanding videos than directly grounding text queries within them, and a self-correction training loop can close the gap.
RL fine-tuning of discrete diffusion models can be made dramatically more stable and effective by treating the final denoised sample as the action and reconstructing trajectories using the forward diffusion process.
LLMs disperse similar prompts instead of clustering them, leading to significant prompt sensitivity that challenges stability and reliability.
GitHub abuse is more widespread and varied than previously thought, demanding a unified detection approach to safeguard software supply chains.
Ditching caches for compiler-managed data streams, Li Auto's M100 architecture achieves higher utilization than GPUs on autonomous driving tasks, hinting at a new path for efficient AI inference.
LLMs still struggle to understand the meaning of common phrases, idioms, and compound words, revealing critical gaps in semantic reasoning.
Imagine creating high-fidelity, navigable 3D worlds from just a text prompt or a single image – HY-World 2.0 makes it a reality.
PID controllers can now be enhanced with adaptive, data-driven tuning directly from reinforcement learning, retaining their simplicity while improving performance in uncertain environments.