
Top-tier US AI research university. Strong in NLP, ML systems, and computer vision.
Reusing training data during inference can boost imitation learning performance by up to 46%, transforming how we approach generalization in AI systems.
Synthetic data can bootstrap NMT models for low-resource languages, but without authentic inputs, they risk overfitting to rigid structures and losing semantic depth.
Explicitly inferring mental states under uncertainty leads to more thoughtful dialogue than unrestricted information access, challenging conventional wisdom in AI interactions.
Torque Adaptation Module enables zero-shot robust manipulation across different robots without the need for extensive retraining or domain randomization.
Success in long-horizon tasks hinges more on an agent's iterative persistence than on the quality of its initial solution.
Imaginative Perception Tokens boost spatial reasoning in VLMs, achieving a 3.4% accuracy gain on Multiview Counting while outperforming traditional training methods.
Focusing on trajectory relinking rather than more complex models yields a 3% boost in identity tracking accuracy in thermal video, which could redefine best practices in multi-object tracking (MOT).
Achieving a 4.2x latency reduction in long-form ASR without sacrificing accuracy could revolutionize real-time speech applications.
Steering imaginations in video world models can reveal critical failure points in robotic actions that traditional methods might overlook.
Recurrent memory can be added to transformers at scale with minimal parameter overhead and no performance penalty by reusing existing hidden states and training with interleaved parallel updates.
The best LLM to answer a question isn't always the best LLM to *teach* the answer, and matching the "difficulty" of the explanation to the student's current abilities yields better learning.
LLMs can resolve merge conflicts nearly as well as Google's best, but still fail in over 40% of cases, revealing a surprising bottleneck in automating software development.
Stop banning GenAI in STEM assessments: this framework shows you how to thoughtfully integrate it to actually *improve* learning outcomes.
Unlock the long tail of autonomous driving scenarios: Sensor2Sensor turns readily available dashcam footage into high-fidelity, multi-modal sensor data, bridging the gap between data scarcity and the need for robust AV training.
Ophthalmic VQA models can be made more accurate and transparent by explicitly grounding them in spatially-localized lesion evidence, a crucial step towards clinical interpretability.
Training a foundation model on a trillion minutes of wearable sensor data unlocks surprisingly accurate predictions across a wide range of health conditions, even with limited labeled data.
Distributional regret bounds, which quantify the probability of exceeding different regret levels, are now achievable with a UCBVI-style algorithm, confirming a long-standing conjecture for multi-armed bandits.
Forget Gaussian noise: modeling the *decay* of user interest with a custom "burn-down" diffusion process unlocks better personalized recommendations.
AI data annotation companies are publicly framing human expertise as a commodity ripe for disruption, potentially devaluing traditional forms of knowledge and institutional authority.
LLMs can't rebuild software from scratch, even for widely used programs like FFmpeg and SQLite, revealing a critical gap in their ability to make high-level software architecture decisions.
Open-sourcing a VLA model that beats closed-source giants on embodied reasoning tasks could finally make real-world robot deployment practical.
Unlock collaborative AI development in genomics without compromising patient privacy: this framework lets multiple institutions jointly train synthetic data generators on sensitive RNA-seq data using MPC and DP.
Traditional research papers are costing AI agents reproducibility and understanding, but a new "Agent-Native" format that captures the full messy research process boosts performance by up to 20%.
Multimodal models can now achieve state-of-the-art performance in real-world tasks like document understanding and audio-video comprehension with significantly reduced inference latency thanks to novel token-reduction techniques.
LLM agents struggle to maintain performance in multi-day collaborative tasks, dropping significantly after just one environmental update, revealing a critical gap in adaptation to evolving real-world conditions.
The fragmented field of world modeling can now be unified under a "levels x laws" taxonomy, revealing critical gaps in autonomous model revision and decision-centric evaluation.
A surprisingly simple tweak to Hartigan's k-means algorithm unlocks another 2-5% accuracy boost, especially when clustering high-dimensional data.
Modular training with BAR allows independent updates of domain experts, achieving superior performance without the pitfalls of catastrophic forgetting.
Kernel launch overhead is a bigger bottleneck than you think: GPUOS achieves up to 15.3x speedup by fusing operations at runtime.
RosettaSearch recovers up to 68% more structural fidelity in protein designs, transforming how we optimize sequences beyond traditional single-pass methods.
Geometric matrix interpolation reveals hidden common structures in multi-view data, offering a new lens for multi-manifold learning.
Massively multilingual NER just got easier: UNER v2 offers a standardized benchmark for evaluating LLMs across diverse languages.
LLMs are twice as likely as humans to repeat the same support tactic in a conversation, but a simple RL reward for tactic novelty can fix it.
Forget training on closed sets: WildDet3D leverages geometric cues and diverse prompts to achieve SOTA 3D object detection across 13.5K categories in the wild.
Achieving robust brain decoding across subjects without any retraining could revolutionize how we interpret neural signals in diverse populations.
Get 80% of your prompt length back without sacrificing accuracy using a diffusion-based pruning method that can mask multiple tokens at once.
MLLMs can be tricked into missing 90% of harmful content simply by encoding it in images that humans can easily read.
Forget scaling laws: a large VLM strategically paired with a smaller model's reasoning tokens can rival the performance of a much larger, monolithic model.
Serving both image and video diffusion models on the same hardware? GENSERVE's step-level preemption and dynamic resource allocation can boost your service level agreement (SLA) attainment by up to 44%.
Forget catastrophic forgetting: ProTPS leverages vision prototypes to guide text prompt learning, achieving near-upper-bound performance in continual learning scenarios.
Forget hand-designed agent communication topologies: Agent Q-Mix learns decentralized communication strategies that boost accuracy and token efficiency in LLM multi-agent systems.
Claims of quantum advantage in electronic structure calculations must now contend with DMRG benchmarks achieving CAS(89,102) on Fe$_5$S$_{12}$H$_4^{5-}$, pushing the boundaries of classical computation.
Generative multi-agent systems spontaneously exhibit collusion and conformity, mirroring societal pathologies, even without explicit programming and bypassing individual agent safeguards.
Today's best MLLMs are stumped by PerceptionComp, a new video reasoning benchmark where answering questions requires piecing together visual evidence across time and space.
Agentic search gets a meta-RL boost: MR-Search learns to self-reflect and adapt search strategies across episodes, significantly outperforming standard RL baselines.
AI interventions designed to combat ableism can backfire, as biased nudges were often rejected and increased negativity, while inclusive nudges proved more effective as scaffolding for learning.
AI can now (almost) write and direct Saturday Night Live.
LLM-powered VR guides for blind and low vision users are not just tools, but social actors, prompting users to give them nicknames and rationalize their mistakes when others are present.
See in the dark: Dark3R unlocks structure from motion at signal-to-noise ratios below -4dB, where existing methods completely break down.
Existing AI agent permissioning schemes are hard to compare, so this paper provides a formal foundation and reveals a fundamental conflict between training data confidentiality and agent completeness.
LLMs still struggle with factual accuracy in specialized medical domains like pancreatic cancer, with hallucination rates varying wildly and web search integration failing to guarantee better responses.
Learning robotic reward functions from a million trajectories reveals that comparing entire trajectories, not just individual frames, unlocks better generalization and learning from suboptimal data.
Forget full fine-tuning: this dynamic routing strategy lets you adapt dense retrieval to new domains while using just 2% of the parameters.
Hyperspectral video, previously limited by motion artifacts and poor photon utilization, now achieves real-time capture and improved fidelity thanks to active illumination and coded-exposure pixels.
No-regret learning in repeated Bertrand games can lead to surprisingly high prices, challenging classical game theory's low-price predictions.
Unlock robot learning with hidden knowledge: TOPReward extracts surprisingly accurate task progress signals directly from VLM token probabilities, bypassing the need for explicit reward engineering.
Forget passively analyzing model outputs: this new attack actively *trains* the model to regurgitate specific texts, revealing its training data with surprising accuracy.
LMs can learn some human-like linguistic biases from synthetic data, but surprisingly fail to reproduce the strong object preference seen in differential argument marking across human languages.
Stop worrying about false positives: this watermarking scheme guarantees unforgeability and recoverability, ensuring content is linked exclusively to its generating model even under substitution attacks.
Forget RL fine-tuning: this paper shows you can beat it at cold-start personalization with a tiny model and clever Bayesian inference over structured preference priors.
Forget synthetic benchmarks that don't translate: MolmoSpaces offers 230k diverse, simulator-agnostic environments with 130k annotated objects, showing a remarkable 0.96 sim-to-real correlation for robot policies.
Open-weight coding agents can now be cheaply and rapidly specialized to private codebases, thanks to a new supervised finetuning method that slashes training costs by over 25x.
This study establishes self-supervised learning (SSL) as a promising paradigm for ECG analysis, particularly when annotated data is limited, improving the accessibility, generalizability, and fairness of AI-driven cardiac diagnostics across diverse clinical environments and clinical questions.
Moxin 7B and its variants (VLM, VLA, Chinese) offer a new suite of fully transparent, open-source multimodal models, pushing beyond simple weight sharing to enable deeper customization and collaborative research.
Robots can now navigate more reliably and across different bodies (wheeled vs. legged) thanks to a hierarchical model that separates high-level planning from low-level physical constraints.
Open-source biomolecular modeling just got a boost: RF3 closes the gap with AlphaFold3 in structure prediction, thanks to the new AtomWorks data framework.
Robot foundation models can achieve state-of-the-art performance by explicitly reasoning about spatial plans as editable trajectory traces, rather than directly mapping perception to control.
Train better aligned LLMs with 10% of the data by strategically focusing on the most difficult preference comparisons.
Despite claims of safety alignment, state-of-the-art LLMs still spill the beans on hazardous scientific knowledge at an alarming rate, failing nearly 80% of the time on a new regulation-grounded benchmark.
Self-supervised learning beats supervised learning for ECG interpretation when labeled data is scarce, unlocking more robust and generalizable AI-driven cardiac diagnostics.