Search papers, labs, and topics across Lattice.

MIT's Computer Science and Artificial Intelligence Laboratory. One of the largest and oldest AI labs in academia.
Stop rewarding all LLM-generated candidates equally: ShapE-GRPO uses Shapley values to fairly distribute credit within sets, leading to better training and faster convergence.
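The blurb doesn't spell out ShapE-GRPO's construction, but exact Shapley credit assignment over a small candidate set is standard and easy to sketch. A minimal illustration (the value function `v` below is hypothetical — best single-candidate score plus a small diversity bonus):

```python
from itertools import permutations

def shapley_values(items, value_fn):
    """Exact Shapley values: average each item's marginal contribution
    over all orderings of the set (O(n!), fine for small candidate sets)."""
    contrib = {i: 0.0 for i in items}
    orders = list(permutations(items))
    for order in orders:
        coalition = []
        prev = value_fn(coalition)
        for i in order:
            coalition.append(i)
            cur = value_fn(coalition)
            contrib[i] += cur - prev
            prev = cur
    return {i: c / len(orders) for i, c in contrib.items()}

# Hypothetical per-candidate scores for three LLM-generated candidates.
scores = {"a": 1.0, "b": 0.8, "c": 0.1}

def v(coalition):
    """Hypothetical set-level reward: best member plus a diversity bonus."""
    if not coalition:
        return 0.0
    return max(scores[i] for i in coalition) + 0.05 * (len(coalition) - 1)

phi = shapley_values(list(scores), v)
```

By the efficiency axiom, the credits sum exactly to the full set's reward, so strong candidates get more of it than weak ones — unlike rewarding all candidates equally.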
Freeing robots from pre-assigned tasks slashes completion times in multi-agent settings, with a new algorithm improving performance on almost 90% of tested scenarios.
Robots can now "see" hidden objects and understand articulation by learning from human egocentric video, even if they can't physically explore those areas themselves.
Demystifying LLMs for the masses might be as simple as turning their mechanics into a game.
Hyperpolarizing the nuclear spin bath surrounding a molecular qubit can significantly extend its coherence time, offering a new knob for quantum control.
Rényi divergence may be the missing key to understanding thermal equilibrium in quantum systems, revealing a novel constraint on wavefunction ensembles.
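For readers unfamiliar with the quantity involved: the Rényi divergence is a one-parameter family generalizing KL divergence, $D_\alpha(P\|Q) = \frac{1}{\alpha-1}\log\sum_i p_i^\alpha q_i^{1-\alpha}$. A minimal discrete-distribution sketch (the example distributions are arbitrary):

```python
import math

def renyi_divergence(p, q, alpha):
    """Rényi divergence D_alpha(P || Q) for discrete distributions,
    alpha > 0, alpha != 1; the alpha -> 1 limit recovers KL divergence."""
    s = sum(pi**alpha * qi**(1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1)

p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]
```

It is zero iff $P = Q$ and non-decreasing in $\alpha$, which is what makes different orders $\alpha$ different-strength constraints on an ensemble.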
Neural networks can accurately predict polymer free energies, even when traditional methods like Bennett Acceptance Ratio fail due to poor phase-space overlap.
Heuristic maritime routes lead to extreme fuel waste in nearly 5% of voyages, but this RL approach cuts that risk by almost 10x.
Using a top- or bottom-performing LLM as the anchor in "LLM-as-a-judge" benchmarks can dramatically skew results; choosing a mid-tier anchor turns out to be key to reliable evaluation.
Video generative models already contain powerful image restoration priors, and can be coaxed into state-of-the-art performance with just 1,000 training examples.
Particle filter models of sentence processing inherently predict "digging-in" effects—where disambiguation difficulty increases with the length of the ambiguous region—a phenomenon not captured by surprisal-based models.
MLLMs can now handle 4K videos up to 100x faster thanks to AutoGaze, which selectively attends to only the most informative patches.
Fine-tuning unlocks LLMs' surprising ability to predict how memorable a sentence is and how long it takes to read, exceeding traditional methods.
Hyper-redundant robots get a 75% accuracy boost thanks to a neural network that adaptively blends learned behavior with kinematic priors.
Uncover hidden network structure and simplify management by automatically classifying hosts into meaningful roles based on their connection patterns.
Zero-shot robotic manipulation is now within reach: TiPToP matches a 350-hour fine-tuned model without *any* robot data.
Beat the state-of-the-art in radio signal separation by 122x using a transformer trained with a cross-entropy loss, and the same architecture could work for gravitational waves.
Scale qualitative analysis of educational discourse data without sacrificing rigor using a mixed-initiative system that orchestrates LLMs and human expertise.
By dynamically adjusting contrastive learning temperatures based on data density, MM-TS achieves state-of-the-art results on multimodal long-tail datasets.
Forget hand-engineered features: this approach learns symbolic representations for robotic planning directly from pixels using VLMs, enabling impressive zero-shot generalization to new environments and goals.
Most repeat phishing clicks reflect stable employee characteristics, not the lingering effect of prior failures, challenging common assumptions about habit formation in cybersecurity training.
LLMs that ace static code-fixing benchmarks may still struggle to maintain code quality over the long, iterative haul of real-world software development.
Building a complete web application from scratch remains a surprisingly hard task for even the best AI models, with top performance at only 58% accuracy on a new end-to-end benchmark.
Forget simulated manipulation—ManipulationNet offers a global infrastructure for benchmarking robots in the real world, complete with standardized hardware and software, to finally measure progress toward general manipulation.
NeuroSkill™ offers real-time, edge-based human-AI interaction by directly modeling human state of mind from BCI data, enabling more nuanced and empathetic agentic responses.
Lattice QCD calculations just got a whole lot faster: normalizing flows slash variance by up to 60x in key observables.
LLMs struggle to reliably predict numerical materials properties, even after fine-tuning, and their performance fluctuates wildly over time, casting doubt on their use in high-stakes scientific applications.
Learning robotic reward functions from a million trajectories reveals that comparing entire trajectories, not just individual frames, unlocks better generalization and learning from suboptimal data.
Forget computationally expensive fluid dynamics: this work shows that a simple, stateless model, carefully calibrated to real-world data, can create surprisingly effective digital twins for soft underwater robots.
Standard winrate metrics in LLM evaluation can backfire, incentivizing model creators to produce homogenous models that actually *decrease* overall consumer welfare.
Feminist participatory annotation workshops reveal the nuanced tensions between contextual richness, pluralism, and the practical need for bounded consensus in AI data work.
E(3)-equivariant networks just got a whole lot faster: a new algorithm cuts the complexity of Clebsch-Gordan Tensor Products from $O(L^6)$ to $O(L^4\log^2 L)$ without sacrificing completeness.
Forget gradient projections – NESS sidesteps catastrophic forgetting by directly exploiting the null space of previous tasks, identified via small singular values, to constrain weight updates.
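NESS identifies the null space via small singular values of an SVD; that machinery aside, the core idea — project each update so it cannot disturb responses to previous-task inputs — can be sketched with plain Gram–Schmidt over a few stored old-task input directions (a simplification, not the paper's method):

```python
def project_out(grad, old_inputs, tol=1e-8):
    """Project a weight update into the null space of previous-task
    inputs: orthonormalize the old input directions (Gram-Schmidt),
    then subtract the update's components along them."""
    basis = []
    for x in old_inputs:
        v = list(x)
        for b in basis:
            c = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > tol:  # drop near-dependent directions
            basis.append([vi / norm for vi in v])
    g = list(grad)
    for b in basis:
        c = sum(gi * bi for gi, bi in zip(g, b))
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g

old = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
g = project_out([0.5, -0.3, 0.2], old)
```

The projected update is orthogonal to every stored old-task input, so a linear layer's outputs on those inputs are unchanged — the mechanism that sidesteps forgetting.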
Nightly hospital planning is now possible on a laptop: this work distills slow, complex agent-based epidemic models into fast, trustworthy surrogate models using neural ODEs, achieving a 10,000x speedup.
LLMs can be made significantly safer by steering their latent space trajectories with Control Barrier Functions, preventing unsafe outputs without retraining.
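The paper applies Control Barrier Functions in latent space; the CBF idea itself is easiest to see in a toy 1-D control problem (this sketch is that toy, not the paper's latent-space construction). With dynamics $\dot x = u$ and barrier $h(x) = x_{\max} - x$, enforcing $\dot h \ge -\alpha h$ reduces to the filter $u \le \alpha h(x)$:

```python
def cbf_filter(u_nom, x, x_max=1.0, alpha=2.0):
    """CBF safety filter for the 1-D integrator x' = u with barrier
    h(x) = x_max - x: enforcing h' >= -alpha*h caps u at alpha*h(x)."""
    h = x_max - x
    return min(u_nom, alpha * h)

# An aggressive nominal controller always pushes hard toward the boundary;
# the filter overrides it only when needed, keeping x inside the safe set.
dt, x = 0.01, 0.0
traj = []
for _ in range(500):
    u = cbf_filter(u_nom=5.0, x=x)
    x += dt * u
    traj.append(x)
```

The filtered trajectory approaches the boundary $x_{\max}$ asymptotically without ever crossing it — the same "steer away before you hit the unsafe set" behavior the paper applies to latent trajectories.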
BabyLM 2026 seeks to push the boundaries of data-efficient and cognitively plausible language models, now with a multilingual twist.
Agentic AI can automate complex optical systems control with near-perfect success rates, leaving code-generation approaches in the dust.
By aligning a generative flow network with physics-based stability proxies via reinforcement learning, PackFlow drastically improves the efficiency of molecular crystal structure prediction, offering a practical route to circumvent the costly relax-and-rank bottleneck.
Achieve robust safety-critical control with a single hyperparameter by using a novel Taylor-Lagrange formulation that directly incorporates control actions into the current time step.
Decomposing Bellman values into a graph of simpler objectives lets agents master complex, high-dimensional tasks with less tuning and better safety.
Stop repeating avoidable mistakes in public robot deployments: here's a community-vetted checklist to guide your next study.
Even perfectly rational users can fall prey to "AI psychosis" due to chatbots' sycophantic tendencies, and simply warning users or preventing hallucinations isn't enough to stop it.
Randomly initialized encoders can match state-of-the-art pre-trained models on many ECG representation learning tasks, suggesting current benchmarks are misleading.
Control hybrid rigid-soft robots with the ease of AR teleoperation, thanks to a new pipeline that accurately models the soft robot's real-world behavior in simulation.
Verifiable LLM inference becomes practical: privacy-preserving techniques unlock verification at near-zero cost, outperforming ZKPs.
Independently trained multimodal models like CLIP aren't so independent after all: a single orthogonal transformation can align their embedding spaces across both image and text modalities.
VLMs are nowhere near human-level general intelligence: they score less than 10% of human performance across a diverse set of human-designed games, especially struggling with world-model learning, memory, and planning.
Boltzmann Draw offers a statistically-grounded coin selection algorithm that reduces dust and wallet size compared to existing methods, making it a promising alternative for token-based payment systems.
Ditch the geometry-to-property map: this work uses the external potential as the primary input for machine learning models, unlocking a scalable and equivariant approach to predicting electronic structure.
Forget hand-engineering initial conditions for robust RL: this method *learns* which conditions are feasible while simultaneously training a safe policy.
VLMs can be easily swayed by subtle, optimized visual prompts, revealing vulnerabilities in their decision-making processes that could be exploited in real-world applications.
By cleverly repurposing text-to-video diffusion models, VideoSketcher achieves high-quality sequential sketch generation from extremely limited human-drawn sketch data.
Ditch the equivariant constraints: canonicalization lets you train simpler, faster diffusion models that actually *outperform* equivariant architectures for symmetric generative tasks like 3D molecule design.
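Canonicalization in general means mapping every symmetry-equivalent input to one canonical representative before feeding a plain, unconstrained model. The simplest instance is permutation symmetry over a point set — sort to a canonical order (the paper targets rotations in 3D, which need a canonical frame, e.g. via PCA; this sketch only shows the permutation case):

```python
def canonicalize(points):
    """Map any ordering of a point set to one canonical ordering
    (lexicographic sort), so a plain non-equivariant model sees the
    same input no matter how the set was permuted."""
    return sorted(points)

def plain_model(points):
    """Stand-in for a non-equivariant network: output depends on order."""
    return sum((i + 1) * sum(p) for i, p in enumerate(points))

pts = [(0.0, 1.0), (2.0, -1.0), (1.0, 0.5)]
shuffled = [pts[2], pts[0], pts[1]]
```

The raw model gives different answers for the two orderings; composed with `canonicalize`, it becomes exactly invariant — without any constraint on the architecture itself.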
Forget brute-force search: a new mapper finds provably optimal accelerator mappings with fusion for Transformers over 1000x faster.
LLMs can now generate complex, physically plausible 3D scenes for robotics simulation by iteratively proposing assets and refining arrangements based on physics engine feedback.
Find optimal DNN accelerator mappings in under a minute, something previously impossible, and expose the suboptimality of prior mapping heuristics.
LLM benchmark accuracy jumps 10% when evaluated on a cleaned-up version of Humanity's Last Exam, highlighting the significant impact of dataset noise on performance metrics.
Injecting spatial transcriptomics data into existing pathology foundation models unlocks significant performance gains across a range of downstream tasks, including molecular status prediction and gene-to-image retrieval.
HybridRAG-Bench reveals that existing benchmarks overestimate the reasoning abilities of retrieval-augmented LLMs due to contamination, offering a more realistic evaluation using up-to-date scientific knowledge.
Ditching reward magnitudes for rankings unlocks faster and better RLHF, especially when judging quality is subjective.
Hematogenous infection, elevated CRP and PMN%, and resistant organisms are independently associated with DAIR failure in acute PJI, allowing for risk stratification using a new nomogram.
Waterfilling-inspired quantization ("WaterSIC") slashes the quantization error in LLMs by intelligently allocating bits based on weight covariance, outperforming standard techniques like GPTQ.
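WaterSIC's covariance-based rule isn't detailed in this blurb, but the waterfilling intuition — spend bits where they reduce distortion most — has a classic greedy form. A sketch with hypothetical per-group weight variances (each extra bit quarters a group's quantization distortion $\sigma^2 2^{-2b}$):

```python
import heapq

def waterfill_bits(variances, total_bits):
    """Greedy waterfilling-style bit allocation: repeatedly give the next
    bit to the component whose current distortion variance * 2^(-2b)
    is largest (each bit quarters that component's distortion)."""
    bits = [0] * len(variances)
    heap = [(-v, i) for i, v in enumerate(variances)]  # max-heap via negation
    heapq.heapify(heap)
    for _ in range(total_bits):
        neg_d, i = heapq.heappop(heap)
        bits[i] += 1
        heapq.heappush(heap, (neg_d / 4.0, i))
    return bits

alloc = waterfill_bits([16.0, 4.0, 1.0], total_bits=9)
```

High-variance groups end up with more bits (here 4/3/2 rather than a uniform 3/3/3), matching the closed-form reverse-waterfilling allocation $b_i = B/n + \tfrac{1}{2}\log_2(\sigma_i^2 / \bar\sigma^2)$ with $\bar\sigma^2$ the geometric mean.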
This study establishes SSL as a promising paradigm for ECG analysis, particularly in settings with limited annotated data, enhancing accessibility, generalizability, and fairness in AI-driven cardiac diagnostics across diverse clinical environments and tasks.
Quadrupedal robots can now nimbly navigate stairs and rough terrain thanks to a new multimodal RL approach that doesn't require feeling around with its front feet.
GPT-5's real-time router learns to route queries to specialized models, making it faster and more useful than its predecessors.
Despite progress in AI safety, it's still largely unknown how effective current safeguards are at preventing AI harms, and their effectiveness varies wildly.
Forget expensive human annotation: this dual-loop method automatically cleans remote sensing image-text datasets, boosting T2I model performance by over 35%.
Achieve state-of-the-art video face enhancement with VividFace, a one-step diffusion model that drastically cuts inference time while boosting perceptual quality and temporal consistency.
Open-weight reasoning models now rival proprietary systems in agentic capabilities and benchmark performance, thanks to gpt-oss-120b and gpt-oss-20b.
Self-supervised learning beats supervised learning for ECG interpretation when labeled data is scarce, unlocking more robust and generalizable AI-driven cardiac diagnostics.
Achieving 80% accuracy on VQA v2.0 shows that combining Visual BERT, ViLT, and memory-augmented attention can significantly outperform traditional VQA models.
A novel system enables robotic hands to achieve perfect motion recognition in games by fusing CNN-based vision with adaptable reinforcement learning.