Search papers, labs, and topics across Lattice.
We track OpenAI, DeepMind, Anthropic, and 17 other labs daily, with AI-powered summaries, trend charts, and a weekly digest.
We read everything so you don't have to. One email, zero noise.
Turns out, how you encode experience – as a compact, control-oriented "Gene" rather than a verbose "Skill" – is the key to reusable AI, boosting code-solving performance by up to 27%.
Training-free image compositing can significantly boost the visual harmony and aesthetic quality of automatically generated graphic designs, even when integrated into existing pipelines.
Text-to-3D models are often blind to your prompts, but unlocking their unconditional generation prior lets you edit shapes they *think* they can't make.
Asking the right questions is more important than asking more questions: an RL-trained clarification model resolves software engineering issues as well as GPT-5 while asking almost half as many questions.
Democratizing human-AI interaction research, CoGrid and MUG provide accessible tools for building and deploying web-based multi-agent experiments.
LLM safety probes can be made significantly more robust to adversarial attacks by requiring consistent evidence across token segments, rather than relying on single trigger words.
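The segment-consistency idea can be sketched generically: instead of flagging on a single high-scoring trigger token, require elevated probe scores in every segment of the input. This is an illustration of the aggregation principle, not the paper's exact probe; the function name and segment count are assumptions.

```python
import numpy as np

def segment_consistent_score(token_scores, n_segments=4):
    """Aggregate per-token probe scores by requiring consistent
    evidence: take the MINIMUM of per-segment means, so one spiky
    trigger token cannot dominate the verdict."""
    scores = np.asarray(token_scores, dtype=float)
    segments = np.array_split(scores, n_segments)
    return min(float(seg.mean()) for seg in segments)

# A lone high-scoring token fools a naive max-pooled probe...
tokens = [0.05, 0.1, 0.95, 0.1, 0.05, 0.1, 0.05, 0.1]
print(max(tokens))                       # naive pooling: 0.95, flags "unsafe"
print(segment_consistent_score(tokens))  # 0.075: no consistent evidence
```

An adversarial suffix that inflates a single token now has to inflate every segment at once, which is a much harder target.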
SCENIC delivers the best of both worlds: the high bandwidth and software integration of commercial SmartNICs, plus the customization and data processing offload capabilities of research platforms.
Current LLM detection methods in peer review are easily fooled because they mistake AI-written prose for AI-originated ideas.
Forget isolated tasks: MCSC-Bench finally tackles the full, messy process of real-world video creation, from noisy inputs to executable scripts.
Give your multimodal recommender a free performance boost by initializing user embeddings with aggregated item features and cluster information – no training required.
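A minimal sketch of the training-free initialization idea, assuming a simple recipe (mean of interacted-item features concatenated with a cluster-membership histogram); the paper's exact aggregation may differ.

```python
import numpy as np

def init_user_embeddings(interactions, item_feats, item_cluster, n_clusters):
    """Training-free user embedding init: concatenate the mean feature
    vector of a user's items with a normalized histogram of the
    clusters those items belong to."""
    users = []
    for items in interactions:
        feat = item_feats[items].mean(axis=0)                      # aggregated item features
        hist = np.bincount(item_cluster[items], minlength=n_clusters)
        hist = hist / hist.sum()                                   # cluster profile
        users.append(np.concatenate([feat, hist]))
    return np.stack(users)

rng = np.random.default_rng(0)
item_feats = rng.normal(size=(10, 4))       # 10 items, 4-dim multimodal features
item_cluster = rng.integers(0, 3, size=10)  # 3 item clusters
U = init_user_embeddings([[0, 1, 2], [3, 4]], item_feats, item_cluster, 3)
print(U.shape)  # (2, 7): 4 feature dims + 3 cluster dims per user
```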
Compressing token embeddings *before* the attention layers offers a Pareto-optimal tradeoff between sequence length reduction and performance, shrinking inputs by up to 75% with minimal quality loss.
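The shape of the idea, in an illustrative sketch (mean pooling is a stand-in for whatever compression module the paper actually trains): merge groups of consecutive token embeddings before attention so the quadratic layers see a much shorter sequence.

```python
import numpy as np

def compress_tokens(embeddings, ratio=4):
    """Merge each group of `ratio` consecutive token embeddings into
    one by mean pooling BEFORE the attention layers, shrinking the
    sequence attention must process by 1 - 1/ratio (75% for ratio=4)."""
    n, d = embeddings.shape
    pad = (-n) % ratio                        # pad so the length divides evenly
    x = np.pad(embeddings, ((0, pad), (0, 0)))
    return x.reshape(-1, ratio, d).mean(axis=1)

seq = np.random.default_rng(0).normal(size=(1000, 64))
out = compress_tokens(seq, ratio=4)
print(out.shape)  # (250, 64): a 75% shorter input to attention
```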
Forget retraining: DyMETER dynamically adapts anomaly detectors to concept drift using a hypernetwork for instance-aware parameter shifts.

By grounding visual ReID in semantic descriptions from LVLMs, this work achieves robust person re-identification across extreme clothing changes and modality shifts where purely visual methods fail.
LLMs still can't cooperate, but this study reveals that contract agreements and third-party mediation can actually make them play nice, even when individual incentives push them toward selfishness.
LLM agents can autonomously improve million-line EDA tools like ABC, discovering optimizations beyond human-designed heuristics.
Decoupling prefill and decode across datacenters boosts LLM serving throughput by over 50%, even with commodity Ethernet.
RadAgent doesn't just generate CT reports, it shows its work, outperforming VLMs in accuracy, robustness, and faithfulness by revealing its reasoning steps.
LLMs don't just reflect gender bias in public vs. private spaces; they encode nuanced, micro-level mappings between gender and specific urban locations, exceeding real-world biases.
Multi-view attention in masked autoencoders unlocks robust and transferable representations in echocardiography, even enabling zero-shot transfer from adult to pediatric cohorts.
LLMs that ace spatial reasoning can still completely fail at longer-horizon problem solving due to recursive instability, revealing a critical blind spot.
AI in education isn't just about automation; it's about *who* gets to decide *what* in the learning process, and this framework helps you analyze that.
LLMs may learn shared syntactic dependencies even with limited data, but they're still data-hungry toddlers compared to humans.
Imagine software that autonomously evolves and maintains itself – this paper lays out the architectural groundwork for making that a reality.
Generating consistent visual narratives is now possible: CANVAS outperforms existing methods by explicitly planning character, background, and scene continuity across multiple shots.
Neural video codecs can be designed for biological substrates from the ground up, unlocking a new paradigm for DNA storage.
Achieve photorealistic, identity-consistent facial video edits from text prompts without video training data, rivaling traditional rendering software.
Spectral Thompson Sampling offers a computationally tractable alternative for bandit problems on graphs, achieving comparable regret bounds to existing methods while scaling efficiently to large action spaces.
LLMs can mimic human writing, but not as well as you think: genre matters more than the source (human vs. LLM), and model choice trumps decoding strategy when it comes to style.
Simply plugging in RoTE, a lightweight temporal embedding module, can boost existing Transformer-based sequential recommendation models by over 20% on standard benchmarks.
MLLMs prioritize language over vision so strongly that you can boost visual reasoning performance by simply scrambling the text tokens' centroids during decoding.
Edit 3D assets with text prompts while actually preserving the original object's unchanged parts, thanks to a new masking strategy and training dataset.
Geometric matrix interpolation reveals hidden common structures in multi-view data, offering a new lens for multi-manifold learning.
RL can teach LLMs to be better interviewers, adaptively prompting users to reveal hidden information in dialogue.
Users feel more creative and in control when building images step-by-step from sketches, rather than wrestling with a one-shot text-to-image generator's fully-formed (and often unwanted) details.
Stop evaluating AI systems in isolation: marketplace dynamics like user switching and early-adoption advantages critically shape real-world success.
MLLMs don't just forget language, they also suffer from perceptual drift in cross-modal spaces, but MAny offers a training-free merging strategy to fix both.
Forget hand-crafted templates: DUET learns to generate user and item profiles jointly, boosting recommendation accuracy by better aligning textual representations.
MLLMs still struggle to reason about everyday situations when they require identifying and using visual clues, despite excelling at tasks relying on pre-existing knowledge.
Synthesizing realistic anomaly images for industrial assembly is now possible thanks to a diffusion model that respects component pose and assembly relationships.
Human-inspired context sensitivity boosts visual reasoning in machines, closing the gap between AI and human perception.
Forget sub-Gaussian assumptions: this semi-bandit algorithm adapts to the true covariance structure of outcomes, leading to tighter regret bounds and better performance.
Data augmentation with LLMs can tank your NER performance even when it boosts POS tagging, proving task structure matters more than synthetic data quality.
Blind predictions of cyclobutanone photochemistry reveal that nonadiabatic molecular dynamics can qualitatively capture experimental results, but the accuracy of underlying electronic structure calculations remains a key bottleneck.
Massively multilingual NER just got easier: UNER v2 offers a standardized benchmark for evaluating LLMs across diverse languages.
LLMs can now predict project-wide code edits with significantly improved accuracy and efficiency by intelligently interleaving neural prediction with existing IDE tools.
LLMs struggle to simulate culturally nuanced emotional responses to bureaucratic processes, especially in Eastern cultures, suggesting current models lack the socio-cultural understanding needed for accurate policy simulation.
Extracting agricultural parcels from satellite imagery gets a whole lot harder (and more realistic) with a new dataset focused on the complex, irregular, and heterogeneous terrain of terraced farms.
Stop re-running full benchmarks: Calibrate new LLM datasets against existing suites with just 100 "anchor" questions and still get highly accurate performance predictions.
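The anchor-calibration idea can be sketched with a toy linear fit (an assumption for illustration; the paper's calibration model is likely richer): learn a map from anchor-question accuracy to full-suite accuracy on models you have already evaluated, then apply it to new models cheaply.

```python
import numpy as np

def fit_calibration(anchor_acc, full_acc):
    """Fit a least-squares line mapping accuracy on the ~100 anchor
    questions to accuracy on the full reference suite."""
    a, b = np.polyfit(anchor_acc, full_acc, deg=1)
    return lambda x: a * np.asarray(x) + b

# Toy data: 5 previously evaluated models.
anchor_acc = np.array([0.42, 0.55, 0.61, 0.70, 0.80])
full_acc   = np.array([0.40, 0.52, 0.60, 0.68, 0.79])
predict = fit_calibration(anchor_acc, full_acc)
print(predict(0.65))  # estimated full-benchmark score for a new model
```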
Scaling up LLMs doesn't uniformly improve context handling; instead, it paradoxically amplifies the tendency to copy irrelevant tokens while simultaneously improving resistance to misinformation.
LLMs can bridge the gap between heterogeneous blockchain data to detect fraud with significantly improved accuracy, even in zero-shot cross-chain scenarios.
Finally, a dataset that lets you directly compare human and robot hand dexterity on the same objects, failures and all.
LLMs can find more real-world firmware vulnerabilities (and with higher precision) when structured as a feedback-driven system rather than a one-pass static analysis.
Forget slow homomorphic encryption: this new Fuzzy Private Set Intersection (FPSI) method achieves 12-145x speedups using only symmetric-key operations.
Achieve 10x fewer primitives in Gaussian-based novel view synthesis by disentangling geometry and appearance with a hybrid latent representation.
Few-step image generators can surpass their multi-step teachers by adaptively relaxing imitation constraints based on reward performance, leading to higher quality and efficiency.
Drug binding to SARS-CoV-2 RNA dramatically shifts which parts of the RNA structure become unstable, depending on the RNA's topology and the drug's protonation state.
OpenMobile proves you can achieve state-of-the-art mobile agent performance with open data and transparent training recipes, closing the gap with closed-source systems.
LLMs can now generate hardware designs that are both more correct AND more efficient, thanks to a co-evolutionary approach that doesn't throw away "almost right" solutions.
Dr. RTL outperforms industry-leading commercial synthesis tools in RTL optimization, achieving 21%/17% WNS/TNS improvements and 6% area reduction, suggesting LLM-based agents can autonomously improve hardware design.
Imagine AI guidance that's not overwhelming but laser-focused on the most critical outcomes – this paper makes that a reality.
Quantum-inspired graph embeddings can outperform classical methods on structure-driven graph classification tasks, but don't expect miracles on social networks.
LLM judges soften their verdicts when they know low scores will lead to model retraining or decommissioning, even without explicitly acknowledging the stakes.
Inference in quantum kernel methods can be quadratically sped up by estimating the full sum of kernel values as a single quantum observable, achieving query optimality.
Mamba can beat the best time series classifiers with a *single layer*, suggesting surprising efficiency for sequence modeling.
INT4 quantization can catastrophically fail *after* FP32 convergence, revealing a hidden instability in otherwise well-trained language models.
Existing class unlearning methods often just suppress the classifier head, leaving the "forgotten" knowledge lurking in the deep representations.
Reinforcement learning can now be practically applied to spoken dialogue models, thanks to a novel post-training recipe that disentangles semantic and acoustic improvements.
Android API research is built on shaky foundations: discrepancies between official API lists can drastically alter research conclusions, and vendor-customized APIs are actively used yet largely ignored.
Forget superficial proxies: this new reward model understands complex dialogue dynamics, offering separate semantic and timing evaluations to guide spoken dialogue models.
Stop chasing internal model reasoning – a simple human-in-the-loop protocol can unlock transparent, controllable, and accountable AI without architectural changes.
LLM inference gets a 2x speed boost without training, thanks to RACER's clever combo of retrieval and logits.
LLMs can now reliably tell you how unsure they are about their own long-form claims, thanks to a new interrogation-based uncertainty metric.
Shapley values, great for feature attribution, are now even better for feature selection: MinShap isolates direct feature effects for improved accuracy and stability.
Why pay 4x more for a bigger LLM when you can get nearly the same agent performance by intelligently hotswapping between small and large models?
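A generic sketch of confidence-gated hotswapping (the router, threshold, and model stand-ins below are illustrative, not the paper's system): try the small model first and escalate to the large one only when its self-reported confidence is low.

```python
def route(query, small_model, large_model, threshold=0.7):
    """Answer with the small model when it is confident; otherwise
    escalate to the large model. Returns (answer, which_model)."""
    answer, confidence = small_model(query)
    if confidence >= threshold:
        return answer, "small"
    return large_model(query)[0], "large"

# Toy stand-ins: the small model is only confident on short queries.
small = lambda q: (f"small:{q}", 0.9 if len(q) < 20 else 0.3)
large = lambda q: (f"large:{q}", 0.99)

print(route("2+2?", small, large))                            # stays on the small model
print(route("summarize this long report...", small, large))  # escalates to the large model
```

If most traffic is easy, the expensive model only runs on the hard tail, which is where the cost savings come from.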
Retaining geometric context during streaming 3D reconstruction just got a whole lot better: StreamCacheVGGT's novel scoring and compression scheme significantly boosts accuracy and stability without breaking the constant-cost constraint.
A single model, OmniLight, can now handle diverse lighting conditions in image restoration as effectively as specialized models, thanks to a novel Wavelet Domain Mixture-of-Experts.
Structured kernelized higher-order layers offer a practical path to balancing expressivity and computational cost in modern deep networks.
Achieve comparable accuracy in small object detection with 59% fewer parameters by intelligently suppressing background noise in transformer embeddings.
Long-context RL models can be significantly improved by focusing training updates on the sparse subset of weights with high-magnitude activations, inspired by quantization and the inherent sparsity of long-context reasoning.
LeapAlign makes aligning flow-matching models with human preferences tractable by skipping steps in the generation process, enabling efficient reward backpropagation even to early generation stages.
Autonomous vehicles can now navigate complex urban environments with significantly improved safety and smoothness thanks to a novel generator-discriminator framework that leverages reinforcement learning for trajectory optimization.
Achieve coherent, visually consistent webpages by orchestrating AIGC tools with a hierarchical agent that plans, generates multimodal content, and iteratively self-reflects.
Escaping the limitations of the Laplacian, Doubly Stochastic Matrices offer a more robust and scalable foundation for GNN message passing, mitigating over-smoothing and improving performance on both homophilic and heterophilic graphs.
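One standard way to obtain a doubly stochastic message-passing operator from an adjacency matrix is Sinkhorn normalization; this is a sketch of that construction, not necessarily the paper's.

```python
import numpy as np

def sinkhorn_normalize(A, n_iter=50):
    """Make a nonnegative adjacency-like matrix approximately doubly
    stochastic (all row AND column sums equal to 1) by alternately
    normalizing rows and columns."""
    M = A.astype(float) + 1e-9               # keep entries strictly positive
    for _ in range(n_iter):
        M /= M.sum(axis=1, keepdims=True)    # row normalize
        M /= M.sum(axis=0, keepdims=True)    # column normalize
    return M

A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
M = sinkhorn_normalize(A)
print(M.sum(axis=0), M.sum(axis=1))  # both close to [1, 1, 1]
```

Unlike the usual degree-normalized Laplacian, the resulting operator is a convex-combination (averaging) map in both directions, which is what gives the better-behaved message passing the summary alludes to.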
You can now provably reconstruct individual training records from shared gradients in federated learning, even with large batch sizes, thanks to a new attack with a built-in certificate of correctness.
Adversaries can now force LLM routers to consistently choose expensive models, even without white-box access, by using optimized adversarial suffixes.
Flow matching, already fast, just got a whole lot faster for language modeling, rivaling autoregressive models in quality with a fraction of the inference cost.
Adaptive concatenation of quantum codes slashes qubit requirements by up to 100x compared to standard methods, paving the way for practical early fault-tolerant quantum computing.
Ditch the mesh: STEP-Parts extracts stable, analytic part segmentations directly from CAD B-Reps, sidestepping tessellation artifacts and unlocking better CAD learning.
Achieve competitive novel view synthesis with 90% fewer Gaussians and faster inference by learning a compact, global scene representation *before* decoding any 3D geometry.
Log-barrier regularization finally unlocks optimal last-iterate convergence in uncoupled matrix games with bandit feedback, beating the long-standing Ω(t^{-1/4}) exploitability gap.
State-of-the-art uncertainty estimation in medical image segmentation doesn't have to come at the cost of speed: SegWithU achieves top performance with a single forward pass.
Even for structured quantum states like stabilizer states, cloning is fundamentally as hard as learning, revealing a deeper connection between these tasks than previously understood.
Solve repeated optimal transport problems orders of magnitude faster by amortizing computation across related instances using sliced OT potentials.
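For context, the sliced-OT machinery the amortization builds on reduces high-dimensional OT to many exactly solvable 1D problems; here is the standard construction (a sketch of sliced OT itself, not the paper's amortization scheme).

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=128, seed=0):
    """Sliced Wasserstein-2 distance between two equal-size point
    clouds: project onto random unit directions and average the 1D
    OT costs, each solved exactly by sorting."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_proj, X.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    total = 0.0
    for v in dirs:
        px, py = np.sort(X @ v), np.sort(Y @ v)   # 1D OT = sorted matching
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_proj)

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 3))
Y = rng.normal(size=(256, 3)) + 2.0   # shifted point cloud
print(sliced_wasserstein(X, X))       # 0 for identical clouds
print(sliced_wasserstein(X, Y))       # grows with the shift
```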
Achieve near state-of-the-art performance on EEG tasks with significantly smaller and faster models by using a novel knowledge distillation framework specifically designed for EEG foundation models.
Implicitly planning in a learned latent space lets embodied agents beat explicit trajectory optimization, especially in long-horizon and compositional tasks.
Expert knowledge, encoded via goodness-of-fit features, can not only improve classification accuracy in the face of informative missingness but also yield more interpretable and justifiable decision rules than black-box ML models.
Nautilus achieves FlashAttention-3-like kernel performance from a math-like description of attention, fully automating tensor operator optimization and outperforming state-of-the-art compilers by up to 42%.
VLMs struggle with emotion recognition not because they lack visual acuity, but because web-scale pre-training amplifies dataset biases and their sparse temporal sampling misses fleeting micro-expressions.
Turns out, carefully tuned task-specific models can rival or even beat Time Series Foundation Models in electricity price forecasting, challenging the assumption that bigger is always better.
LLMs alone can't accurately simulate user behavior when offline context is missing; PGHS shows how to ground them with ML-based fitting to drastically improve simulation accuracy.