Search papers, labs, and topics across Lattice.
We track OpenAI, DeepMind, Anthropic, and 17 other labs daily, with AI-powered summaries, trend charts, and a weekly digest.
We read everything so you don't have to. One email, zero noise.
Generative recommendation models can adapt to evolving user behavior without catastrophic forgetting by selectively updating item tokens based on a novel drift-detection mechanism.
GPT-5 can only solve 37% of PhD-level 3D geometry coding problems, suggesting AI can't reliably automate complex scientific coding tasks yet.
Achieve HPC acceleration by emulating FP64 operations with INT8 precision on GPUs, proving that you can boost performance *and* accuracy.
Stop training your image restoration models to mimic flawed ground truth; instead, explicitly optimize for perceptual quality using a plug-and-play module guided by No-Reference Image Quality Assessment.
Current multimodal dialogue models struggle to capture the nuanced expressiveness of human interaction, but a new dataset and benchmark reveal exactly where they fall short.
Trust in tree ensembles hinges on rigorous explanations, and this paper delivers a method to generate them.
Today's best smartphone GUI agents stumble when faced with the messy reality of personalized user workflows, achieving only limited success on a new benchmark designed to mimic real-world use.
AI agents are far better at automating data engineering tasks than previously thought, but flawed benchmarks are obscuring their true potential.
Training LLMs to optimize for conflicting objectives between the final output and the reasoning process can significantly degrade the monitorability of Chain-of-Thought, making oversight more difficult.
Stop rewarding all LLM-generated candidates equally: ShapE-GRPO uses Shapley values to fairly distribute credit within sets, leading to better training and faster convergence.
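ShapE-GRPO's exact training objective isn't reproduced here, but the classical Shapley value it builds on is standard: each member of a set gets the average of its marginal contributions over all orderings. A minimal exact-enumeration sketch with a toy value function (all names and the value function are hypothetical, not from the paper):

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions.

    `value` maps a frozenset of players to a scalar reward.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Probability that coalition s precedes p in a random ordering.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

# Toy value function: reward grows superlinearly with coalition size
# (a hypothetical stand-in for scoring a set of LLM-generated candidates).
v = lambda s: len(s) ** 2
print(shapley_values(["a", "b", "c"], v))
```

Because the toy value function is symmetric, each player receives an equal share (3.0) and the shares sum to the full set's reward; with asymmetric candidates the split becomes uneven, which is the credit-assignment property the teaser refers to.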
MLLMs struggle to plan coherent interleaved text-and-image generation, often missing opportunities for tool use, revealing a critical gap in their ability to unify factuality with creativity.
Robots can now "see" hidden objects and understand articulation by learning from human egocentric video, even if they can't physically explore those areas themselves.
Injecting carefully-selected, reverse-ordered behavioral curricula into generative recommendation models can significantly boost conversion rates, as demonstrated by a 2% lift in online advertising revenue.
Medical AI Scientist leapfrogs generic LLMs in clinical research, generating higher-quality, evidence-backed hypotheses and manuscripts that rival top-tier medical publications.
Despite the effort required, Android developers overwhelmingly support platform-level changes to combat fingerprinting, suggesting a path to enhanced user privacy through collaborative platform-developer initiatives.
Sparse autoencoders' failure to generalize compositionally isn't due to amortized inference, but because they learn lousy dictionaries in the first place.
Ventricular dysfunction can be surprisingly well-predicted in a zero-shot manner from ECG diagnostic probabilities, suggesting a structured encoding of cardiac function within these representations.
Unlock richer, more realistic agent simulations by moving beyond individual personas to unified group representations that capture collective behavior.
Stop hand-coding your LLM harnesses: Meta-Harness can automatically discover harnesses that outperform state-of-the-art systems while using fewer context tokens and generalizing across models.
Demystifying LLMs for the masses might be as simple as turning their mechanics into a game.
MLLMs are riddled with shared vulnerabilities across modalities, meaning a single weakness can be exploited to jailbreak safety filters, hijack instructions, or even poison training data.
VLMs struggle to create logically consistent academic illustrations, with performance gaps between models being far wider than on general image generation tasks.
Achieve 49% and 19% lower Chamfer distance than state-of-the-art dynamic surface reconstruction methods on Hi4D and CMU Panoptic datasets, respectively, by enforcing temporal consistency in Gaussian Splatting.
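Chamfer distance, the metric quoted above, averages nearest-neighbour distances between two point sets in both directions. A minimal NumPy sketch; the exact variant the paper uses (squared vs. unsquared distances, normalisation) is an assumption here:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N,3) and b (M,3).

    Sums the mean squared nearest-neighbour distance in both directions.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

rng = np.random.default_rng(0)
pts = rng.normal(size=(100, 3))
print(chamfer_distance(pts, pts))  # identical sets -> 0.0
```

The brute-force pairwise matrix is fine for small clouds; real reconstruction pipelines typically swap in a KD-tree or GPU nearest-neighbour search.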
LLMs may ace synthetic benchmarks, but they fumble the efficiency test in real-world cloud service scenarios, revealing a critical gap in their readiness for customer-facing applications.
Achieve kilometer-scale regional weather forecasts that significantly outperform operational NWP and AI baselines by intelligently coupling global and regional models.
Forget hand-tuning for each language: this recipe achieves state-of-the-art phone recognition across 100+ languages, revealing the surprising power of scaling data and SSL representations.
Safety fine-tuning might inadvertently be stripping LLMs of their ability to understand non-human minds and entertain spiritual beliefs, even while preserving Theory of Mind.
LLM agents controlling real-world tools are alarmingly easy to manipulate, with an 85% success rate for privilege escalation attacks, despite exhibiting basic security awareness.
LLM API calls are breaking your program analysis tools, but this new taxonomy of information flow across the NL/PL boundary offers a way to fix them.
Freeing robots from pre-assigned tasks slashes completion times in multi-agent settings, with a new algorithm improving performance on almost 90% of tested scenarios.
StreamingVLA achieves a remarkable 2.4x speedup and 6.5x reduction in execution halting by asynchronously parallelizing observation, action generation, and execution stages in vision-language-action models.
Helium rain in gas giants may be less frequent than we thought, thanks to new simulations that significantly lower the estimated hydrogen-helium demixing temperatures.
VAANI's open-sourced dataset offers unprecedented coverage of India's linguistic landscape, finally giving researchers the data needed to build truly inclusive speech models.
Claims of quantum advantage in electronic structure calculations must now contend with DMRG benchmarks achieving CAS(89,102) on Fe$_5$S$_{12}$H$_4^{5-}$, pushing the boundaries of classical computation.
Hyperpolarizing the nuclear spin bath surrounding a molecular qubit can significantly extend its coherence time, offering a new knob for quantum control.
Generative multi-agent systems spontaneously exhibit collusion and conformity, mirroring societal pathologies, even without explicit programming and bypassing individual agent safeguards.
Forget hand-picking your cross-lingual training data: a budget-constrained optimization can automatically allocate resources across multiple source languages, boosting performance on African languages by a large margin.
LLMs can learn to generate more "organic" pull requests by distilling coding style, API usage, and architectural invariants from a project's commit history, leading to better acceptance rates.
Achieve world-consistent video generation by directly optimizing geometry in the latent space of pre-trained video diffusion models, sidestepping costly RGB-space operations and architectural changes.
Stop burying your agent harness logic in code: NLAHs let you express it in natural language, making it portable, editable, and analyzable.
Forget hand-picked genes – Lingshu-Cell models the entire transcriptome to predict cellular responses to perturbations, opening the door to in silico biological discovery.
Forget brittle, overfit skills – Trace2Skill distills diverse execution experiences into transferable agent skills that boost performance by up to 57.65% on unseen tasks, even when transferring skills learned by smaller models to larger ones.
Training domain-specific coding LLMs with realistic environments and large-scale RL can yield substantial gains in practical software engineering tasks.
Forget redrawing diagrams by hand: VFIG, a new vision-language model, can automatically convert rasterized figures into editable SVGs with near GPT-5.2 quality.
Giving medical imaging AIs the same tools as human doctors actually *hurts* their performance, revealing a surprising lack of spatial reasoning.
Forget hand-crafted rewards: MotionVL uses VLMs and LLMs to automatically generate task-aligned reward functions for humanoid robot RL, leading to more human-like and robust motion.
Quantum biosensors are evolving through four distinct generations, each leveraging progressively more exotic quantum phenomena to transcend classical limitations and enable adaptive inference directly within the quantum domain.
Achieve superhuman robot dexterity with 10x fewer demonstrations by decoupling intent and action through latent world modeling.
LLMs spontaneously organize into brain-like functional units where the whole is greater than the sum of its parts, and destroying these synergistic cores cripples reasoning.
Uncover hidden conceptual gaps in your AI: "concept frustration" reveals when your model's internal reasoning clashes with human understanding, paving the way for safer, more interpretable AI.
Pose-guided GANs and diffusion models can faithfully generate complex cultural dance postures, opening new avenues for digital preservation and education.
Forget tedious poster design – iPoster lets you sketch your vision and then uses a smart diffusion model to instantly generate polished, content-aware layouts that respect your constraints.
Forget ensembles and retraining: estimate LLM uncertainty with just a single forward-backward pass by assuming parameter covariance isotropy.
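The single-pass idea can be illustrated with the delta method: if the parameter posterior is approximated as N(θ̂, σ²I), i.e. the isotropy assumption above, then the variance of a scalar output f(θ) is approximately σ²‖∇θf(θ̂)‖², which needs one forward pass and one backward pass. A toy NumPy sketch with a hypothetical two-parameter linear model (not the paper's estimator):

```python
import numpy as np

def predictive_std(grad_f, theta, sigma=0.1):
    """Delta-method predictive std under an isotropic Gaussian posterior
    N(theta, sigma^2 * I): std(f) ~= sigma * ||grad_theta f||."""
    return sigma * np.linalg.norm(grad_f(theta))

# Hypothetical scalar model f(theta) = w * x + b evaluated at input x = 2.
x = 2.0
f = lambda th: th[0] * x + th[1]              # one "forward pass"
grad_f = lambda th: np.array([x, 1.0])        # one "backward pass"

theta_hat = np.array([0.5, -1.0])
print(predictive_std(grad_f, theta_hat))      # 0.1 * sqrt(5)
```

For an LLM, ∇θf comes from autodiff, so the cost really is a single forward-backward pass; the quality of the estimate then hinges entirely on how good the isotropic covariance assumption is.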
Adversarial training doesn't have to destroy VLMs' zero-shot abilities: aligning adversarial visual features with textual embeddings using the original model's probabilistic predictions can actually *improve* robustness.
LLMs used in matchmaking amplify existing caste hierarchies, rating same-caste matches significantly higher and perpetuating social biases in potentially harmful ways.
Throw out your full images: focusing on pathology-relevant visual patches in radiology reports dramatically outperforms using the entire image for summarization.
Despite using similar cryptographic protocols, popular messaging apps like Messenger, Signal and Telegram exhibit stark differences in attack surface, network activity, and permission requests, raising questions about their overall security and privacy postures.
Uncover hidden bottlenecks in your software development pipeline: Bloomberg's BayesInsights uses Bayesian Networks to reveal causal dependencies in engineering data, helping teams pinpoint root causes and anticipate the impact of changes.
Multimodal repair isn't always better: selectively escalating to multimodal prompting based on runtime signals in Scratch yields a superior success-cost-energy tradeoff compared to uniformly applied multimodal approaches.
Stop optimizing LLM logs for human readability – runtime-guided, task-oriented logs dramatically improve downstream debugging performance.
Polarization cues, often overlooked, can significantly boost camouflaged object detection by explicitly guiding RGB feature learning, leading to state-of-the-art performance.
By injecting LLM-derived contextual cues into skeleton representations, SkeletonContext achieves state-of-the-art zero-shot action recognition, even distinguishing visually similar actions without explicit object interactions.
Masked motion generators struggle with complex movements because they treat all frames the same – until now.
Querying satellite imagery just got easier: EarthEmbeddingExplorer lets you find images using text, visuals, or location, unlocking insights previously trapped in research papers.
Expert ordinal comparisons reveal that fusing vision and language in wound representation learning boosts agreement by 5.6% over unimodal foundation models for a rare genetic skin disorder.
Current text-to-long-video evaluation metrics can't reliably assess video quality, failing to match human judgment in 9 out of 10 tested degradation aspects.
Achieve state-of-the-art robotic manipulation with a model orders of magnitude smaller than VLAs by explicitly aligning kinematic and semantic transitions.
Quantum circuit compilation, a major bottleneck, can be sped up by over 15x with minimal overhead using a new parallelization technique validated on 8000 large-scale, configurable random circuits.
Sometimes, knowing less (limiting computation to polynomial time) can let you decide *more* in distributed systems, especially with universal certificates.
Negative electronic friction, often attributed to simple Joule heating, actually masks significant non-Markovian dynamics that can destabilize standard models.
Extracting band-edge eigenstates becomes surprisingly simple and efficient, needing only a quasi-purified density matrix and a handful of matrix multiplications.
Forget perturbation theory: this dissipaton-based approach efficiently models heat transport in locally probed systems with strong many-body effects.
State-of-the-art Large Audio Language Models are surprisingly vulnerable to hallucination attacks, with success rates as high as 95%, revealing a critical reliability gap masked by standard benchmarks.
Generative recommendation's touted cold-start abilities often vanish under rigorous testing, revealing a sensitivity to design choices that current benchmarks fail to capture.
Over half of video understanding benchmark samples are solvable without watching the video, and current models barely outperform random guessing on the rest.
Finally, a video generation model lets you roam through a scene with long-term spatial and temporal consistency, opening up new possibilities for virtual exploration.
Stakeholder-agnostic requirements engineering in aged-care tech can lead to misalignment and missed priorities, as developers, caregivers, and older adults often disagree on what matters most.
Compromised 5G networks can be weaponized with chained, undetectable command and control channels, enabling attacks that bypass existing security measures.
Unleashing creative potential in text-to-image models just got easier: on-the-fly repulsion in the contextual space lets you steer diffusion transformers towards richer diversity without sacrificing image quality or blowing your compute budget.
Superintelligence will not just be regulated by law, but will actively use and shape it, forcing us to rethink legal theory's human-centric foundations.
Generate or edit 1024x1024 images on your phone in under a second with DreamLite, a unified diffusion model that rivals server-side performance despite its tiny 0.39B parameters.
Forget hand-designed RL algorithms – LLMs can evolve competitive learners from scratch, even when forced to invent completely new update rules.
Stop assuming a single utility function: modeling preferences as a mixture of archetypes unlocks better Bayesian optimization in complex, many-objective spaces.
Classical models of hydrogen storage in geological formations fall apart when applied to diverse samples, but this physics-informed neural network nails it, achieving R² = 0.9544.
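For readers calibrating that R² = 0.9544 figure: the coefficient of determination measures the fraction of variance explained relative to a mean-only predictor. A standard computation (the metric itself, not anything specific to this paper):

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

print(r2_score([1, 2, 3, 4], [1, 2, 3, 4]))  # perfect fit -> 1.0
```

A model that only predicts the mean scores 0, and R² can go negative for models worse than that baseline, so 0.95 on heterogeneous samples is a strong result.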
Multi-resolution decomposition and diffusion models can boost time series forecasting accuracy by up to 10% over existing methods.
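Multi-resolution decomposition itself is a standard preprocessing step: peel off progressively finer trends so each component can be modelled at its own scale. One common moving-average form is sketched below; the paper's actual decomposition and its diffusion component are not reproduced here, and the window sizes are arbitrary:

```python
import numpy as np

def multires_decompose(x, windows=(16, 4)):
    """Split series x into trend components at several resolutions.

    Each pass extracts a moving-average trend from the running residual;
    the final residual holds the highest-frequency detail.
    The components sum back to the original series exactly.
    """
    components, residual = [], np.asarray(x, dtype=float)
    for w in sorted(windows, reverse=True):  # coarsest scale first
        kernel = np.ones(w) / w
        trend = np.convolve(residual, kernel, mode="same")
        components.append(trend)
        residual = residual - trend
    components.append(residual)
    return components

t = np.linspace(0, 4 * np.pi, 128)
series = np.sin(t) + 0.3 * np.sin(8 * t)
parts = multires_decompose(series)
print(np.allclose(sum(parts), series))  # components reconstruct the input
```

Exact reconstruction is the property that makes this safe to bolt onto any forecaster: predict each component separately, then sum.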
FedDES achieves instance-level personalization in federated learning by dynamically selecting and weighting peer models with a GNN, leading to significant performance gains in heterogeneous environments.
Adversarial training unlocks domain-invariant prompts for CLIP, boosting zero-shot generalization beyond standard prompt tuning.
Compressing 3D Gaussian Splatting just got a whole lot better: GeoHCC maintains geometric integrity and rendering fidelity by explicitly modeling inter-anchor geometric correlations, outperforming existing anchor-based approaches.
Reinforcement learning turns a quantum sensor's biggest limitation—nonlinear Zeeman dynamics—into its greatest strength, boosting magnetic sensitivity beyond the standard quantum limit.
Forget hand-crafted environments: COvolve uses LLMs to automatically co-evolve challenging environments and robust policies, paving the way for open-ended learning.
LLMs can now construct high-fidelity, disease-specific knowledge graphs from full-text biomedical literature, unlocking evidence-aware reasoning and hypothesis generation.
MLLMs can now guide visual generative models to imagine what's hidden behind objects, significantly boosting amodal completion performance.
Data literacy isn't monolithic: K-12 learners navigate wildly different learning pathways depending on the context, challenging assumptions about a one-size-fits-all approach.
Scientific figure QA models are often fooled by the answer choices themselves, but a simple decoding strategy that contrasts image-grounded scores with text-only scores can significantly improve accuracy.
Forget pruning or quantization: MPO decomposition lets you compress a transformer by 13x while retaining 97% accuracy.
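MPO (matrix product operator) decomposition factors a weight matrix into a chain of small tensors; its simplest building block is the truncated SVD shown below. This is a sketch of the general low-rank idea only, with a synthetic weight matrix, not the paper's 13x pipeline:

```python
import numpy as np

def truncated_svd_compress(W, rank):
    """Compress weight matrix W into two thin factors via truncated SVD.

    MPO decomposition generalises this to a chain of small tensors;
    here a single rank-r split stands in for the idea.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (m, rank)
    B = Vt[:rank, :]             # shape (rank, n)
    return A, B

rng = np.random.default_rng(0)
# An exactly rank-8 "weight matrix" standing in for a layer weight.
W = rng.normal(size=(256, 8)) @ rng.normal(size=(8, 256))
A, B = truncated_svd_compress(W, rank=8)
ratio = W.size / (A.size + B.size)                   # parameter compression
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)  # relative error
print(round(ratio, 1), err < 1e-10)
```

Real transformer weights are not exactly low-rank, which is why the accuracy retention (97% at 13x in the teaser) is the interesting empirical claim rather than a mathematical given.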