Forget human-readable models: Agentic-imodels evolves ML models that are optimized for LLM interpretability, boosting agentic data science performance by up to 73%.
Users who actively participate in an AI agent's spreadsheet execution not only improve task outcomes but also gain a deeper understanding of the results and feel more ownership over them.
LLMs are poised to flip the script on personalization, giving users unprecedented control over their data and how it's used across platforms.
A groundbreaking framework reduces false positives in recommendation systems by over 74%, restoring user control and transparency in content curation.
Imagine software that autonomously evolves and maintains itself – this paper lays out the architectural groundwork for making that a reality.
Iterative visual refinement lets agents navigate dense coding IDEs with superhuman precision, outperforming single-shot methods and paving the way for more reliable software engineering agents.
Autonomous web agents get a serious upgrade with WebXSkill, which lets them learn and execute skills with both code-level precision and human-readable guidance.
Don't let your SWE agent drown in context: SWE-AGILE maintains performance on multi-turn software engineering tasks by dynamically managing reasoning context with a novel sliding window and compressed reasoning digests.
Developers want AI to handle the grunt work around coding but to keep its hands off the creative core, which suggests that the true value of AI tooling may lie in knowing where *not* to help.
Gaze-tracking unlocks a new level of personalized AI assistance, enabling LLMs to infer user cognitive states and boost recall performance.
Knowing the *perfect* API to use or *exact* location to edit could drastically improve SWE agent performance, but knowing the perfect regression test result? Not so much.
GeoAI assistants remain unproductive because they lack a crucial agency layer for iterative human-AI collaboration, a gap this paper addresses with nine core primitives.
Generative multi-agent systems spontaneously exhibit collusion and conformity that mirror societal pathologies, without being explicitly programmed to and in ways that bypass individual agent safeguards.
LLM agents can slash task completion time by almost 50% simply by predicting and pre-executing likely tool calls.
AI-generated code's fluency masks a critical flaw: it often fails to deliver what users actually intend, highlighting the urgent need for "intent formalization" to bridge the gap between informal requirements and precise program behavior.
LLM agents can learn to explore novel states and generalize to new tasks with a hybrid on- and off-policy RL framework that leverages memory.
GUI agents can achieve significantly stronger task-solving capabilities through carefully designed post-training and data curation, without relying on costly online data collection.
AgentOS reimagines LLMs as reasoning kernels within a structured OS, offering a blueprint for more robust and scalable AI agents.
Forget slow, reactive GUI agents – ActionEngine uses a state-machine memory to plan actions programmatically, slashing costs by 11.8x and doubling speed while boosting task success to 95%.
Imagine a world where web agents don't just click and type, but orchestrate complex tasks with the reliability of APIs – Web Verbs offer a path to that future.
World models can now effectively simulate complex desktop software environments like Microsoft Office, enabling agents to reason about actions before execution and significantly improving performance.
Guaranteeing consistent communication between AI agents is now possible: a new certification protocol slashes disagreement by up to 96% by ensuring agents share a common understanding of terms.
Forget task-specific models: Magma, a single foundation model, now outperforms them in both UI navigation and robotic manipulation by bridging verbal and action abilities.