47 papers published across 3 labs.
LLM agents can achieve near-perfect memory recall without prohibitive costs by strategically combining fast, lossy retrieval with slower, exhaustive deliberation.
Forget static model averaging: dynamically weighting ensembles based on empirical performance can significantly boost accuracy and interpretability in financial loan default prediction.
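As an illustration of the idea (a minimal sketch, not the paper's exact scheme; function names and the softmax weighting are assumptions), performance-based weighting can be as simple as turning each model's recent validation error into a softmax weight:

```python
import numpy as np

def dynamic_weights(errors, temperature=1.0):
    """Softmax over negative recent errors: lower error -> larger weight.
    Illustrative only; the paper's weighting rule may differ."""
    errors = np.asarray(errors, dtype=float)
    scores = -errors / temperature
    w = np.exp(scores - scores.max())  # subtract max for numerical stability
    return w / w.sum()

def ensemble_predict(per_model_probs, weights):
    """Weighted average of per-model default probabilities, one row per model."""
    return np.asarray(per_model_probs, dtype=float).T @ np.asarray(weights)

# Three models with recent validation errors 0.10, 0.20, 0.40:
w = dynamic_weights([0.10, 0.20, 0.40])
blended = ensemble_predict([[0.2, 0.8], [0.3, 0.7], [0.5, 0.5]], w)
```

The interpretability angle falls out for free: the weights themselves report how much each base model is trusted at any point in time.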
Orthogonal constraints can rescue sparse embeddings in recommender systems from representation collapse, unlocking significant performance gains in large-scale industrial deployments.
LLM-generated survey responses can be statistically accurate yet still miss the option most preferred by humans, highlighting a critical flaw in current evaluation methods.
Automating web data integration for expert querying is now possible: SODIUM-Agent achieves a 2x accuracy boost over existing systems on a new benchmark of 105 real-world tasks.
Hypergraph modeling of patient visits, coupled with contrastive pre-training, significantly boosts medication recommendation accuracy and safety by capturing complex relationships missed by traditional graph-based approaches.
Decentralized competitive allocation provably beats simpler baselines in modular systems with endogenous costs, finally justifying its use with rigorous regret bounds.
Semiparametric bandits can achieve $\tilde{O}(\sqrt{T})$ regret while retaining interpretability, thanks to a novel kernelized ε-greedy algorithm and Stein-based estimation.
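For readers unfamiliar with the ε-greedy backbone this builds on, here is a minimal sketch of a decaying-ε arm selector (an illustration of the generic strategy only; the paper's kernelized estimates and Stein-based estimation are not shown, and the decay constant is an assumption):

```python
import random

def epsilon_greedy(estimates, t, c=1.0):
    """At round t, explore a random arm with probability eps_t ~ c/sqrt(t)
    (exploration decays over time), otherwise exploit the arm with the
    highest current reward estimate. Illustrative sketch only."""
    eps = min(1.0, c / (t ** 0.5))
    if random.random() < eps:
        return random.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda a: estimates[a])
```

In the paper's setting, the `estimates` would come from a kernel regression over contexts rather than a simple per-arm average.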
Citation-grounded supervised fine-tuning drives hallucination rates to zero in encoder-decoder models, showing that explicit citation mechanisms are a potent tool for factual accuracy in dialogue systems.
RAG systems can achieve state-of-the-art performance by explicitly preserving document topology, outperforming LLM-based chunking while simultaneously reducing token overhead.
Forget expensive deep-sea expeditions: GEAR finds structurally similar terrestrial environments with surprising accuracy, opening new avenues for biological research.
Turns out, users care more about late-session swipe delays than early ones when bingeing short videos.
Learning from ranked preferences alone can be surprisingly difficult: even with access to the full ranking of actions, standard online learning guarantees break down unless the environment is sufficiently stable.
Greedy off-policy learning, optimal in theory, can fail spectacularly when supplies are limited, but a simple fix—prioritizing items with high *relative* reward—can restore performance.
Legally mandated data deletion requests can be weaponized to stealthily cripple GNN performance, even if the model appears robust during initial training.
Escape the scripted feel of simulated conversations: Interplay trains independent user and recommender LLMs that interact in real-time, without pre-defined target items, for more realistic and diverse conversational recommendation data.
Current benchmarks fail to rigorously evaluate deep research agents, but a new framework leveraging structured knowledge bases and synthetic data offers a verifiable and scalable solution.
Stop prompt injections cold: PCFI's priority-aware runtime defense intercepts all attacks in testing with zero false positives and negligible overhead.
Stop retrieving background noise: HCQR refines RAG by generating targeted queries that seek evidence to directly support or refute candidate answers.
Imagine a single algorithm that dominates in both predictable and chaotic ranking scenarios: this paper delivers it for multi-dueling bandits.
Multilingual question answering is harder than you think: even state-of-the-art RAG systems stumble when dealing with questions and knowledge in multiple languages.
LLMs exhibit consistent and detectable geographic preferences for brands and cultures, revealing potential biases in market intermediation that persist across user personas.
Spotify's GLIDE model proves that generative LLMs can drive significant gains in podcast discovery and non-habitual listening in a real-world, production environment.
Ditch static embeddings: Generative retrieval, powered by reinforcement learning, lets models dynamically reason about relevance, outperforming larger contrastively-trained models on reasoning-intensive tasks.
Finding a hidden node in a graph just got a whole lot faster: a new algorithm slashes the average search cost with provable approximation guarantees, even with non-uniform query costs.
Naive fine-tuning of VLMs for multimodal sequential recommendation causes catastrophic modality collapse, but can be fixed with gradient rebalancing and cross-modal regularization.
Stop training LLMs to assign arbitrary scores to papers in isolation; comparison-based ranking unlocks significantly better generalization and accuracy in paper evaluation.
Existing citation recommendation benchmarks overestimate real-world performance because they fail to account for the temporal constraints of recommending citations for *new* papers.
Forget tool-augmented systems: NEO shows you can consolidate search, recommendation, and reasoning into a single language-steerable LLM by representing items as SIDs and interleaving them with natural language.
Federated recommendation systems can now better adapt to evolving user preferences without sacrificing privacy, thanks to a novel approach that retains historical knowledge and transfers insights between similar users.
Semantic sorting in LLMs can be twice as fast with no loss in accuracy by strategically combining listwise ranking algorithms.
LLMs forget up to 60% of facts when summarizing and erode over half of project constraints during iterative compaction, but a simple discrete memory system (KOs) fixes this while slashing costs by 252x.
Agentic LLMs are surprisingly vulnerable: a new framework finds successful attacks in 84% of attempts by escalating prompt injection techniques across multiple stages.
Seemingly sophisticated dense retrieval methods can catastrophically fail at contradiction detection due to "Semantic Collapse," highlighting the surprising effectiveness of a simple, decoupled lexical approach for reliable biomedical QA.
LLMs can be systematically shifted from stochastic pattern-matchers to verified truth-seekers using a carefully orchestrated, multi-stage retrieval and verification pipeline.
RAG systems can now achieve 8x better PII leakage protection without sacrificing utility or speed, thanks to a novel "Verify-then-Route" paradigm.
"Superspreader" networks on Twitter amplify contrarian scientific viewpoints, influencing news media coverage and potentially distorting public understanding of science.
LLM-powered recommendation agents, despite their reasoning prowess, are easily manipulated by contextual biases in high-stakes scenarios like paper review and job recruitment.
LLMs armed with RAG can reconstruct cyberattacks with high precision and recall, but the best model for the job depends on your budget: DeepSeek V3 matches Claude Sonnet 4's accuracy at 1/15th the cost.
Forget chasing leaderboard hype: this study reveals that larger embedding models and strategic concatenation are key to unlocking LLM-powered tabular prediction, regardless of public rankings.
No training needed: ARAM dynamically adjusts retrieved context guidance in masked diffusion models based on signal quality, resolving retrieval-prior conflicts on the fly.
Retrieval-augmented LLM agents can learn to learn from experience, achieving significantly better generalization on unseen tasks by combining the strengths of fine-tuning and in-context retrieval.
Discover emergent narratives in real-time without predefined labels, revealing how information evolves during crises.
Stop chasing leaderboard gains on generic benchmarks: PJB reveals that domain-specific weaknesses in person-job retrieval far outweigh the benefits of general model upgrades, and that query understanding modules can actually hurt performance.
LLMs can now recommend drugs with state-of-the-art accuracy by synthesizing individual patient context with the prescribing tendencies of similar cases, outperforming guideline-based and similar-patient retrieval methods.
Forget subjective scouting reports: this framework objectively identifies undervalued football players by blending market dynamics with news sentiment, offering a data-driven edge in talent acquisition.
Forget specialized tools: a standard Unix terminal and clever RL are all you need to beat much larger LLMs at code search.