100 papers published across 8 labs.
Learn user preferences across thousands of items from just tens of node evaluations by exploiting graph smoothness in a new spectral bandit framework.
Forget one-size-fits-all recommendations: this model uses normalizing flows to capture the *multimodal* nature of individual user preferences, leading to better cold-start performance in cross-domain recommendation.
Tree-based RAG gets a major upgrade: $\Psi$-RAG's adaptive hierarchical index and multi-granular retrieval agent leapfrog existing methods on complex, cross-document reasoning tasks.
LLMs can now generate syntactically correct and geometrically consistent 3D objects from text 70% of the time, thanks to retrieval-augmented code synthesis.
Token-aware clustering and hierarchical indexing can slash retrieval latency by an order of magnitude without sacrificing accuracy, making multivector retrieval practical at scale.
Stop wasting compute on uninformative node types: TypeBandit intelligently allocates sampling resources in heterogeneous graphs, boosting attribute completion accuracy without architectural changes.
For AI agents needing reliable facts and stateful computation, *how* you structure memory beats simply scaling retrieval or model size.
Google's AI Overviews favor Google-owned content and penalize sites blocking its AI crawler, raising serious questions about fairness and bias in the emerging generative search landscape.
LLMs can generate recommendations up to 3.1x faster by explicitly modeling token position within items and speculation depth during speculative decoding.
Despite its simplicity, mean pooling works surprisingly well because modern text encoders concentrate token embeddings, preserving crucial information about their distribution.
LLMs can learn to safely leverage external memory for code debugging by explicitly modeling and penalizing the risk of false-positive memory injection.
Uncovered: news consumption rhythms follow a predictable hierarchy, from daily cycles to split-second actions, but historical interests still dominate user behavior.
YouTube's recommendation algorithm pushes Kyrgyz children towards Russian-language content, even when they signal a preference for their native tongue, effectively amplifying colonial influence.
Server-side tracking thought it could hide, but this new browser extension spots Google Analytics even when it's sneakily relaying data through custom endpoints.
YouTube's recommendation algorithm doesn't just show different political content to male- and female-coded profiles; it steers them into structurally different information ecosystems.
The standard "human-likeness" test for user simulators is essentially useless for predicting whether they produce valid system rankings.
Stop drowning your MLLMs in irrelevant document noise: FES-RAG shows that carefully selecting multimodal fragments as evidence boosts performance by up to 27% while shrinking context length.
Iteratively exploring a corpus graph during reranking can substantially boost reasoning-intensive retrieval performance, even with weaker rerankers, offering a surprisingly effective alternative to computationally expensive retriever improvements.
Stop retrieving passages in your RAG system: NuggetIndex shows that retrieving and filtering atomic "nuggets" of information yields substantial gains in recall, temporal correctness, and reduced conflicts.
AI research agents can now reliably trace method evolution topologies thanks to a new methodological evolution graph, Intern-Atlas, that captures structured relationships between research methods.
LLMs' ranking instability, where shuffling candidates changes recommendations, can be solved with a novel architecture that enforces permutation invariance.
Domain-adapting LLMs for EDA requires explicit RAG scenario training to prevent performance degradation, and QA augmentation during corpus construction further boosts performance.
Achieve state-of-the-art multimodal stance detection by having multiple AI agents debate each other, complete with retrieval-augmented context and self-reflection.
Stop wasting tokens and context window space: OBJECTGRAPH reimagines documents as knowledge graphs, slashing token usage by up to 95% without sacrificing task accuracy.
Retrieval improvements don't always boost reasoning in RAG systems, but NeocorRAG's evidence chains can fix that, achieving SOTA with 20% fewer tokens.
Ditching text chunks for full document page images in medical RAG boosts QA accuracy by a full percentage point, proving that visual context matters.
Today's AI agents aren't really "remembering" – they're just taking notes, which means they'll hit a wall on complex tasks and can be easily brainwashed.
A single, optimized text snippet can fool CLIP into thinking it's a good caption for almost any image, revealing a surprising vulnerability in cross-modal understanding.
LLMs can achieve better zero-shot product ranking with 57% less token usage by reasoning over structured attribute graphs instead of raw text.
Explicitly diagnosing what's missing from a retrieval set unlocks substantial gains in long-term conversational memory, boosting accuracy on temporal and multi-hop questions by up to 20% while simultaneously reducing latency.
LLM-powered query reformulation, a hot topic in IR, often fails to translate gains from lexical to neural retrieval, and bigger models don't always help.
LLMs, when carefully constrained and augmented with retrieval, can slash incident triage times from hours to minutes in real-world security operations.
Mitigating long-tail distributions in code datasets boosts API recommendation reliability by up to 10% using an ensemble of models that strategically reject low-confidence predictions.
By explicitly modeling relationships between multiple relevant video segments, ClipTBP significantly improves video moment retrieval, especially when queries are ambiguous.
Zero-shot classification accuracy hinges more on the *definition* of a category than the model architecture itself.
Snapchat's new trend detection system proves that LLMs can successfully consolidate multimodal signals at scale to surface emerging topics from short-form video, boosting content freshness and user engagement.
Injecting review semantics into collaborative filtering via adaptive gating and contrastive learning substantially boosts top-N recommendation accuracy, outperforming existing review-aware methods.
Skewed item distributions in recommendation systems can be tamed with a learnable non-uniform quantization, leading to better codebook utilization and more accurate generative recommendations.
LLM-derived user profiles can be powerfully leveraged for recommendation via a surprisingly simple distribution shaping approach, outperforming more complex fusion methods.
Even state-of-the-art models like Gemini and Claude can completely miss critical user information when it's buried in semantically unrelated past interactions, tanking personalization performance.
Forget term expansion: leveraging retrieved queries and LLMs to generate query variants boosts Query Performance Prediction by up to 30% on neural rankers.
Forget static graphs: TimeMM dynamically reweights user-item interactions based on recency and modality, adapting to evolving user preferences in multimodal recommendations.
Stop blindly applying differential privacy: targeting stereotypical user data and using meta-learning can dramatically improve the accuracy of privacy-preserving recommender systems.
LLMs can achieve a 7.5x performance boost in web search and extraction by using a bi-level multi-agent architecture with iterative refinement and shared memory.
The secret to better bandit-based recommendations isn't always the bandit algorithm itself, but the way you represent user state.
LLMs can model user preferences more effectively by disentangling intent into multiple latent factors, leading to improved recommendation accuracy and interpretability.
Forget synthetic QA datasets – AgentSim offers verifiable, step-by-step RAG traces, revealing how LLMs *actually* reason over documents.
DNNs in recommendation models don't just learn feature interactions, they fundamentally reshape embedding spaces by preventing dimensional collapse.
Forget slow reranking: this new method compresses documents into embeddings, letting an 8B-parameter model run up to 18x faster than smaller models while delivering better accuracy.
Document AI pipelines don't work the way you think: quality bottlenecks aren't where you expect, and components don't cascade quality.
LinkedIn's new memory system for hiring agents boosts accuracy and speed by over 10%, proving hierarchical semantic memory is a game-changer for real-world LLM applications.
LLM agents can now remember far more, far more accurately, by "seeing" their past experiences instead of just reading about them.
Non-linear scoring with Hypencoders boosts retrieval performance, but don't expect it to fix your speed or adversarial robustness problems.
Gemini 2.5 Pro shines at question interpretation within a cascaded pipeline, but struggles to generate answers and identify evidence as effectively.
Untangling task-solving skills from factual knowledge in PRAG adapters makes them play better together, boosting performance when you combine multiple documents.
Injecting knowledge at the *right* moment during reasoning boosts accuracy by 10% while cutting retrieval calls in half, blowing away static RAG strategies.
Privacy-preserving RAG is now practical: PRAG achieves competitive recall and low latency while fully encrypting both documents and queries.
Forget retraining: model editing and constrained decoding can keep service recommendations fresh and valid in ever-changing software ecosystems.
Learning in multi-armed bandits gets a boost: even with only probabilistic side observations of other arms' losses, near-optimal regret is achievable without knowing the observation probability.
Fine-tuning language models with a graph-guided loss that captures global semantic relationships can significantly boost classification accuracy and convergence speed.
Decoupling retrieval and reranking with a discrete diffusion model leaps ahead of monolithic embedding scorers for multi-modal knowledge graph completion.
Current multilingual RAG systems can miss culturally relevant answers, but CORAL's adaptive retrieval loop closes the gap, boosting accuracy by up to 3.58% on low-resource languages.
LLMs exhibit surprising dialect-dependent biases when making recommendations, favoring certain cuisines and product categories based on the linguistic style of the prompt.
Text-to-SQL models can now achieve significantly higher accuracy by grouping and ranking SQL candidates based on execution results, then strategically resampling when the initial pool is lacking.
RAG models struggle to ignore their pre-trained knowledge, even when it contradicts the provided context, but a new dataset can help them learn to be more faithful.
Forget searching through endless legal documents – a new RAG system achieves 87% faithfulness and 84% relevancy in answering complex, multi-jurisdictional AI regulation questions.
CroSearch-R1 reveals that integrating cross-lingual knowledge through a dynamic retrieval strategy can substantially enhance the performance of Retrieval-Augmented Generation systems.
MLLMs can now automatically identify and rank UI usability issues from screen recordings, offering actionable recommendations with minimal context.
Web-scale reverse image search, combined with a clever filtering mechanism, significantly boosts the accuracy of image geolocalization, even when reference databases lack relevant scenes.
ChatGPT extracts more value from each cited source than Google or Perplexity, suggesting that citation *quality* trumps citation *quantity* in generative search.
LLMs struggle with e-commerce search relevance not because of reasoning limitations, but because they lack domain-specific knowledge, a problem K-CARE solves with external knowledge grounding.
Bridging the gap between generative retrieval and ranking, RecoChain achieves superior Top-K recommendation performance without sacrificing generative strength.
Reranking in recommender systems can be revolutionized by shifting from local indices to generating global identifiers, enhancing robustness and user satisfaction.
UnIte reveals that incorporating uncertainty into document sampling can lead to substantial improvements in retrieval performance with fewer training samples.
LMMs struggle to ground text queries in the right parts of images, but explicitly modeling salient visual subjects can dramatically improve cross-modal retrieval.
LLMs can systematically generate effective hardware design heuristics, achieving an 11% reduction in scheduling latency with minimal overhead.
RecFlash slashes recommendation inference latency by up to 81% and energy consumption by nearly 92% through smart data remapping in NAND flash memory.
Retention models can now harness the power of post-conversion content without risking feature leakage, leading to more accurate predictions of user engagement.
Even basic TF-IDF methods can rival LLM-based approaches in creating navigable text structures, as shown by a new metric for evaluating hypergraphs.
A2Gen transforms short video recommendations by treating user actions as dynamic sequences, resulting in substantial improvements in user engagement metrics.
Semantic search across hundreds of millions of clinical notes is not just feasible, but can slash chart review times by up to 89% while maintaining accuracy.
ManifoldRank reveals that treating fairness as a taxation cost can significantly enhance the effectiveness of online fair re-ranking algorithms.
Misapplying the Wilcoxon test in IR research can create a false sense of security and produce misleading conclusions that undermine the validity of findings.
Unleash your AI agent's business acumen: this framework lets AI not just analyze experiments, but actively ideate, personalize, and optimize business strategies within a safe, unified software interface.
Seemingly innocuous choices in table serialization format (CSV vs. HTML) can drastically alter retrieval performance, but a simple centroid-based correction can restore semantic consistency.
See where your citations are coming from with a single command, thanks to CiteRadar's open-source platform that automatically generates interactive maps and detailed researcher profiles from your Google Scholar ID.
Dependency-controlled context and explicit evidence sufficiency criteria are key to preventing premature stopping and improving the consistency of enterprise research outputs.
LLMs re-rank documents better when you learn to route each query to the specific attention heads that matter, instead of relying on static subsets or everything at once.
Explicitly enumerating skills in-context doesn't scale for agentic LLMs, but retrieving skills on demand can substantially improve performance – if the LLM can figure out when and which skill to load.
Asymptotically shorter secret keys in Information-Theoretic Distributed Point Functions are now possible, thanks to a novel construction leveraging private information retrieval.
LLMs can bootstrap their understanding of private APIs by autonomously learning from their own coding attempts, outperforming retrieval-augmented generation by 16% on code generation tasks.
6G-enabled Internet of Everything promises a unified intelligent ecosystem, but faces critical scalability, security, and privacy challenges that demand innovative research.
Sequence recommendation models can achieve near-perfect scaling efficiency in distributed training, slashing wasted GPU cycles by up to 90%.
GraphRAG's black-box reasoning gets a spotlight: XGRAG reveals how specific knowledge graph components influence LLM outputs, boosting explanation quality by 14.81% over standard RAG explainability methods.
Sub-linear attention is now possible without sacrificing complete long-range dependency retention, thanks to learnable summary tokens that compress context.
Generative recommendation gets a boost: modeling behavior intensity and transitions yields 15-23% gains in prediction accuracy.
Storing user interaction histories in a normalized, immutable tier and reconstructing sequences just-in-time slashes data infrastructure costs and unlocks the potential of ultra-long sequence DLRMs.
Self-supervised vision models that ace linear probing can still flop at semantic image retrieval because of skewed latent space geometry that breaks approximate nearest neighbor search.
LLMs can denoise sequential recommendations by disagreeing with the recommendation model itself, leading to more robust performance against noisy user data.
Semantic grounding, not token probability, is the key to better multimodal RAG.