14 papers from Amazon Science on Natural Language Processing
LLM-generated survey responses can be statistically accurate yet still miss the option most preferred by humans, highlighting a critical flaw in current evaluation methods.
Forget expensive multilingual annotations: this framework lets you evaluate LLMs in new languages by transferring knowledge from English, with surprisingly strong results.
LoRA fine-tuning can significantly boost the voice cloning capabilities of LLM-based TTS systems, but only if the training data is acoustically diverse enough.
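For context on the technique named above: LoRA adapts a frozen pretrained weight matrix by adding a trainable low-rank update, W + (α/r)·BA, rather than updating W itself. A minimal numpy sketch of that idea follows; the sizes, scaling value, and initialization are illustrative conventions from the general LoRA literature, not details of this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight (illustrative sizes).
d_out, d_in, r = 8, 8, 2
W = rng.standard_normal((d_out, d_in))

# LoRA factors: B starts at zero, so the adapted layer
# initially matches the frozen one exactly.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))
alpha = 4.0  # scaling hyperparameter

def lora_forward(x):
    """y = W x + (alpha / r) * B (A x); only A and B are trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0 the low-rank update vanishes, so the output
# equals the frozen base layer's output.
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B (d_out·r + r·d_in parameters) are trained, which is why LoRA is cheap enough to run per-speaker or per-task adapters.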
LLM-based recommender systems can trigger users' personal trauma, phobias, or self-harm history, but a new framework cuts these safety violations by 96.5% while maintaining recommendation quality.
Latent reasoning models often take shortcuts to achieve high accuracy; stronger supervision mitigates these shortcuts but paradoxically restricts the diversity of their latent representations.
Forget prompt engineering and fine-tuning: this "Reasoning Inception" method injects targeted reasoning into LLM agents at test time to fix conversational errors on the fly.
Forget costly knowledge graphs: SAGE offers a lightweight, chunk-level graph expansion method that boosts retrieval recall by up to 8.5 points on heterogeneous QA tasks.
An end-to-end system extracts funny scenes from movies with 87% accuracy, opening new avenues for automated content repurposing.
Give new e-commerce products a warm start by borrowing behavioral signals from their substitutes, boosting search relevance and product discovery.
Duality between nodes and hyperedges unlocks unsupervised learning on heterophilic hypergraphs, outperforming supervised methods without needing negative samples.
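For readers unfamiliar with the duality mentioned above: if a hypergraph is written as an incidence matrix with rows for nodes and columns for hyperedges, its dual is obtained by transposing that matrix, so nodes and hyperedges swap roles. A small numpy illustration (the toy incidence matrix is my own, not from the paper):

```python
import numpy as np

# Incidence matrix of a small hypergraph:
# rows = nodes, columns = hyperedges; H[v, e] = 1 iff node v is in hyperedge e.
H = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
])

# The dual hypergraph's incidence matrix is the transpose:
# each hyperedge becomes a dual node, each node a dual hyperedge.
H_dual = H.T

assert H_dual.shape == (H.shape[1], H.shape[0])
# Taking the dual twice recovers the original hypergraph.
assert np.array_equal(H_dual.T, H)
```

This symmetry is what lets a method trained on one view (nodes) transfer signal to the other (hyperedges) without labels.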
General-purpose Causal Foundation Models can now match the performance of specialized causal models by incorporating partial causal graph information via attention bias, unlocking a more unified approach to causal inference.
MLLMs can now reason about road traffic accidents by fusing remote sensing imagery and structured data, unlocking interpretable insights previously inaccessible to traditional methods.
LLMs evaluating job candidates exhibit significant bias against hedging language, docking candidates' scores by 25.6% on average, even when the substantive content is equivalent.
By focusing on the most challenging examples, CRPO significantly boosts machine translation accuracy and data efficiency compared to standard preference optimization techniques.