16 papers from Google Research on Natural Language Processing
Safety fine-tuning may inadvertently strip LLMs of their ability to understand non-human minds and entertain spiritual beliefs, even while preserving Theory of Mind.
Despite the effort required, Android developers overwhelmingly support platform-level changes to combat fingerprinting, suggesting a path to enhanced user privacy through collaborative platform-developer initiatives.
ChatGPT's geographic reasoning can be surprisingly brittle, with minor syntactic changes causing significant output variations and task composition revealing unexpected distributional shifts.
LLM-powered diagnostic AI is ready for prime time: a real-world clinical trial shows it's safe, patients love it, and doctors find it useful.
Forget catastrophic forgetting: this function-preserving expansion method lets you fine-tune without sacrificing pre-trained knowledge, matching full fine-tuning performance at a fraction of the cost.
You can accurately predict the NDCG of a 1B-parameter reranking model by only training models up to 400M parameters, unlocking massive compute savings.
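The idea behind this kind of prediction can be sketched as fitting a saturating power law to small-model results and extrapolating. Everything below is illustrative: the data points, the assumed asymptote, and the functional form are stand-ins, not the paper's actual fit.

```python
import math

# Illustrative (parameter count, dev NDCG@10) pairs for small rerankers;
# these numbers are made up for the sketch.
observations = [(30e6, 0.62), (100e6, 0.67), (400e6, 0.71)]

def fit_saturating_power_law(obs, ceiling=0.80):
    """Fit NDCG(N) = ceiling - b * N^(-c) by least squares in log space:
    log(ceiling - ndcg) = log b - c * log N.
    `ceiling` is an assumed asymptote, not something the source specifies."""
    xs = [math.log(n) for n, _ in obs]
    ys = [math.log(ceiling - s) for _, s in obs]
    k = len(obs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    b, c = math.exp(my - slope * mx), -slope
    return lambda n: ceiling - b * n ** (-c)

predict = fit_saturating_power_law(observations)
ndcg_1b = predict(1e9)  # extrapolate from <=400M-parameter runs to 1B
```

The compute saving comes from only ever training the small models in `observations`; the 1B-parameter score is read off the fitted curve.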
DARKFormer closes the performance gap with exact softmax attention in finetuning by learning a data-aligned kernel geometry for efficient random feature approximation, sidestepping the need for retraining or large feature budgets.
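For context, the baseline DARKFormer improves on can be sketched as generic positive-random-feature attention (FAVOR+-style), which replaces the n×n softmax kernel with an m-dimensional feature map. The Gaussian projection `W` below is the part a learned, data-aligned kernel would replace; all names and shapes here are illustrative, not the paper's implementation.

```python
import numpy as np

def positive_random_features(x, W):
    # phi(x) = exp(x W^T - ||x||^2 / 2) / sqrt(m), W ~ N(0, I),
    # so E[phi(q) . phi(k)] = exp(q . k), the softmax kernel.
    m = W.shape[0]
    return np.exp(x @ W.T - 0.5 * np.sum(x * x, axis=-1, keepdims=True)) / np.sqrt(m)

def rfa_attention(Q, K, V, W):
    """Linear-time attention via random features: associativity lets us
    compute phi(Q) (phi(K)^T V) in O(n*m*d) instead of O(n^2*d)."""
    phi_q = positive_random_features(Q, W)
    phi_k = positive_random_features(K, W)
    num = phi_q @ (phi_k.T @ V)        # unnormalized attention output
    den = phi_q @ phi_k.sum(axis=0)    # softmax normalizer per query
    return num / den[:, None]

def exact_attention(Q, K, V):
    # Reference exact softmax attention for comparison.
    S = np.exp(Q @ K.T)
    return (S @ V) / S.sum(axis=1, keepdims=True)
```

With a generous feature budget the approximation tracks exact attention closely; the paper's point is that aligning the kernel to the data gets there with far fewer features and no retraining.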
Despite dedicated efforts from multiple teams, existing speech systems still fall significantly short of deployment readiness for understanding real-world medical conversations in Indian languages, highlighting the need for further research.
Finally, a framework to quantify AI's cultural intelligence, moving beyond ad-hoc cultural benchmarks to a systematic, extensible, and theoretically grounded approach.
Recurrent models can now achieve Transformer-competitive performance on recall-intensive tasks, thanks to a simple memory caching mechanism that grows memory capacity with sequence length.
Forget rigid templates: RL-optimized verbalization of user logs boosts LLM-based recommendation accuracy by up to 93%.
Randomly masking parameter updates in RMSProp delivers state-of-the-art LLM training performance, revealing a surprisingly effective form of geometric regularization.
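The mechanism in that teaser can be sketched in a few lines: keep the RMSProp second-moment statistics for every coordinate, but randomly skip a fraction of the coordinate updates each step. The function below is a minimal illustration with invented names and hyperparameters, not the paper's optimizer.

```python
import random

def rmsprop_masked_step(params, grads, v, lr=1e-3, beta=0.999,
                        eps=1e-8, mask_prob=0.5, rng=random):
    """One RMSProp step with randomly masked coordinate updates.
    The second-moment EMA `v` is updated for every coordinate, but a
    masked coordinate's parameter is left unchanged this step."""
    new_params, new_v = [], []
    for p, g, vi in zip(params, grads, v):
        vi = beta * vi + (1 - beta) * g * g       # second-moment EMA
        if rng.random() < mask_prob:              # masked: skip the update
            new_params.append(p)
        else:                                     # unmasked: usual RMSProp step
            new_params.append(p - lr * g / (vi ** 0.5 + eps))
        new_v.append(vi)
    return new_params, new_v
```

Setting `mask_prob=0.0` recovers plain RMSProp; the claimed regularization effect comes from the nonzero masking rate.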
LLMs like GPT-5 and Gemini-3 already "know" almost everything (95-98% factual encoding), but struggle to recall it, suggesting that future gains in factuality depend more on better memory retrieval than on simply scaling up.
Finally, a streaming ASR model matches Whisper's offline transcription quality while maintaining sub-second latency.
Clinicians using a new medical literature mining LLM, LEADS, achieved 0.81 recall vs. 0.78 without it, while saving 20.8% of their time.