Google Research

×Natural Language Processing

13 papers from Google Research on Natural Language Processing

May 5, 2026

2w ago·also Google Research, Harvard, Northeastern, Notre Dame +2

Deco: Extending Personal Physical Objects into Pervasive AI Companion through a Dual-Embodiment Framework

Instead of creating new AI companions from scratch, Deco shows how to breathe new life into cherished physical objects by giving them a digital voice and personality powered by LLMs.

Zhihan Jiang, Meng Wu, Ruishi Zou +14

Natural Language Processing Robotics & Embodied AI Tool Use & Agents

Apr 28, 2026

Children's Hospital of Philadelphia3w ago·also Google Research, UPenn

Health System Scale Semantic Search Across Unstructured Clinical Notes

Semantic search across hundreds of millions of clinical notes is not just feasible, but can slash chart review times by up to 89% while maintaining accuracy.

Faith Wavinya Mutinda, F. Mutinda, Spandana Makeneni +21

Natural Language Processing Recommendation & Information Retrieval

Apr 21, 2026

Google ResearchApr 21, 2026·also Bar-Ilan, Cambridge

Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs

Multilingual LLMs exhibit a surprising "American bias," even when prompted in other languages, and instruction tuning makes it worse.

Guy Mor-Lan, Omer Goldman, Matan Eyal +5

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Natural Language Processing

Apr 13, 2026

Ludwig-Maximilians-Universität MünchenApr 13, 2026·also DeepMind, Google Research, Stanford HAI, Munich Center for Machine Learning +1

Epistemic Trust as a Mechanism for Ethics Integration: Failure Modes and Design Principles from 70 Moral Imagination Workshops

Ethics interventions in AI development often fail because practitioners don't trust them – here's a breakdown of why, and how to fix it.

Benjamin Lange, Geoff Keeling, Kyle Pedersen +4

Constitutional AI & AI Ethics Natural Language Processing Tool Use & Agents

Google ResearchApr 13, 2026

LLM-Based Automated Diagnosis Of Integration Test Failures At Google

Google developers are spending less time debugging integration tests thanks to an LLM that diagnoses failures with 90% accuracy.

Celal Ziftci, Ray Liu, Spencer Greene +1

Code Generation & Program Synthesis Natural Language Processing Tool Use & Agents

Apr 7, 2026

Apr 7, 2026·also Google Research

A Theoretical Framework for Statistical Evaluability of Generative Models

Forget KL divergence – this work shows you *can* reliably evaluate generative models with finite samples, but only if you use the right metric (IPMs with bounded test classes).

Shashaank Aiyer, Yishay Mansour, Shay Moran +1

Eval Frameworks & Benchmarks Natural Language Processing

Mar 30, 2026

Google ResearchMar 30, 2026·also Institute of Philosophy, Joint last authors., Northwestern, SFI +1

Theory of Mind and Self-Attributions of Mentality are Dissociable in LLMs

Safety fine-tuning might inadvertently be stripping LLMs of their ability to understand non-human minds and entertain spiritual beliefs, even while preserving Theory of Mind.

Junsol Kim, Winnie Street, R. Rocca +4

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Natural Language Processing

Google ResearchMar 30, 2026

Uncovering Relationships between Android Developers, User Privacy, and Developer Willingness to Reduce Fingerprinting Risks

Despite the effort required, Android developers overwhelmingly support platform-level changes to combat fingerprinting, suggesting a path to enhanced user privacy through collaborative platform-developer initiatives.

Alex Berke, Alex Berke, Güliz Seray Tuncay +4

Constitutional AI & AI Ethics Natural Language Processing

Mar 9, 2026

Google ResearchMar 9, 2026·also DeepMind, Babylon Health, Beth Israel Deaconess Medical Center, Beth Israel Lahey Health +3

A prospective clinical feasibility study of a conversational diagnostic AI in an ambulatory primary care clinic

LLM-powered diagnostic AI is ready for prime time: a real-world clinical trial shows it's safe, patients love it, and doctors find it useful.

Peter Brodeur, P. Brodeur, Jacob M. Koshy +58

Eval Frameworks & Benchmarks Natural Language Processing Tool Use & Agents

Mar 9, 2026·also Google Research

Grow, Don't Overwrite: Fine-tuning Without Forgetting

Forget catastrophic forgetting: this function-preserving expansion method lets you fine-tune without sacrificing pre-trained knowledge, matching full fine-tuning performance at a fraction of the cost.

Dyah Adila, Hanna Mazzawi, Benoit Dherin +1

Architecture Design (Transformers, SSMs, MoE)Natural Language Processing Training Efficiency & Optimization

Mar 1, 2026

Google ResearchMar 1, 2026

A Unified Framework to Quantify Cultural Intelligence of AI

Finally, a framework to quantify AI's cultural intelligence, moving beyond ad-hoc cultural benchmarks to a systematic, extensible, and theoretically grounded approach.

Sunipa Dev, Vinodkumar Prabhakaran, Rutledge Chin Feman +16

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Natural Language Processing

Feb 27, 2026

Google ResearchFeb 27, 2026

Memory Caching: RNNs with Growing Memory

Recurrent models can now achieve Transformer-competitive performance on recall-intensive tasks, thanks to a simple memory caching mechanism that grows memory capacity with sequence length.

Ali Behrouz, Zeman Li, Yuan Deng +3

Architecture Design (Transformers, SSMs, MoE)Natural Language Processing Training Efficiency & Optimization

Feb 17, 2026

Google ResearchFeb 17, 2026·also Northwestern

On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

Randomly masking parameter updates in RMSProp delivers state-of-the-art LLM training performance, revealing a surprisingly effective form of geometric regularization.

Taejong Joo, Wenhan Xia, Cheolmin Kim

Architecture Design (Transformers, SSMs, MoE)Natural Language Processing Training Efficiency & Optimization