IBM Research | RPI | May 21, 2026 | arXiv:2605.22786

LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

Sadia Asif, Mohammad Mohammadi Amiri, Momin Abbas, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy

AI Summary

This paper introduces LCGuard, a framework to mitigate sensitive information leakage in multi-agent systems that communicate via shared key-value (KV) caches. LCGuard learns representation-level transformations of KV caches before sharing them, balancing task performance with the reduction of reconstructable sensitive information. Through adversarial training, LCGuard minimizes the ability of an adversary to reconstruct agent-specific sensitive inputs from the shared KV caches, demonstrating improved safety without significant performance degradation across various models and benchmarks.

Key Contribution

Sharing key-value caches in multi-agent LLM systems can leak sensitive agent-specific information; LCGuard reduces this leakage by learning representation-level transformations that are applied to caches before they are shared.

Abstract

Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce LCGuard (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines.
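
The adversarial formulation described in the abstract can be read as a min-max problem. The sketch below is one plausible way to write it and is not taken from the paper: all symbols are assumptions introduced here for illustration. $K_i$ denotes the KV cache of agent $i$, $s_i$ its sensitive input, $T_\theta$ the learned transformation applied before sharing, $D_\phi$ the adversarial decoder, $\mathcal{L}_{\text{task}}$ a task loss on the transformed cache, $\mathcal{L}_{\text{rec}}$ a reconstruction loss, and $\lambda \ge 0$ a trade-off weight.

```latex
% Hypothetical notation; the paper's actual objective may differ.
\min_{\theta} \; \max_{\phi} \;
  \mathcal{L}_{\text{task}}\bigl(T_\theta(K_i)\bigr)
  \;-\; \lambda \, \mathcal{L}_{\text{rec}}\bigl(D_\phi(T_\theta(K_i)),\, s_i\bigr)
```

Under this reading, the inner maximization over $\phi$ trains the adversary to make the reconstruction loss small, while the outer minimization over $\theta$ keeps the task loss low and pushes the reconstruction loss up, which matches the stated goal of preserving task-relevant semantics while reducing reconstructable information.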

Architecture Design (Transformers, SSMs, MoE) | Inference & Quantization | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
