This paper introduces a provably secure steganography scheme based on list decoding to improve embedding capacity in language model-based covert communication. By maintaining a list of candidate secret messages and using a suffix-matching mechanism, the scheme makes fuller use of the information content of the generated text. Theoretical proofs of security and correctness are provided, along with a capacity lower bound, and experiments show a significant improvement in embedding capacity over existing methods while maintaining computational efficiency.
Unlock higher-capacity covert communication with LLMs: a new steganography scheme uses list decoding to substantially outperform existing methods without sacrificing security or efficiency.
Steganography embeds secret messages in seemingly innocuous carriers for covert communication under surveillance. Current Provably Secure Steganography (PSS) schemes based on language models can guarantee computational indistinguishability between the covertext and the stegotext. However, achieving high embedding capacity remains a challenge for existing PSS schemes. Their inefficient use of entropy makes them ill-suited to Large Language Models (LLMs), whose inherently low-entropy outputs already severely constrain the feasible embedding capacity. To address this, we propose a provably secure steganography scheme with a theoretically proven high capacity. Our scheme is based on the concept of list decoding: rather than expending effort to pin down the correct secret message directly, it maintains a set of candidates guaranteed to contain it. This strategy fully utilizes the information content of the generated text, yielding higher capacity. To ensure the correctness of our scheme, we further introduce a suffix-matching mechanism that distinguishes the correct secret message from the other candidates. We provide theoretical proofs for both the security and correctness of our scheme, alongside a derivation of its theoretical capacity lower bound. Our approach is plug-and-play, requiring only a direct replacement of the model's standard random sampling module. Experiments on three LLMs and seven PSS baselines demonstrate that our method achieves computational efficiency comparable to prior PSS schemes while delivering a substantial improvement in embedding capacity.
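The list-decoding idea in the abstract can be illustrated with a toy arithmetic-coding-style sketch. Everything here is an illustrative assumption rather than the paper's actual construction: the fixed token distribution `PROBS` stands in for an LLM's per-step probabilities, and `embed`/`extract` and the public disambiguation suffix are hypothetical names. The receiver keeps a list of every payload consistent with the observed tokens, and a known suffix filters that list down to the true message.

```python
from fractions import Fraction

# Toy fixed next-token distribution standing in for an LLM's per-step
# probabilities (illustrative assumption; the real scheme would query the
# model at every step). Exact rationals avoid floating-point boundary bugs.
PROBS = [Fraction(2, 5), Fraction(3, 10), Fraction(1, 5), Fraction(1, 10)]
SUFFIX = "0110"  # public suffix appended to the secret for disambiguation

def extract(tokens: list[int], msg_len: int) -> list[str]:
    """List decoding: the observed tokens pin down a subinterval of [0, 1);
    every (msg_len + |SUFFIX|)-bit string whose fractional value lies in
    that interval is a candidate, and the known SUFFIX filters the list."""
    L = msg_len + len(SUFFIX)
    lo, hi = Fraction(0), Fraction(1)
    for t in tokens:  # replay the arithmetic-coding interval refinement
        width = hi - lo
        lo = lo + width * sum(PROBS[:t])
        hi = lo + width * PROBS[t]
    payloads = [format(v, f"0{L}b") for v in range(1 << L)
                if lo <= Fraction(v, 1 << L) < hi]
    return [p[:-len(SUFFIX)] for p in payloads if p.endswith(SUFFIX)]

def embed(message: str) -> list[int]:
    """Encode message + SUFFIX as a point x in [0, 1), then repeatedly emit
    the token whose probability subinterval contains x, stopping as soon as
    the receiver's suffix-filtered candidate list becomes a singleton."""
    payload = message + SUFFIX
    x = Fraction(int(payload, 2), 1 << len(payload))
    lo, hi = Fraction(0), Fraction(1)
    tokens = []
    while len(extract(tokens, len(message))) != 1:
        width, cut = hi - lo, lo
        for t, p in enumerate(PROBS):
            if cut <= x < cut + width * p:
                tokens.append(t)
                lo, hi = cut, cut + width * p
                break
            cut += width * p
    return tokens
```

Stopping as soon as the suffix-filtered list is a singleton lets the sender emit fewer tokens than unique per-step decoding would require, which is the capacity gain the abstract attributes to list decoding; before that point the receiver's list still contains several candidates, one of which is the true message.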