Distributional Information Embedding: A Framework for LLM Watermarking

By Haiyun He (Cornell University)

Talk Abstract: Watermarking has become a critical technique for distinguishing AI-generated text from human-authored content. In this talk, we introduce a novel theoretical framework for both zero-bit and multi-bit watermarking of Large Language Models (LLMs) that jointly optimizes the watermarking scheme and the detection procedure. Our approach rigorously characterizes the fundamental trade-offs among watermark detectability, text quality, and information embedding rate. We formalize the concept of distributional information embedding, in which watermarking actively modifies the token-generation process rather than embedding a signal into pre-existing text. Our theoretical analysis reveals that the maximum achievable watermarking rate is governed by the entropy of the LLM's output distribution and increases with the allowable distortion.
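The entropy ceiling on the embedding rate can be glimpsed from a standard information-theoretic bound; the notation below is illustrative and not the talk's exact theorem.

```latex
% Illustrative sketch (not the talk's precise statement): if a message M
% is embedded by steering the generated text X, any detector obeys
I(M; X) \le H(X).
% In the distortion-free case the watermarked output must match the
% model's distribution P, so H(X) = H(P) and the reliable embedding
% rate is capped by the output entropy. Allowing distortion enlarges
% the set of admissible output distributions and can only raise this
% ceiling, matching the monotonicity described in the abstract.
```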

Leveraging these insights, we propose an efficient, model-agnostic, and distribution-adaptive watermarking algorithm that optimally balances detectability and text quality while maintaining strict false-alarm control. Our method employs a surrogate model together with the Gumbel-max trick to achieve superior detection performance. Empirical evaluations on LLaMA2-13B and Mixtral-8×7B demonstrate the effectiveness of our approach. Finally, we discuss robustness considerations, paving the way for watermarking systems that better withstand adversarial attacks.
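The Gumbel-max step mentioned above can be sketched as follows. This is a minimal illustration of Gumbel-max-style watermarking in the spirit of the abstract, not the speaker's actual algorithm: the hashing scheme, context length `k`, and scoring rule are assumptions made for the example.

```python
import hashlib

import numpy as np


def seeded_uniforms(key: bytes, context: tuple, vocab_size: int) -> np.ndarray:
    """Pseudo-random uniforms derived from a secret key and recent tokens.

    A detector holding the key can reproduce these without model access.
    (Illustrative construction, not the talk's scheme.)
    """
    digest = hashlib.sha256(key + repr(context).encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    return rng.random(vocab_size)


def watermarked_sample(probs: np.ndarray, key: bytes, context: tuple) -> int:
    """Gumbel-max-style sampling: return argmax_v r_v ** (1 / p_v).

    For uniforms r independent of probs, this selects token v with
    probability exactly p_v, so each step's output distribution is
    unchanged (a distortion-free step).
    """
    r = seeded_uniforms(key, context, len(probs))
    # argmax of r ** (1/p) equals argmax of log(r) / p (log is monotone);
    # zero-probability tokens get a hugely negative score and are never picked.
    return int(np.argmax(np.log(r) / np.maximum(probs, 1e-12)))


def detection_score(tokens: list, key: bytes, vocab_size: int, k: int = 4) -> float:
    """Sum of -log(1 - r_{x_t}) over the sequence.

    On unwatermarked text each term is i.i.d. Exp(1) (mean 1 per token);
    watermarked text scores markedly higher, enabling false-alarm control.
    """
    score = 0.0
    for t in range(k, len(tokens)):
        r = seeded_uniforms(key, tuple(tokens[t - k:t]), vocab_size)
        score += -float(np.log(1.0 - r[tokens[t]]))
    return score
```

Note that detection needs only the secret key and the token sequence, which is what makes this style of scheme model-agnostic at detection time.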

Speaker Bio: Haiyun He is a postdoctoral associate in the Center for Applied Mathematics at Cornell University, working with Prof. Ziv Goldfeld and Prof. Christina Lee Yu. She earned her Ph.D. in Electrical and Computer Engineering from the National University of Singapore in September 2022, advised by Prof. Vincent Y. F. Tan. Her research lies at the intersection of information theory (IT) and machine learning (ML), developing both fundamental theoretical analyses and effective practical solutions to ML challenges using information-theoretic tools. Her work has been published in top-tier IT and ML journals and conferences, and in 2022 she was recognized as an EECS Rising Star by UT Austin.