AI Watermarking
Techniques for embedding hidden, detectable signals in AI-generated text or images that allow the content to be identified as machine-generated, ideally even after the content has been modified.
AI watermarking is a technique for embedding imperceptible signals in AI-generated content (text, images, audio, or video) that can later be detected to verify the content was produced by an AI system. As AI-generated content becomes increasingly indistinguishable from human-created content, watermarking provides a technical means of maintaining provenance and trust.
How text watermarking works
Text watermarking modifies the model's token selection process in a subtle, detectable way:
- At each generation step, the model's vocabulary is secretly divided into "green list" and "red list" tokens using a hash function.
- The model is biased towards selecting green list tokens: their probability is slightly increased and red list tokens' probability slightly decreased.
- The resulting text reads naturally to humans but contains a statistically detectable pattern.
- A detector analyses the text and checks whether green list tokens appear more frequently than expected by chance.
The watermark is invisible to readers but statistically significant to the detection algorithm.
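The scheme above can be sketched in a few dozen lines of Python. Everything here is illustrative: the toy vocabulary size, the hash-seeded green list, and the biased sampler (which stands in for a real model's logit processing) are assumptions, not any vendor's actual implementation.

```python
import hashlib
import math
import random

# Toy parameters -- a real model has a vocabulary of tens of thousands of tokens.
VOCAB_SIZE = 1000
GREEN_FRACTION = 0.5   # fraction of the vocabulary placed on the green list
GREEN_BIAS = 0.9       # probability the toy sampler honours the green-list bias

def green_list(prev_token: int, key: str = "secret") -> set:
    """Derive this step's green list by seeding a PRNG with a hash of the
    previous token and a secret key, then taking half the vocabulary."""
    seed = int(hashlib.sha256(f"{key}:{prev_token}".encode()).hexdigest(), 16)
    rng = random.Random(seed)
    ids = list(range(VOCAB_SIZE))
    rng.shuffle(ids)
    return set(ids[: int(VOCAB_SIZE * GREEN_FRACTION)])

def sample_watermarked(length: int, key: str = "secret", seed: int = 0) -> list:
    """Toy sampler standing in for a model whose green-list logits have been
    boosted, so green tokens are chosen far more often than chance."""
    rng = random.Random(seed)
    tokens = [rng.randrange(VOCAB_SIZE)]
    for _ in range(length):
        greens = green_list(tokens[-1], key)
        if rng.random() < GREEN_BIAS:
            tokens.append(rng.choice(sorted(greens)))  # biased: pick a green token
        else:
            tokens.append(rng.randrange(VOCAB_SIZE))   # occasionally unconstrained
    return tokens

def z_score(tokens: list, key: str = "secret") -> float:
    """Detection: count how many tokens fall on their step's green list and
    return how many standard deviations that sits above the chance rate."""
    n = len(tokens) - 1
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev, key))
    expected = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std
```

Watermarked sequences score many standard deviations above chance, while ordinary sequences hover near zero; crucially, without the secret key the green lists cannot be reconstructed, so the pattern stays hidden.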
How image watermarking works
For images, watermarking typically involves:
- Pixel-level perturbations: Tiny changes to pixel values that are invisible to the human eye but form a detectable pattern.
- Latent space watermarking: Embedding the watermark in the model's internal representation space before the image is generated, making it more robust to post-processing.
- Frequency domain watermarking: Embedding signals in the image's frequency components, which survive cropping, resizing, and compression.
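A minimal sketch of the frequency-domain idea, using NumPy's FFT: a key-derived pseudo-random pattern is added to mid-band frequency coefficients and later recovered by correlation. The ring mask, the strength value, and the correlation detector are all illustrative choices under assumed parameters, not a production scheme.

```python
import numpy as np

def _midband_mask(shape):
    """Boolean ring selecting mid-band frequencies: low frequencies carry the
    visible image structure, very high ones are destroyed by compression."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h // 2, xx - w // 2)
    ring = (r > min(h, w) * 0.15) & (r < min(h, w) * 0.35)
    return np.fft.ifftshift(ring)  # align with numpy's unshifted FFT layout

def embed(image, key=42, strength=100.0):
    """Add a key-derived pseudo-random pattern to the image's mid-band FFT
    coefficients; 'strength' trades robustness against visibility."""
    rng = np.random.default_rng(key)
    pattern = rng.standard_normal(image.shape)
    spec = np.fft.fft2(image)
    mask = _midband_mask(image.shape)
    spec[mask] += strength * pattern[mask]
    return np.real(np.fft.ifft2(spec))

def detect(image, key=42):
    """Correlate the mid-band spectrum with the key's pattern: scores near
    zero suggest unwatermarked, clearly positive scores suggest watermarked."""
    rng = np.random.default_rng(key)
    pattern = rng.standard_normal(image.shape)
    spec = np.fft.fft2(image)
    mask = _midband_mask(image.shape)
    return float(np.corrcoef(np.real(spec[mask]), pattern[mask])[0, 1])
```

With these parameters the spatial perturbation averages well under one grey level per pixel, yet the correlation survives quantisation to integer pixel values; real systems harden this basic idea further against cropping, resizing, and recompression.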
Why watermarking matters
- Misinformation defence: Identifying AI-generated content helps combat deepfakes and synthetic misinformation.
- Academic integrity: Detecting AI-generated text in academic submissions.
- Content attribution: Establishing whether content was human-created or AI-generated for legal and editorial purposes.
- Regulatory compliance: Emerging regulations (like the EU AI Act) may require AI-generated content to be identifiable.
- Trust: Users have a right to know when they are consuming AI-generated content.
Challenges and limitations
- Robustness: Text watermarks can be removed by paraphrasing, translation, or rewriting. Image watermarks can be degraded by heavy compression, cropping, or re-generation.
- Quality impact: Any watermarking technique slightly constrains the model's output, potentially reducing quality. The trade-off between watermark strength and output quality must be carefully managed.
- False positives: No detection system is perfect. Human-written text can occasionally trigger watermark detectors, with serious consequences for falsely accused individuals.
- Adversarial removal: Determined actors can develop techniques to remove or obscure watermarks.
- Open-source models: Watermarking only works when the model provider implements it. Open-source models can be run without watermarking.
Current state of deployment
- Google DeepMind SynthID: Watermarks both text and images generated by Google's AI models.
- OpenAI: Has researched text watermarking but has been cautious about deployment due to quality and adoption concerns.
- Metadata-based approaches: Some providers embed provenance information in image metadata (C2PA standard), though this is easily stripped.
The broader context
Watermarking is one tool in a larger toolkit for managing AI-generated content. It works best alongside other approaches: content provenance standards (C2PA), AI detection models, media literacy education, and platform policies. No single approach is sufficient on its own.
Why This Matters
As AI-generated content becomes ubiquitous, watermarking represents one of the most practical approaches to maintaining trust and provenance. Understanding its capabilities and limitations helps you evaluate content authenticity tools and prepare for regulations that may require AI content identification.