Marked for life

Google offers its AI watermarking tech as free open source toolkit

SynthID provides a hidden way to mark LLM output as artificial.

Kyle Orland
In addition to watermarking AI-generated images, Google's open source SynthID works on text, audio, and video. Credit: Google

Back in May, Google augmented its Gemini AI model with SynthID, a toolkit that embeds AI-generated content with watermarks it says are “imperceptible to humans” but can be easily and reliably detected via an algorithm. Today, Google took that SynthID system open source, offering the same basic watermarking toolkit for free to developers and businesses.

The move gives the entire AI industry an easy, seemingly robust way to silently mark content as artificially generated, which could be useful for detecting deepfakes and other damaging AI content before it goes out in the wild. But there are still some important limitations that may prevent AI watermarking from becoming a de facto standard across the AI industry any time soon.

Spin the wheel of tokens

Google uses a version of SynthID to watermark audio, video, and images generated by its multimodal AI systems, with differing techniques that are explained briefly in this video. But in a new paper published in Nature, Google researchers go into detail on how the SynthID process embeds an unseen watermark in the text-based output of its Gemini model.

The core of the text watermarking process is a sampling algorithm inserted into an LLM’s usual token-generation loop (the loop picks the next word in a sequence based on the model’s complex set of weighted links to the words that came before it). Using a random seed generated from a key provided by Google, that sampling algorithm increases the correlational likelihood that certain tokens will be chosen in the generative process. A scoring function can then measure that average correlation across any text to determine the likelihood that the text was generated by the watermarked LLM (a threshold value can be used to give a binary yes/no answer).
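The general idea can be sketched with a simplified, keyed sampling scheme. To be clear, this is not Google's actual algorithm: the function names are hypothetical, and the "exponential race" selection rule here is a classic watermarking trick from earlier literature used only to illustrate how a secret key can bias token choice while a scoring function later measures that bias.

```python
import hashlib

def g_value(key: str, context: tuple, token: int) -> float:
    """Keyed pseudorandom score in [0, 1) for a candidate token,
    derived from a secret key and the recent context (hypothetical)."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def sample_watermarked(probs: dict, key: str, context: tuple) -> int:
    """Pick the token maximizing g_value ** (1 / p): an 'exponential
    race' that respects the model's probabilities on average while
    deterministically favoring tokens the key scores highly."""
    return max(probs, key=lambda t: g_value(key, context, t) ** (1.0 / probs[t]))

def score_text(tokens: list, key: str, window: int = 4) -> float:
    """Average g_value across a text. Watermarked text scores well
    above the ~0.5 expected of unwatermarked text; comparing this
    average against a threshold gives the binary yes/no answer."""
    scores = [
        g_value(key, tuple(tokens[i - window:i]), tokens[i])
        for i in range(window, len(tokens))
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Because the detector only needs the key and the text, anyone holding the key can score arbitrary passages without rerunning the model.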

This probabilistic scoring system makes SynthID’s text-based watermarks somewhat resistant to light editing or cropping, since the statistical bias toward watermarked tokens persists across the untouched portions of the text. While watermarks can be detected in responses as short as three sentences, the process “works best with longer texts,” Google acknowledges in the paper, since having more words to score provides “more statistical certainty when making a decision.”

Google also notes that this kind of watermarking works best when there is a lot of “entropy” in the LLM distribution, meaning multiple valid candidates for each token (e.g., “my favorite tropical fruit is [mango, lychee, papaya, durian]”). In situations where an LLM “almost always returns the exact same response to a given prompt”—such as basic factual questions or models tuned to a lower “temperature”—the watermark is less effective.

A diagram explaining how SynthID’s text watermarking works. Credit: Google / Nature

Google says SynthID builds on previous similar AI text watermarking tools by introducing what it calls a Tournament sampling approach. During the token-generation loop, this approach runs each potential candidate token through a multi-stage, bracket-style tournament, where each round is “judged” by a different randomized watermarking function. Only the final winner of this process makes it into the eventual output.
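A rough sketch of that bracket structure follows. This is not Google's implementation: here each round's "judge" is a single keyed pseudorandom bit, and `tournament_sample` and `g` are hypothetical names, but the knockout shape (2^m candidates, m rounds, a different keyed function per round) mirrors the idea described in the paper.

```python
import hashlib
import random

def g(key: str, layer: int, context: tuple, token: int) -> int:
    """Keyed pseudorandom 0/1 'judge' for one tournament layer
    (a stand-in for SynthID's per-round watermarking functions)."""
    h = hashlib.sha256(f"{key}|{layer}|{context}|{token}".encode()).digest()
    return h[0] & 1

def tournament_sample(sample_token, key, context, layers=3, rng=random):
    """Draw 2**layers candidate tokens from the model, then run a
    knockout bracket where round r is judged by g(key, r, ...).
    The bracket winner becomes the emitted token."""
    candidates = [sample_token() for _ in range(2 ** layers)]
    for layer in range(layers):
        survivors = []
        for a, b in zip(candidates[::2], candidates[1::2]):
            ga, gb = g(key, layer, context, a), g(key, layer, context, b)
            if ga != gb:
                survivors.append(a if ga > gb else b)
            else:
                survivors.append(rng.choice((a, b)))  # tie: pick at random
        candidates = survivors
    return candidates[0]
```

Because winners tend to score 1 under each round's judge, the emitted tokens carry a statistical fingerprint that the same keyed functions can later detect.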

Can they tell it’s Folgers?

Changing the token selection process of an LLM with a randomized watermarking tool could obviously have a negative effect on the quality of the generated text. But in its paper, Google shows that SynthID can be “non-distortionary” on the level of either individual tokens or short sequences of text, depending on the specific settings used for the tournament algorithm. Other settings can increase the “distortion” introduced by the watermarking tool while at the same time increasing the detectability of the watermark, Google says.

To test how any potential watermark distortions might affect the perceived quality and utility of LLM outputs, Google routed “a random fraction” of Gemini queries through the SynthID system and compared them to unwatermarked counterparts. Across 20 million total responses, users gave 0.1 percent more “thumbs up” ratings and 0.2 percent fewer “thumbs down” ratings to the watermarked responses, showing barely any human-perceptible difference across a large set of real LLM interactions.

Google’s research shows SynthID is more dependable than other AI watermarking tools, but its success rate depends heavily on length and entropy. Credit: Google / Nature

Google’s testing also showed its SynthID detection algorithm successfully detected AI-generated text significantly more often than previous watermarking schemes like Gumbel sampling. But the size of this improvement—and the total rate at which SynthID can successfully detect AI-generated text—depends heavily on the length of the text in question and the temperature setting of the model being used. SynthID was able to detect nearly 100 percent of 400-token-long AI-generated text samples from Gemma 7B-IT at a temperature of 1.0, for instance, compared to about 40 percent for 100-token samples from the same model at a 0.5 temperature.

Come on in, the watermark’s great!

In July 2023, Google joined six other major AI companies in committing to the Biden administration that they would develop clear AI watermarking technology to help users detect “deepfakes” and other damaging AI-generated content. But in August, a Wall Street Journal report suggested OpenAI was reluctant to release an internal watermarking tool it had developed for ChatGPT, citing worries that even a 0.1 percent false positive rate would still lead to a large wave of false cheating accusations.

Google’s open-sourcing of its own AI watermarking technology takes it in the opposite direction from OpenAI, giving the wider AI community a convenient way to implement watermarking in their own models’ outputs. “Now, other AI developers will be able to use this technology to help them detect whether text outputs have come from their own [large language models], making it easier for more developers to build AI responsibly,” Google DeepMind VP of Research Pushmeet Kohli told the MIT Technology Review.

Convincing major LLM makers to implement watermarking technology could be important because, without watermarking, “post hoc” AI detectors have proven to be extremely unreliable in real-world scenarios. But even with watermarking toolkits widely available to model makers, users hoping to avoid detection will likely be able to make use of open source models that could be altered to turn off any watermarking features.

Still, if we’re going to prevent the Internet from becoming filled with AI-generated spam, we’ll need to do something to help users identify that content. Pushing toward AI watermarking as an industry standard, as Google seems to be with this open source release, feels like it’s at least worth a try.

Kyle Orland Senior Gaming Editor
Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.