Waterfall: A New Watermarking Method to Protect Copyright in the World of LLMs

17 March 2025

In December 2023, the New York Times brought a landmark lawsuit against ChatGPT maker OpenAI and its biggest backer Microsoft, alleging that they had used millions of its articles without permission to train the massively popular chatbot. The case, which is ongoing, marked the first time a major American media outlet had sued an AI platform for copyright infringement. It also set a precedent for more than a dozen companies and individuals to follow suit — reflecting increasing tensions over the unauthorised use of published work to train AI technologies.

Parties with ill intentions could potentially use ChatGPT and other large language models (LLMs), including open-source ones that they can run on their own computers, to plagiarise millions of articles very quickly with just the click of a few buttons. “There’s a really big problem about intellectual property (IP) protection, and other data issues such as privacy,” says Gregory Lau, a PhD student at NUS Computing’s GLOW.AI lab. Led by Associate Professor Bryan Low, the lab focuses on developing various types of AI techniques, including those that can be applied to LLMs.

To help address data provenance issues such as IP protection, GLOW.AI’s researchers have invented a special text watermarking method — called Waterfall, short for Watermarking Framework Applying Large Language Models — which they say performs better than existing, state-of-the-art techniques. The team described their work in a paper published last November, and made their code freely available online.

 

Taking a Different Approach

Digital watermarking — the process of embedding a code, pattern, or some other unique identifier into content such as videos, photos, and text to prove ownership — can offer “some form of assurance” against copyright infringement, says Lau, who co-led the Waterfall project together with fellow PhD student Niu Xinyuan. “Without it, you stand the risk of not being able to quickly scan through a large corpus of text to detect plagiarism and prove it.”

To be effective, an ideal watermark should possess certain key characteristics: it should be robust against modifications such as paraphrasing or conversion to a different form; general enough to be applied to a wide range of formats (including normal text and code); and scalable enough to support millions of users at a reasonable computational cost. Additionally, a good digital watermark should be impossible to detect without the right key or password, says Lau. “You don’t want an adversary to be able to quickly know that the text has been watermarked and try to break it.”

Existing watermarking methods, however, often fall short in one or more areas. For example, some watermarks are added by altering the text or pixels ever so slightly, while others are easily removed once they pass through an LLM’s training process.

Many methods are also model-centric, says Niu, whereby the main aim is to protect output generated by the AI platform itself, rather than the input data per se. This allows, for instance, a teacher to determine if a student penned her own essay, or relied on ChatGPT instead. “Model watermarking is typically used to differentiate between human-written text versus AI-generated text,” he explains.

But this tends to focus on the perspectives of big tech firms and the benefits to them, rather than those of the people who produce the content that’s used for training LLMs, says Low.

His team therefore took a different approach to watermarking — a data-centric one, focused on protecting the data sources themselves.

 

A Novel Approach

Waterfall consists of several novel techniques. “There are a few key innovations,” says Lau. “For a start, we’re the first to use LLM paraphrasing as a method to do text watermarking, instead of just perceiving it as a tool for plagiarism.”

In the traditional approach to text watermarking, synonyms for certain words in the original text are generated and used to encode signals. For example, ‘big cat’ may be mapped to ‘big feline’ or ‘large cat.’ The specific combination of synonyms used encodes the watermark.
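The synonym approach can be sketched in a few lines of Python. This is a toy illustration, not Waterfall’s code: the synonym table, key derivation, and function names are all hypothetical. Each word with a synonym pair carries one bit of a key-derived bitstring, and verification checks how many choices match.

```python
import hashlib

# Toy synonym table: each slot offers two interchangeable variants,
# so each substitution can encode one bit of the watermark.
SYNONYMS = {
    "big": ["big", "large"],
    "cat": ["cat", "feline"],
    "fast": ["fast", "quick"],
}

def key_bits(key: str, n: int) -> list[int]:
    """Derive n pseudorandom bits from a secret key."""
    digest = hashlib.sha256(key.encode()).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(n)]

def embed(text: str, key: str) -> str:
    """Replace each word that has synonyms with the variant
    selected by the next key bit."""
    words = text.split()
    slots = [w for w in words if w in SYNONYMS]
    bits = key_bits(key, len(slots))
    out, i = [], 0
    for w in words:
        if w in SYNONYMS:
            out.append(SYNONYMS[w][bits[i]])
            i += 1
        else:
            out.append(w)
    return " ".join(out)

def verify(text: str, key: str) -> float:
    """Fraction of synonym slots matching the key's choices;
    ~0.5 is expected by chance, 1.0 for watermarked text."""
    variants = {v: j for _, vs in SYNONYMS.items()
                for j, v in enumerate(vs)}
    hits = [variants[w] for w in text.split() if w in variants]
    if not hits:
        return 0.0
    bits = key_bits(key, len(hits))
    return sum(h == b for h, b in zip(hits, bits)) / len(hits)
```

The limitation Niu describes below follows directly: with only a handful of synonym slots per passage, the number of distinct watermarks that can be embedded is small.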

The researchers’ first novel step was to tap the power of LLMs to go beyond replacing individual words, paraphrasing entire sentences and more. “An LLM can completely reorder, break, or fuse sentences while preserving semantic content,” they write in their paper.

For instance, the sentence ‘I ate the pineapple tart’ may be reworded to ‘The pineapple tart was eaten by me’ or ‘I consumed the pineapple baked good.’

LLM paraphrasing offers many advantages. “For synonym watermarking, you can only replace so many synonyms within the passage,” explains Niu. “But in our case, we have the rephrasing and reshuffling of sentences, so we end up with a lot more combinations and we can support a lot more different types of representations while conveying the same meaning. This allows us to support a lot more watermarks and data owners.”

“We’ve also added other ingredients to our watermarking,” says Lau. One is embedding the signal into every single word — a process called n-gram watermarking. “When you do this, you increase the chances of detecting whether there’s any plagiarism, as it provides some defenses against adversaries who could try to modify words to remove the watermark,” he says. This may force adversaries to adjust the original text so much that it destroys the value of the IP within and defeats the purpose of even plagiarising it.
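One well-known way of conditioning a signal on every word, from the broader watermarking literature, is the “green list” construction: a secret key plus the preceding word pseudorandomly splits the vocabulary in half, and watermarked text favours words from its context’s half. The sketch below illustrates that general n-gram idea only — it is an assumption for illustration, not Waterfall’s exact design.

```python
import hashlib
import random

def green_set(prev_word: str, key: str, vocab: list[str]) -> set[str]:
    """Pseudorandomly split the vocabulary in half, seeded by the
    secret key and the previous word (a bigram context)."""
    seed = hashlib.sha256((key + "|" + prev_word).encode()).hexdigest()
    rng = random.Random(seed)
    shuffled = vocab[:]
    rng.shuffle(shuffled)
    return set(shuffled[: len(shuffled) // 2])

def score(words: list[str], key: str, vocab: list[str]) -> float:
    """Fraction of words falling in their context's green set.
    Unwatermarked text scores near 0.5; watermarked text scores
    well above that chance level."""
    hits = sum(w in green_set(prev, w_key := key, vocab)
               for prev, w in zip(words, words[1:]))
    return hits / max(len(words) - 1, 1)
```

Because every word contributes a small statistical bias rather than any one word carrying the mark, an adversary must rewrite much of the text to dilute the signal — which is exactly the trade-off Lau describes.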

Another thing the team did was to embed “a wavy signal, which is a bit like sound waves of different frequencies,” explains Niu, who says they borrowed the concept from signal processing, a field of engineering and applied mathematics. “This helps to improve the computational efficiency of the verification process. It also ensures the watermarks of different parties don’t interfere with one another.”
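The signal-processing concept can be illustrated with plain cosines: give each party an integer frequency, superpose the carriers, and recover any one party’s contribution by correlating against its carrier. Because integer-frequency cosines over a full window are mutually orthogonal, the parties’ signals do not interfere. This is a hypothetical sketch of the orthogonality idea, not the paper’s actual construction.

```python
import math

N = 256  # number of positions in the analysis window

def carrier(freq: int) -> list[float]:
    """Cosine carrier at an integer frequency; integer frequencies
    over a full window are mutually orthogonal."""
    return [math.cos(2 * math.pi * freq * t / N) for t in range(N)]

def superpose(freqs: list[int]) -> list[float]:
    """Sum the carriers of several parties into one signal."""
    sigs = [carrier(f) for f in freqs]
    return [sum(s[t] for s in sigs) for t in range(N)]

def strength(signal: list[float], freq: int) -> float:
    """Correlate the signal with one party's carrier; orthogonality
    means other parties' carriers contribute essentially zero."""
    c = carrier(freq)
    return 2 / N * sum(x * y for x, y in zip(signal, c))
```

Checking one frequency at a time keeps verification cheap, and a party whose frequency is absent from the signal correlates to roughly zero.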

It is this novel combination of techniques that makes Waterfall surprisingly effective in achieving robust verifiability, says Low. When tested against five different threat scenarios and different AI models, Waterfall “performed really well,” he says. Moreover, compared with state-of-the-art text watermarking methods, Waterfall demonstrates better scalability (protecting up to billions — rather than hundreds — of users), requires lower computational cost, and is more versatile (working across different text types and languages, including different coding languages).

The team say their work offers a change in perspective. “People often think of LLMs as infringing on intellectual property rights, but they can also be used to protect IP,” says Lau. “While AI has harmful effects, it also has the potential to benefit society at large — we would like to encourage more people to consider such useful applications of AI.”
