I don't think that provable watermarking is possible in practice. The method you mention is clever, but before it can work, you would need to know the probability that every other possible source could have generated the same output for the same purpose. Only if you can show that the output is far more probable under the watermarked model than under any other source, including humans, does the watermark give a stronger indication.
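To make that concrete, here's a toy sketch of why attribution is a likelihood comparison rather than a single probability. Every source name and every number below is invented -- the point is only that the posterior for the watermarked model is meaningful only relative to the candidate sources you actually enumerated:

    # Toy sketch: attribution compares likelihoods across candidate sources.
    # All sources and numbers are made up for illustration.
    sources = {
        "watermarked_model": 1e-40,  # P(text | source); the watermark inflates this
        "other_llm": 1e-55,
        "human": 1e-52,
    }
    total = sum(sources.values())
    for name, likelihood in sources.items():
        # Posterior under a uniform prior over just these candidate sources.
        print(f"{name:18s} P(source | text) = {likelihood / total:.3g}")

Leave a plausible source out of the dictionary and the posterior for the model is silently overstated.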
You would also need to define a probability curve as a function of output length. The longer the output, the more certain you can be. What is the smallest number of tokens below which nothing can be proved at all?
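For what it's worth, that curve falls out of a simple significance test. A minimal sketch, assuming a green-list style watermark (as in Kirchenbauer et al.) where unwatermarked text hits the green list with probability gamma, and where the watermarked model's 75% green rate is a made-up number:

    from math import erf, sqrt

    def z_score(green_count: int, total_tokens: int, gamma: float = 0.5) -> float:
        """One-proportion z-test: how surprising is this many 'green' tokens
        if the text were unwatermarked (expected green fraction = gamma)?"""
        expected = gamma * total_tokens
        std = sqrt(total_tokens * gamma * (1.0 - gamma))
        return (green_count - expected) / std

    def p_value(z: float) -> float:
        """One-sided p-value under the null hypothesis of no watermark."""
        return 0.5 * (1.0 - erf(z / sqrt(2.0)))

    # Suppose the watermarked model picks green-list tokens 75% of the time.
    for n in (10, 50, 200, 1000):
        z = z_score(int(0.75 * n), n)
        print(f"{n:5d} tokens: z = {z:6.2f}, p = {p_value(z):.2e}")

At 10 tokens the p-value is around 0.1, i.e. nothing is proved; by a few hundred tokens it is vanishingly small. Where on that curve you draw the accusation threshold is a policy choice, not a technical one.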
You would also need to include humans. Can you define that probability for a human? And all LLMs would have to use the same system uniformly.
Otherwise, "watermarking" is doomed to be misused and won't be reliable enough. False accusations will happen.
I agree. I'd add that not only could human-written content fail the test -- it's also the case that humans will detect the word pairing, just as they detected "delve" and various other LLM tells.
In time most forms of watermarking along those lines will seem like elements of an LLM's writing style, and will quickly be edited out by savvy users.