A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327
-
@cwebber Well, the maintainer's point was that this is "clean room", by which they mean Claude was not given the existing codebase as input. The counter argument is that the existing codebase almost certainly forms part of Claude's training data, so the claim of it being genuinely clean room is bogus. So to make your idea work, you'd have to use the proprietary codebase as training data, rather than prompt input.
@cwebber and I suspect that if you made an LLM based on the specific code as training data, a court would probably rule differently to how they have ruled about LLM generated code in other cases. maybe.
-
But really, relicensing a GPL codebase to MIT is uninteresting.
Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine
Win-win outcome, no matter how it goes
@cwebber I cynically fear that the likely outcome is that proprietary copyright holders with lots of lawyers and money could succeed in preventing re-licensing as open source, while copyleft advocates with few resources couldn't actually prevent re-licensing to closed.
-
But really, relicensing a GPL codebase to MIT is uninteresting.
Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine
Win-win outcome, no matter how it goes
@cwebber I think you're going to need one hell of a kickstarter to fund that one.
-
omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078
@cwebber I'm not sure that's slop, but I won't discount the possibility...
But this part is funny in the dark humor sort of way: "...explicitly instructed Claude not to base anything on LGPL/GPL-licensed code."
So, you see, no problem...

-
@cwebber that whole relicensing and this slop reply are vomit inducing.
-
Winning option 1: yes, you can vibe code proprietary codebases into the public domain, allowing us to bootstrap proprietary codebases quickly
Winning option 2: stopping laundering of copyleft codebases
Either of these are interesting outcomes!
@cwebber I love the idea of weaponizing their reasoning in support of the working class.
Cynically though, I think there’s a third outcome: rules for thee, but not for me. In which Microsoft uses the full weight of their wallet to crush the common person, but is free to steal themselves, to profit off of the open source community. The rest of us are left to victimize each other with little legal recourse.
Is it logically consistent? Nope, but that’s the weird timeline we live in.
-
omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078
@cwebber these people don't know how to write on their own anymore lol
-
omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078
@cwebber
If he can't be bothered to write it, why should we bother to read it?
-
But really, relicensing a GPL codebase to MIT is uninteresting.
Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine
Win-win outcome, no matter how it goes
@cwebber I think the only sticking point with this scheme is that the concept of a vibe coded "clean room implementation" is problematic. Like, have you SEEN Claude's room? It's absolutely FILTHY!
-
But really, relicensing a GPL codebase to MIT is uninteresting.
Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine
Win-win outcome, no matter how it goes
@cwebber even funnier with *closed source* proprietary Java or C# apps (and Android, perhaps?!) as these can be decompiled to very ugly IR code that can be somewhat usable to guide an LLM!
-
Winning option 1: yes, you can vibe code proprietary codebases into the public domain, allowing us to bootstrap proprietary codebases quickly
Winning option 2: stopping laundering of copyleft codebases
Either of these are interesting outcomes!
@cwebber What constitutes laundering of copyleft codebases?
-
@cwebber What constitutes laundering of copyleft codebases?
The way I read it in this context is that an existing codebase has a license (whether GPL, LGPL, proprietary, or whatever), and that by "laundering" the codebase through an LLM, the output no longer retains the license terms. In the US at least, courts have so far ruled that purely AI-generated output is uncopyrightable.
So as @cwebber highlights, either the licensewashing works, in which case LLMs can scrub licenses off proprietary codebases giving a leg up on "reproducing" proprietary codebases into the public domain; or it doesn't work, in which case LLM-produced code becomes subject to the licensing of the original code.
-
A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327
@cwebber
> Their claim that it is a "complete rewrite" is irrelevant, since they had ample exposure to the originally licensed code (i.e. this is not a "clean room" implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights.
The human didn't write the code, the LLM did. "They" which had "ample exposure to the originally licensed code" does not exist; "they" are ephemeral.
1. Start a fresh session with a clean context and have the LLM meticulously document the architecture, APIs, etc.
2. Keep those documents, throw away the code, then start a new session with an LLM that has a clean context and tell it to build from those documents.
That's clean room. If the original code was not in the LLM's context, it's not violating the license.
This is how you can do this. Proving beyond a reasonable doubt he didn't do it this way is going to require a lot of evidence nobody will have.
-
omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078
@cwebber how is that an "obvious slop response"? I don't see anything odd other than the "core claim" statement but I would probably have phrased it similarly
-
@cwebber how is that an "obvious slop response"? I don't see anything odd other than the "core claim" statement but I would probably have phrased it similarly
@feld The headings, the emdashes, the framing of sentences, all classic AI "speech patterns" especially in markdown documents
-
@feld The headings, the emdashes, the framing of sentences, all classic AI "speech patterns" especially in markdown documents
@feld the author clearly at least was *assisted* in writing this response
-
A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327
Wild that AI companies just grab open source code and want to "launder" the licenses.

That is exactly the problem with the current AI hype: the big players think they can simply use everything that's on the net. And when things get legally tight, the license just gets changed quickly...
Respect to Mark Pilgrim for pushing back against this! Open source lives on trust and clear rules - not on maneuvers like these.
-
A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327
@cwebber Reading through all the comments there left me wondering if anyone has (yet) hooked up an LLM to be a project maintainer. Interactions via issues and just let it loose. People would be utterly mad to ever include it in their supply chain, and yet people do do mad things.
-
A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327
@cwebber Isn’t this what forks are for?