If you ask AI to rewrite the entirety of an open-source program, do you still need to abide by the original license?
-
If you ask AI to rewrite the entirety of an open-source program, do you still need to abide by the original license? In philosophy, this problem is known as the Slop of Theseus
@lcamtuf The licence goes from «copyleft» to «sloppyleft».
-
@tbortels if you do not accept the license, you do not have any right to use the code. It’s "all rights reserved" then. @lcamtuf @bgalehouse @kevinr
@lcamtuf @ArneBab @kevinr @bgalehouse
"Use" isn't part of the GPL. And "all rights reserved" means normal copyright law, not "you get no rights at all".
The GPL defines "modify" and "propagate" as the activities it burdens. If I modify the code, and propagate it, I have a legal burden under the license. Otherwise, I don't.
IANAL, but I don't think reading the code and re-implementing a work-alike without incorporating the original code is "modify" - it's "replace".
I understand that's where "clean rooms" come into play, but that always felt like splitting hairs and giving copyright too much power - it's about physical books, not ideas. The farther we move from the original intent, the weaker a strong copyright stance becomes.
I think you could make an argument that reading code to understand its interfaces, explicitly rejecting accepting any license, then implementing compatible code is well within the normal copyright definition of "fair use", or should be if we aren't all copyright lawyers. More importantly, it's healthy for Society and the art. If I can read a book under copyright and write a detailed book report, I should be able to read provided source code and do the same. To the extent that we've strayed away from that, the legal system has failed and needs correction.
-
@ArneBab @tbortels @lcamtuf @bgalehouse
Yeah the license applies whether you accept it or not. And whether your spec counts as a derivative work or not will depend greatly on the details of your spec
@kevinr @bgalehouse @lcamtuf @ArneBab
It explicitly does not. If I don't accept the license, normal copyright applies. You don't get to make a legally binding contract without consent, "clickwrap" bullshit aside.
And normal copyright has carve-outs like fair use.
-
@tbortels @bgalehouse @lcamtuf @kevinr Well, yes but no. The point about the spec is the level of detail taken from the original work. If you write an original novel about a wild, big monkey found in a jungle, brought to New York, who escapes and so on, the King Kong author cannot claim any rights to that, sorry. If it were different, many narratives and movies would not exist today. That is inspiration, not derivation. Of course it is fair to declare the inspiration, but call it by the right name.
@lcamtuf @gisgeek @kevinr @bgalehouse
Heh. You might even say that's "fair use"...

-
@revk @ahltorp @tbortels @lcamtuf @bgalehouse @kevinr idk about AI but I've heard more than once that when people are actually implementing something as free software that is originally non free but was either leaked or is source available, they completely restrict themselves from even looking at the thing and only use what any user would know and do some reverse engineering, so I assumed it's actually legally unsafe to taint yourself with original code and let it potentially influence you
@lcamtuf @kevinr @revk @rustynail @ahltorp @bgalehouse
That's the "clean room" that keeps getting thrown around, originally used to try to legally protect free bsd derivatives. The idea was to make the "copy" argument so outlandish it was unsupportable.
It does set a standard, but I'm not sure it's a requirement. That is, reading code to create compatible code seems more of a fair use than an illicit copy. Especially if none of the original code appears in the finished work.
-
@tbortels why would execution be needed to agree? You as a third party don't need to agree to the license, but if it's an open license to have the privilege to edit/reuse the code you have to agree to do it. By default the code is closed, the license opens it up for you, if you somehow don't agree to it you can't use the code at all because it's closed by default
(completely unrelated to the AI thing. fuck AI)
I'm not sure "closed" is the right word. Clearly it's not closed if you are providing it - it's right there, I can read it and even redistribute it without burden.
It's "copyrighted", not closed. You can't modify closed source because you don't have the source. The assertion being made is you can't modify GPL'd open source without accepting the license. But copyright has its own carve-outs, and I am unconvinced that writing a spec or net-new code is a modification, as opposed to regular old copyright fair use.
-
@tbortels @lcamtuf @kevinr @bgalehouse
Where is the edge between inspiration and infringement? Are today's office suites infringing MS rights? Copyright says no, patents (a totally different beast) may say yes in some countries and no in others. So pay attention to what you desire for FOSS, because it could happen in many ways, including some very destructive ones.
-
@tbortels yes, not accepting the license means regular copyrights.
But your arguments afterwards rely on rights the GPL gives you -- you only get them after you accept the license.
EDIT: because "if we aren’t allowed … under copyright" ← we aren’t. That’s the point.
As long as there’s no NDA (there isn’t for GPL), we *can* write a spec. But the one implementing it *must not* know the code.
-
@tbortels you cannot redistribute copyrighted material(?)
If you make a spec of copyrighted code, that's effectively instructions on how to reproduce the code, and it can be used to commercially compete with the owners of the code, so I doubt it could qualify as fair use.
-
@rustynail @ahltorp @tbortels @lcamtuf @bgalehouse @kevinr Hmm, there is another consequence to this.
If this is a derivative work, which I expect it is, it causes issues when someone has, in fact, manually coded an alternative to some copyrighted work (without reading the original code, etc.), because someone can suggest that it was done using AI as a derivative work. The alternative no longer needs to actually follow the original code to be accused of this.
Arrg!
@ahltorp @bgalehouse @revk @lcamtuf @kevinr @rustynail
AI is a weird case as you could assert - probably correctly - that the original code may be part of its training corpus. Was that training a GPL violation? It's a stretch. Was its training a copyright violation? Or was the AI (or rather its owners) exercising their GPL license rights? Or was it fair use under regular copyright?
Who knows?
It's a hot mess is what it is.
This is all so far outside the original reckoning of "it'd be nice if the bookbinder down the street didn't profit off of my work until I had a chance to profit off of it first" that it's not surprising it's a mess.
-
@tbortels if you start relying on fair use, you enter a gray zone: courts will take decisions on that.
You don’t want that as the basis of anything that provides income.
A lawsuit in a gray area can ruin you, even if you’re likely to win.
-
@ArneBab @kevinr @lcamtuf @bgalehouse
Fair use isn't something the GPL grants you. That's what I'm trying to work out - set the GPL aside for a moment.
Does regular copyright fair use give me the right to look at the freely provided source code, make a mental model, and re-implement a workalike if I don't re-use the original source?
Pretend it's just me and not an AI, because that throws a whole new set of confusion into the mix.
BSD did it against regular copyright. Not sure this is all that different.
-
It's about how to reproduce the functionality - the code could be an entirely different language.
And - "commercially compete" with someone giving away code for free seems a non-concern.
-
@ArneBab @lcamtuf @kevinr @bgalehouse
We entered a gray zone about 8 off-ramps ago. Copyright never anticipated self-replicating code on computers and viral licenses and clean-room re-implementations and AIs.
As for income - I've lost track of the original driver, but it's GPL'd free code, no?
I like fair use. It and parody are among the very few things keeping us out of peasants-with-pitchforks-and-torches mode. If you eliminate those carve-outs, the whole system goes down.
-
@tbortels competition is one of the factors that go into what qualifies as fair use, so no, it is not a non-concern. And no, someone publishing their code with open access does not give it away for free wtf
-
@lcamtuf We know that if you ask the CS department of the University of California, Berkeley and BSDi to rewrite the entirety of AT&T's Unix, the result does not need to abide by AT&T's original license.
BSD is prima facie a derivative work of AT&T Unix, not developed using a clean room approach, but instead carefully audited to remove all AT&T copyright and trade secret interests.
By the time Theseus' ship was ready, Linux had left the harbor.
-
@kevinr and proving that the AI was not trained on the original source will be pretty hard, because FLOSS programs with compatible licenses can legally copy code from one project into the other.
You’ll likely have to exclude all code from the project and all code that’s too similar from the training data. And then train an AI from scratch. Which would be extremely expensive.
@ArneBab @kevinr @SnoopJ @bgalehouse @lcamtuf I think it's more complicated. Consider program A licensed under GPL and program B licensed under BSD license. Code from program B can be copied into program A, but code from program A cannot be copied to program B without applying GPL to program B (changing the license). At least that's how it works as I understand it.
-
@lcamtuf@infosec.exchange If the AI has never seen the original code, neither in training data nor as part of a prompt, and if it is just rewriting the program based on the program's "API" (for a command line tool that would be the man page and the --help), then yes - Oracle vs Google definitely applies.
But as that one was quite narrow, I would assume that if even just the internal structure derives from the original program, it is too much.
-
@tbortels as far as I know, and as the article https://www.allaboutcircuits.com/news/how-compaqs-clone-computers-skirted-ibms-patents-and-gave-rise-to-eisa/ reinforces, fair use does not give you the right to re-implement the code.
Doesn’t matter whether you make a mental model as the intermediate step.
Only the clean room re-implementation gets out of that.