If you use AI-generated code, you currently cannot claim copyright on it in the US.
-
Stop whining. You and about seventy zillion terrified sheep running around here bleating about the Terrible AI monster under the bed.
@tuban_muzuru @jamie as a random viewer of this thread, you come off as utterly insufferable, which might not be what you think you come off as, and so you might want to reconsider your behavior
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie it's the same in Germany. You can't copyright anything that isn't created by a human.
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie Anything AI-generated is free, BUT anything AI-generated is also worse than simply worthless.
*shrug* -
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie, so if a code review agent corrects a variable name in a proprietary 5M LOC project and that AI edit is not documented (where?), the entire project becomes public domain?
(Asking not you specifically but to entertain the thought such a law could be written without nuance.)
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie Hmm... this sounds like it's saying is that if a work (e.g. a code base) includes AI-generated content and doesn't identify which parts of it were AI-generated, the whole work and every part of the work, even the human-created parts, become ineligible for copyright. I believe that's wrong. (Maybe misleading, at best, if you meant it in some other way.) I mean, I can't say so authoritatively, since I'm a copyright nerd, not a lawyer, but I'm becoming increasingly convinced the more I look into it. If nothing else, it'd be a quick way to invalidate anybody's copyright on anything by just combining it with some AI-generated content and releasing the combination.
I think a more accurate statement would be that if you fail to disclose which parts were not written by a human, the copyright status of the work is unclear. The human contributions are still copyrighted by their authors, but there are some things that can't be done with the work as a whole without knowing which contributions those are.
-
@fsinn @jamie My understanding was that training an AI model on copyrighted work was fair use, because the actual "distribution"--when the AI generates something from a prompt--uses a diminimus amount of copyrighted content from an individual work, except if the user explicitly prompted something like, "Give me Homer Simpson surfing a space orca," at which point the AI company would throw the user all the way under the bus.
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie@zomglol.wtf Is Windows FOSS now?
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie so windows 11 source code is public domain now?
What about AWS? -
@Azuaron @fsinn The argument has been that the model doesn't contain the copyrighted works directly. Like, you can't grep the model file on disk for a passage from a book it can still somehow reproduce.
It's a ridiculous argument, though, because the models deal in numbers, not text. Those numbers are converted to text for human consumption only, so of course it won't contain the raw text anywhere in the model.
-
@Azuaron @fsinn The argument has been that the model doesn't contain the copyrighted works directly. Like, you can't grep the model file on disk for a passage from a book it can still somehow reproduce.
It's a ridiculous argument, though, because the models deal in numbers, not text. Those numbers are converted to text for human consumption only, so of course it won't contain the raw text anywhere in the model.
-
@max @fsinn @jamie That's not true. Media organisations and individual journalist make a share of their income from granting licenses for secondary use of their digital works, for copying them or for offering them in libraries. Copyright is one of the few bedrocks of income. It doesn‘t vanish through wishful thinking or ignoring it.
-
@jamie I *am* an IP lawyer and I (along with many others) have been saying it for a while, that if the position the “AI” co’s are taking with respect to the legality of scraping “publicly available” materials were true (that all “publicly available” materials are “public domain” free to be used as raw materials without consent required), then copyright ceases to exist and all their own materials will be free for everyone else to use the very first time they’re leaked. That’ll be fun for the co.
@fsinn @jamie Thanks! Obtaining copyright for LLM-generated text is one thing, but I've read an assessment from a German state ministry yesterday that according to national laws copyright infringement by LLMs are passed on to users and text they generate in Germany, in their interpretation. If that holds, consequences might be rather big.
-
FWIW I'm not a lawyer and I'm not recommending that you do this.
Even if companies have no legal standing on copyright, their legal team will try it. It *will* cost you money.But man, oh man, I'm gonna have popcorn ready for when someone inevitably pulls this move.
@jamie Hopefully they won't. If you right now published your company's non-AI code, you can be sure copyright infringement won't the thing that kills you, that's just a cherry on top.
So if you do it with a codebase that has undisclosed AI code, you're still ruining your life except they won't have their cherry on top. Not sure it's worth it but YMMV.
-
@jamie so windows 11 source code is public domain now?
What about AWS? -
@kkarhan Yeah, this is very US-focused. I haven't worked with any lawyers outside the US and I'm not familiar with how copyright works outside the US at all.
However, if the company is in the US and they don't have a huge international presence, they probably aren't able to take legal action anyway.

@jamie @kkarhan
European/German law is similar:German UrhG Par2(2)
„[protected] works […] are only personal, inspired creations“ (quick, dirty translation)There is the special catch with the „inspired“ part. If it is not creative enough, it is not protected. This especially true for paintings („Gebrauchsgrafiken“), e.g. quickly drawn direction-pointing-arrows, texts like „this side up“ are not protected (unless very creatively designed).
IANAL though
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie This is just The Merchant of Venice, but with code instead of flesh.
-
@jamie @kkarhan
European/German law is similar:German UrhG Par2(2)
„[protected] works […] are only personal, inspired creations“ (quick, dirty translation)There is the special catch with the „inspired“ part. If it is not creative enough, it is not protected. This especially true for paintings („Gebrauchsgrafiken“), e.g. quickly drawn direction-pointing-arrows, texts like „this side up“ are not protected (unless very creatively designed).
IANAL though
@vampirdaddy @jamie yeah, cuz in practice, you have "collecting societies" like #GEMA that literally will demand one to evidence there's no content being played that they represent or face huge [retroactive] fines and license payments.
OFC this is #NotLegalAdvice and @wbs_legal, a law firm spechalized in media, did a good writeup on this issue.
It's also the reason why one can buy 8-12hr #samplers with #BackgroundMusic that is "GEMA-free" for €120+ because even a small location will face €300+ in monthy (!) licensing fees if they choose to just play the local radio station (on top of TV/Radio licensing fees!)
- This is also why you get "digital signage screens" which are basically TVs without any tuner in them, because commercial users have to license per device instead of a flat per-household fee and the only way to not be affected by this is by being technically unable to recieve said programming...
- Similarly, this is why many commercial vehicles have no radio in them and why Rivian's amazon delivery vans only have an amplifier with bluetooth in them (so delivery drivers can listen to the navigation instructions on their issued handheld)...
-
@Suiseiseki@freesoftwareextremist.com @jamie@zomglol.wtf I was just saying what was on my last 2 contracts and i never report anything i wrote on company hardware because i think those rules are bs just as much as you do
-
@max @fsinn @jamie That's not true. Media organisations and individual journalist make a share of their income from granting licenses for secondary use of their digital works, for copying them or for offering them in libraries. Copyright is one of the few bedrocks of income. It doesn‘t vanish through wishful thinking or ignoring it.
@christianschwaegerl @fsinn @jamie That's the classical model, yes, and it's unfortunate that they have to rely on such an external influence on their integrity and this needs to change.
And it slowly is, both legally (e.g. publicly financed journalism can be one solution to avoid this conflict of interest) as well as illegally (content is reused without permission for "AI" training, or simply shared online for free so that every human has access to the information)
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf


@jamie@zomglol.wtf Fantastic read – thanks for sharing!