A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327

52 Posts 38 Posters 136 Views
  • etoani@freeradical.zone

    @cwebber I would very much like someone with a legal mind to explain how software licenses interact with yesterday's ruling that AI-generated work is not copyrightable. What exactly is the basis of the copyright here? I hope we get to see someone dive into this.

    kye@tech.lgbt
    wrote last edited by
    #20

    @etoani @cwebber It was a decline to rule. The case they declined to rule on stood, and it focused narrowly on someone trying to get his pet AI recognized as sentient to qualify for authorship under copyright law.

    Where the line is on how much authorship flips "authored parts are copyrightable" to "the whole thing is copyrighted" is still contested and evolving in courts.

    edit: SCOTUS likes to let lawyers duke it out in the district courts and wait until there are enough rulings at that level, especially ones with serious cross-district conflicts, before picking a case to hear.

    • cwebber@social.coop

      omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078

      soapdog@toot.cafe
      wrote last edited by
      #21

      @cwebber that whole relicensing and this slop reply are vomit-inducing.

        dajb@social.coop
        wrote last edited by
        #22

        @soapdog @cwebber It's just the lack of understanding of what an LLM is that makes one's hand want to smack one's forehead. Or, preferably, his.

          rcriii@hostux.social
          wrote last edited by
          #23

          @cwebber I love the sentence "If you are indeed the Mark Pilgrim..." So steeped in bad faith that you assume others are too.

          • cwebber@social.coop

            But really, relicensing a GPL codebase to MIT is uninteresting.

            Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine

            Win-win outcome, no matter how it goes

            cyberia@tilde.zone
            wrote last edited by
            #24

            @cwebber Well, the maintainer's point was that this is "clean room", by which they mean Claude was not given the existing codebase as input. The counterargument is that the existing codebase almost certainly forms part of Claude's training data, so the claim of it being genuinely clean room is bogus. So to make your idea work, you'd have to use the proprietary codebase as training data, rather than as prompt input.

              cyberia@tilde.zone
              wrote last edited by
              #25

              @cwebber And I suspect that if you trained an LLM specifically on that code, a court would probably rule differently from how courts have ruled on LLM-generated code in other cases. Maybe.

                hashbangperl@hachyderm.io
                wrote last edited by
                #26

                @cyberia @cwebber It would need controlled, clean-room training data, training, and context. So yeah, it was trained on the original GPL code and is not a clean-room implementation.

                  npdoty@techpolicy.social
                  wrote last edited by
                  #27

                  @cwebber I cynically fear that the likely outcome is that proprietary copyright holders with lots of lawyers and money could succeed in preventing re-licensing as open source, while copyleft advocates with few resources couldn't actually prevent re-licensing to closed.

                    cstanhope@social.coop
                    wrote last edited by
                    #28

                    @cwebber I think you're going to need one hell of a Kickstarter to fund that one.

                      cstanhope@social.coop
                      wrote last edited by
                      #29

                      @cwebber I'm not sure that's slop, but I won't discount the possibility... 🤔 But this part is funny in the dark humor sort of way:

                      "...explicitly instructed Claude not to base anything on LGPL/GPL-licensed code."

                      So, you see, no problem... 🙄

                        ectopod@hachyderm.io
                        wrote last edited by
                        #30

                        @soapdog @cwebber There is a real issue with people using LLMs to try to brute force their way out of a situation. Make a response that is long enough and plausible enough, and people will roll their eyes and often just give up. I have experienced this directly at work, and it drives me crazy.

                        • cwebber@social.coop

                          Winning option 1: yes, you can vibe code proprietary codebases into the public domain, allowing us to bootstrap proprietary codebases quickly

                          Winning option 2: stopping laundering of copyleft codebases

                          Either of these is an interesting outcome!

                          haste@mastodon.social
                          wrote last edited by
                          #31

                          @cwebber I love the idea of weaponizing their reasoning in support of the working class.

                          Cynically though, I think there’s a third outcome: rules for thee, but not for me. In which Microsoft uses the full weight of their wallet to crush the common person, but is free to steal themselves, to profit off of the open source community. The rest of us are left to victimize each other with little legal recourse.

                          Is it logically consistent? Nope, but that’s the weird timeline we live in.

                            aeva@mastodon.gamedev.place
                            wrote last edited by
                            #32

                            @cwebber these people don't know how to write on their own anymore lol

                              kirtai@tech.lgbt
                              wrote last edited by
                              #33

                              @cwebber
                              If he can't be bothered to write it, why should we bother to read it?

                                msh@coales.co
                                wrote last edited by
                                #34

                                @cwebber I think the only sticking point with this scheme is that the concept of a vibe-coded "clean room implementation" is problematic. Like, have you SEEN Claude's room? It's absolutely FILTHY!

                                  vonubelgarten@mastodon.sdf.org
                                  wrote last edited by
                                  #35

                                  @cwebber Even funnier with *closed source* proprietary Java or C# apps (and Android, perhaps?!), as these can be decompiled to very ugly IR code that can be somewhat usable to guide an LLM!

                                    sprocketclown@mastodon.social
                                    wrote last edited by
                                    #36

                                    @cwebber What constitutes laundering of copyleft codebases?

                                      gumnos@mastodon.bsd.cafe
                                      wrote last edited by
                                      #37

                                      @SprocketClown

                                      The way I read it in this context is that an existing codebase has a license (whether GPL, LGPL, proprietary, or whatever), and that by "laundering" the codebase through an LLM, the output no longer retains the license terms. In the US at least, the Supreme Court has ruled that LLM output is uncopyrightable.

                                      So as @cwebber highlights, either the licensewashing works, in which case LLMs can scrub licenses off proprietary codebases giving a leg up on "reproducing" proprietary codebases into the public domain; or it doesn't work, in which case LLM-produced code becomes subject to the licensing of the original code.

                                        feld@friedcheese.us
                                        wrote last edited by
                                        #38
                                        @cwebber

                                        > Their claim that it is a "complete rewrite" is irrelevant, since they had ample exposure to the originally licensed code (i.e. this is not a "clean room" implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights.

                                        The human didn't write the code, the LLM did. "They" which had "ample exposure to the originally licensed code" does not exist; "they" are ephemeral.

                                        1. Start a fresh session / clean context, and make it meticulously document the architecture, APIs, etc.

                                        2. Keep those documents, throw away the code, start a new session with an LLM that has clean context, and tell it to build off those documents.

                                        That's clean room. If the original code was not in the LLM's context, it's not violating the license.

                                        This is how you can do this. Proving beyond a reasonable doubt he didn't do it this way is going to require a lot of evidence nobody will have.
                                          feld@friedcheese.us
                                          wrote last edited by
                                          #39
                                          @cwebber How is that an "obvious slop response"? I don't see anything odd other than the "core claim" statement, but I would probably have phrased it similarly.