A new twist in the "AI license laundering of chardet" story https://github.com/chardet/chardet/issues/327

52 Posts 38 Posters 136 Views
  • cwebber@social.coop

    Winning option 1: yes, you can vibe code proprietary codebases into the public domain, allowing us to bootstrap proprietary codebases quickly

    Winning option 2: stopping laundering of copyleft codebases

    Either of these is an interesting outcome!

    haste@mastodon.social
    wrote last edited by
    #31

    @cwebber I love the idea of weaponizing their reasoning in support of the working class.

    Cynically though, I think there’s a third outcome: rules for thee, but not for me. In which Microsoft uses the full weight of their wallet to crush the common person, but is free to steal themselves, to profit off of the open source community. The rest of us are left to victimize each other with little legal recourse.

    Is it logically consistent? Nope, but that’s the weird timeline we live in.

    • cwebber@social.coop

      omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078

      aeva@mastodon.gamedev.place
      wrote last edited by
      #32

      @cwebber these people don't know how to write on their own anymore lol

        kirtai@tech.lgbt
        wrote last edited by
        #33

        @cwebber
        If he can't be bothered to write it, why should we bother to read it?

          • cwebber@social.coop

          But really, relicensing a GPL codebase to MIT is uninteresting.

          Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine

          Win-win outcome, no matter how it goes

          msh@coales.co
          wrote last edited by
          #34

          @cwebber I think the only sticking point with this scheme is the concept of a vibe coded "clean room implementation" is problematic. Like, have you SEEN Claude's room? Is absolutely FILTHY!

            vonubelgarten@mastodon.sdf.org
            wrote last edited by
            #35

            @cwebber even funnier with *closed source* proprietary Java or C# apps (and Android, perhaps?!), as these can be decompiled to very ugly IR code that can be somewhat usable to guide an LLM!

              sprocketclown@mastodon.social
              wrote last edited by
              #36

              @cwebber What constitutes laundering of copyleft codebases?

                gumnos@mastodon.bsd.cafe
                wrote last edited by
                #37

                @SprocketClown

                The way I read it in this context is that an existing codebase has a license (whether GPL, LGPL, proprietary, or whatever), and that by "laundering" the codebase through an LLM, the output no longer retains the license terms. In the US at least, the Copyright Office and the federal courts have held that purely AI-generated output, with no human authorship, is uncopyrightable.

                So as @cwebber highlights, either the licensewashing works, in which case LLMs can scrub licenses off proprietary codebases giving a leg up on "reproducing" proprietary codebases into the public domain; or it doesn't work, in which case LLM-produced code becomes subject to the licensing of the original code.

                  feld@friedcheese.us
                  wrote last edited by
                  #38
                  @cwebber

                  > Their claim that it is a "complete rewrite" is irrelevant, since they had ample exposure to the originally licensed code (i.e. this is not a "clean room" implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights.

                  The human didn't write the code, the LLM did. "They" which had "ample exposure to the originally licensed code" does not exist; "they" are ephemeral.

                  1. Start a fresh session with a clean context and have the LLM meticulously document the architecture, APIs, etc.

                  2. Keep those documents, throw away the code, then start a new session with an LLM that has a clean context and tell it to build from those documents.

                  That's clean room. If the original code was not in the LLM's context, it's not violating the license.

                  This is how you can do this. Proving beyond a reasonable doubt he didn't do it this way is going to require a lot of evidence nobody will have.
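
The two steps above could be sketched roughly like this. It is only an illustration under stated assumptions: `describe` and `implement` are hypothetical stand-ins for the two separate LLM sessions, and `verbatim_overlap` is a made-up helper that checks one narrow thing, namely that the documents handed to the second session do not quote the original code line for line. It says nothing about what a model saw during training, and nothing about whether a court would accept the result as clean room.

```python
from typing import Callable


def verbatim_overlap(source: str, spec: str, min_len: int = 12) -> list[str]:
    """Return source lines (stripped, at least min_len chars) found verbatim in spec."""
    leaked = []
    for line in source.splitlines():
        text = line.strip()
        if len(text) >= min_len and text in spec:
            leaked.append(text)
    return leaked


def two_session_rewrite(
    source: str,
    describe: Callable[[str], str],   # hypothetical: session 1, sees the code
    implement: Callable[[str], str],  # hypothetical: session 2, sees only the docs
) -> str:
    """Session 1 writes documentation; session 2 builds from that documentation alone."""
    spec = describe(source)
    leaked = verbatim_overlap(source, spec)
    if leaked:
        raise ValueError(f"documentation quotes the original code: {leaked}")
    # Only the documentation crosses the boundary; the source is never passed on.
    return implement(spec)


original = "def detect_encoding(data):\n    return guess_from_bytes(data)\n"
clean_spec = "Expose a function that takes bytes and returns the most likely encoding name."
print(verbatim_overlap(original, clean_spec))  # prints []
```

Keeping a log of what each session was given is the kind of evidence that would later show which inputs the second session actually had.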
                    feld@friedcheese.us
                    wrote last edited by
                    #39
                    @cwebber how is that an "obvious slop response"? I don't see anything odd other than the "core claim" statement, but I would probably have phrased it similarly
                      cwebber@social.coop
                      wrote last edited by
                      #40

                      @feld The headings, the emdashes, the framing of sentences, all classic AI "speech patterns" especially in markdown documents

                        cwebber@social.coop
                        wrote last edited by
                        #41

                        @feld the author clearly at least was *assisted* in writing this response

                          ralph_social@dresden.network
                          wrote last edited by
                          #42

                          Wild that AI companies just grab open source code and try to "wash" the licenses. 😤

                          That is exactly the problem with the current AI hype: the big players think they can simply use anything that is on the net. And when things get legally tight, the license just gets changed quickly...

                          Respect to Mark Pilgrim for pushing back! Open source lives on trust and clear rules, not on maneuvers like this.

                          #OpenSource #AIEthics #Licensing

                            gerardthornley@hachyderm.io
                            wrote last edited by
                            #43

                            @cwebber Reading through all the comments there left me wondering if anyone has (yet) hooked up an LLM to be a project maintainer. Interactions via issues and just let it loose. People would be utterly mad to ever include it in their supply chain, and yet people do do mad things.

                              avirr@sfba.social
                              wrote last edited by
                              #44

                              @cwebber Isn’t this what forks are for?

                                vv@solarpunk.moe
                                wrote last edited by
                                #45

                                @feld @cwebber the AI is still trained on the code beforehand

                                  vv@solarpunk.moe
                                  wrote last edited by
                                  #46

                                  @feld @cwebber a "clean context" doesn't mean that there's no training data, it's still trained on a bunch of source code which likely includes the original

                                    feld@friedcheese.us
                                    wrote last edited by
                                    #47
                                    @vv @cwebber proving that the model was trained on the original, or that the original is in the model, is quite difficult, and it is questionable whether it really matters anyway.

                                    Chris Lattner was "trained on" GCC when he wrote LLVM. He studied it a lot. GCC compiles C/C++ code successfully, LLVM compiles C/C++ code successfully.

                                    Both produce completely working machine code and generally you don't *need* one compiler over the other to get an end result that is acceptable.

                                    Should LLVM be allowed to have an Apache license because of this?

                                    These are tough questions.
                                      pikesley@mastodon.me.uk
                                      wrote last edited by
                                      #48

                                      @cwebber I felt my brain getting smoother as I read that

                                        svines@gts.svines.rodeo
                                        wrote last edited by
                                        #49

                                        @cwebber Microslop committed to picking up the legal bill for commercial customers who get sued over copyright issues with AI outputs from Copilot, so one could hypothetically use their tools to "clean room" implement Photoshop and then have Satya fight Adobe for your right to do so. Sounds fun to me!

                                        https://blogs.microsoft.com/on-the-issues/2023/09/07/copilot-copyright-commitment-ai-legal-concerns/

                                          valpackett@social.treehouse.systems
                                          wrote last edited by
                                          #50

                                          @svines @cwebber call that Project Photoslop
