Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?

92 Posts 53 Posters 204 Views
  • xgranade@wandering.shop wrote (#70):

    @aud It's at least not systems code, so there's not a lot of potential for buffer overflow and other memory unsafety exploits, but yeah. No. chardet is not a small surface area.

    • thomasjwebb@mastodon.social wrote:

      @Foxboron @scy hol' up... the *output* isn't copyrightable? That would be awesome if they decided that.

      wordshaper@weatherishappening.network wrote (#71):

      @thomasjwebb @Foxboron @scy In the US, at least, human authorship is required for copyright, and if you try to copyright something that's a mix of AI-generated and human-generated material, then generally only the human-generated part is copyrightable.

      https://www.congress.gov/crs-product/LSB10922#:~:text=Granting%20that%20human%20authors%20may,applying%20to%20register%20their%20copyright.

      This is separate from the LLMs emitting text other people have written, so at *best* this code can't be licensed because it's not copyrightable, and at worst it's license laundering and there's precedent (IIRC) for stomping on that hard.

        aud@fire.asta.lgbt wrote (#72):

        @xgranade@wandering.shop There's just no way that's a good idea. I'm pretty sure a human who tried to push a 15K-line rewrite into most libraries would be yelled at forever, and the PR would be rejected or they'd be asked to break it into smaller PRs, because it's just such a large change in one go and no one can possibly fit that entire thing into their head.

        It doesn't magically become a good idea just because Claude shat it out.

        • scy@chaos.social wrote:

          @Foxboron Yeah but that's what I mean: Just because the end result is not copyrightable, does that automatically mean that it can't be a copyright violation?

          Like, changing the format or medium of something is not a copyrightable work.

          So, by that logic, if I take a copyrighted MP3 and convert it to AAC and publish that, my AAC is not copyrightable, but it's not a copyright violation to take it and publish it?

          That's what I mean.

          jens@social.finkhaeuser.de wrote (#73):

          @scy @Foxboron It's a bit complicated, actually. IANAL, but this is what I understand:

          - The music notation is copyrightable, individual notes are not. A sequence of notes is debatable, and it depends highly on recognizability AFAIK.

          - A music recording is copyrightable. Playing that music in a distinctly different arrangement, less of an issue.

          - Arguably, a change in digital format is either still the same recording, or sufficiently indistinguishable from it.

          - Copyright has an ancient...

            jens@social.finkhaeuser.de wrote (#74):

            @scy @Foxboron ... naming and goes back to a time when making copies and distributing them was the hard part.

            This is a non-problem in the digital age, which is why it's fine to create backup copies of copyrighted works, so long as the people accessing them are always the people having purchased/licensed an original copy.

            So training LLMs on GPL code is not itself a copyright violation, and their reproducing similar code isn't either, but publishing such sufficiently similar code is.

              jens@social.finkhaeuser.de wrote (#75):

              @scy @Foxboron TL;DR what others already wrote: if the result is similar enough to inputs, the copyright holder of the inputs could challenge it, yes.

                jens@social.finkhaeuser.de wrote (#76):

                @scy @Foxboron If courts decide to throw this out, I would personally *love* for someone to use the exact same argument to produce a minimally altered copy of Avatar, and have Hollywood throw a fit.

                  jens@social.finkhaeuser.de wrote (#77):

                  @scy @Foxboron Basically, let's not fight this, let the industry giants fight each other. Throw a few near-copies of Metallica songs in for good measure, so we get a v2 of that "Napster baaaad" animation with greedy gnome Lars Ulrich.

                    jens@social.finkhaeuser.de wrote (#78):

                    @scy @Foxboron Either LLMs will die on the spot, or Copyright does.

                    • foxboron@chaos.social wrote:

                      Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?

                      Release 7.0.0 · chardet/chardet

                      Python character encoding detector. Contribute to chardet/chardet development by creating an account on GitHub.

                      GitHub (github.com)

                      That is one way to launder GPL code I guess?

                      mutesplash@uncontrollablegas.com wrote (#79):

                      @Foxboron Fun, one of the fundamental problems I have with this technology!

                      • xgranade@wandering.shop wrote:

                        @Foxboron It looks like this was the PR?

                        chardet 7.0: ground-up MIT-licensed rewrite by dan-blanchard · Pull Request #322 · chardet/chardet

                        Python character encoding detector. Contribute to chardet/chardet development by creating an account on GitHub.

                        GitHub (github.com)

                        Even aside from the ethical and moral issues with LLMs, it doesn't seem optimal that a 15k-line PR affecting almost a million dependent repos (if GitHub's count is to be believed) was up for only three days before getting merged in.

                        foxboron@chaos.social wrote (#80):

                        @xgranade
                        They have been the upstream maintainer for years, so I don't see any huge issue with that.

                        I would have done the same probably?

                          xgranade@wandering.shop wrote (#81):

                          @Foxboron Posted an unkind reply and deleted, sorry. I'm getting frustrated with the whole AI thing today, and I'm not being my best self. I should probably just step offline for a bit.

                          This is just so... frustrating.

                          • foxboron@chaos.social wrote:

                            @scy
                            US courts are leaning towards the view that LLM-generated code is fundamentally not copyrightable.

                            This is a different problem to the moral issues I have with this.

                            jti42@infosec.exchange wrote (#82):

                            @Foxboron @scy@chaos.social That'd be the US system. Then there are the various European systems, which differ substantially. I'm certainly curious how this will turn out.

                            On the other hand: it'd require that those who can enforce their rights here actually do so.
                            While IP rights are normally enforced pretty harshly, even on consumers (anyone remember the days of the torrent C&D letters, or the traditional find-and-ban-the-infringing-exhibitor days at Computex et al.?), they're effectively completely ignored when it comes to FOSS.
                            There is virtually no education for business, CS, or law students on this topic, let alone mandatory education.

                            When you present the options and rights to those who hold them, they often dismiss them, especially developers on the younger side or those who are still in a "hobby" / "non-commercial" stage, only to complain about sustainability and demand funding shortly after.

                            Instead we see demands to throw substantial amounts of tax money at random FOSS projects, based on more or less random criteria and evaluators. Which will totally scale, right?

                            Virtually every company that was enforced against in terms of FOSS compliance ended up consciously allocating resources to FOSS in various ways. There are a lot of companies and they are a renewable resource in a functional economy.

                            But what do I know, rite? I just see the cases.
                            /rant

                              jti42@infosec.exchange wrote (#83):

                              @Foxboron Today's new term: "code laundering". I'll keep that one 😆

                                foxboron@chaos.social wrote (#84):

                                @xgranade
                                Yes.

                                But let's not clutch pearls over how an understaffed FOSS project decides to merge their work.

                                  kekunplazas@mamot.fr wrote (#85):

                                  @Foxboron @scy So if it's not copyrightable, what are they using to apply the MIT license, if not their copyright‽ That makes no sense to me. (I'm not reacting to you but to what you shared, to be clear.)

                                    wolfcoder@lagopine.lgbt wrote (#86):

                                    @Foxboron More like laundered without emptying the lint trap. I can't imagine the bugs and vulns a whole AI rewrite would introduce.

                                      foxboron@chaos.social wrote (#87):

                                      Seems like the original author saw this as well.

                                      No right to relicense this project · Issue #327 · chardet/chardet

                                      Hi, I'm Mark Pilgrim. You may remember me from such classics as "Dive Into Python" and "Universal Character Encoding Detector." I am the original author of chardet. First off, I would like to thank the current maintainers and everyone wh...

                                      GitHub (github.com)

                                      Please do not brigade the project.

                                        orca@nya.one wrote (#88):
                                        @Foxboron@chaos.social
                                        "Rewrite the entire codebase"
                                        Argh I can see a disaster coming
                                          davidgerard@circumstances.run wrote (#89):

                                          @Foxboron @xgranade If it was solely his work, he could just change the license. He didn't do that - he felt he had to AI-wash it. That suggests there is in fact other people's work in there, whose copyright he's trying to AI-wash away.
