Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?

92 Posts 53 Posters 204 Views
  • jn@boopsnoot.de

    @Foxboron "Ground-up" in the sense of "run through a grinder"

    lindsey@recurse.social
    wrote last edited by
    #57

    @jn @Foxboron That's exactly the sense that I read it in, and it took me a minute to realize that's not what they meant

    • rootwyrm@weird.autos

      @Bubu @Foxboron somebody should inform PSF that in fact, chardet now has NO licensing and cannot be legally copyrighted or trademarked in any jurisdiction.

      The Copyright Office’s Latest Guidance on AI and Copyrightability

      US Copyright Office reaffirms AI-generated works without human creative input are not eligible for copyright protection. Emphasizes human creativity in AI use

      The National Law Review (natlawreview.com)

      https://fingfx.thomsonreuters.com/gfx/legaldocs/zdpxjnmmxpx/USPTO%20AI%20PATENTS%20squires.pdf

      rootwyrm@weird.autos
      wrote last edited by
      #58

      @Bubu @Foxboron oh, and I forgot to mention, it's also guaranteed to have numerous instances of code copied verbatim from other projects. Meaning it is also both infringing and subject to other licenses which are likely to include LGPL, GPLv3, and so on.

      • foxboron@chaos.social

        Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?

        Release 7.0.0 · chardet/chardet

        Python character encoding detector. Contribute to chardet/chardet development by creating an account on GitHub.

        GitHub (github.com)

        That is one way to launder GPL code I guess?

        froge@social.glitched.systems
        wrote last edited by
        #59

        @Foxboron@chaos.social the most hilarious part is that it's not even really MIT licensed, most of this is AI output with no way to distinguish it from human output, in a lot of nations this is machine produced text and just isn't legally valid for anything

        he literally doesn't have the authority to relicense this as MIT no matter how much he wants to, because he's not the copyright holder of the code, a machine created most of it

        • thomasjwebb@mastodon.social

          @Foxboron @scy hol' up... the *output* isn't copyrightable? That would be awesome if they decided that.

          paul@oldfriends.live
          wrote last edited by
          #60

          @thomasjwebb Right now, that is how SCOTUS is leaning regarding AI-generated output. They refused to interfere with a patent application and an "artist" copyright claim, leaving it up to the copyright and patent offices to decide, and both said no. Some guy used AI to create a beverage holder and light beacon. When the patent was denied, he tried to copyright the AI-created "artist" renditions to get around the patent denial.

          @Foxboron @scy

          reuters.com


          https://www.supremecourt.gov/docket/docketfiles/html/public/25-449.html

          • scy@chaos.social

            @Foxboron Yeah but that's what I mean: Just because the end result is not copyrightable, does that automatically mean that it can't be a copyright violation?

            Like, changing the format or medium of something is not a copyrightable work.

            So, by that logic, if I take a copyrighted MP3 and convert it to AAC and publish that, my AAC is not copyrightable, but it's not a copyright violation to take it and publish it?

            That's what I mean.

            bob_zim@infosec.exchange
            wrote last edited by
            #61

            @scy @Foxboron It is absolutely a violation for a company to build a model which emits license-restricted code without following the terms of the license. The model doesn’t commit the violation any more than a photocopier does, of course.

            The emitted code cannot be copyrighted at all, but if it emitted the code in a way which meets the terms of the license, the code would be covered by the original license.

            • douginamug@mastodon.xyz

              @Foxboron https://fosdem.org/2026/schedule/event/SUVS7G-lets_end_open_source_together_with_this_one_simple_trick/ didn't watch this talk yet, but seems relevant!

              EDIT: just watched it. Note: _loads_ of genAI video... feels like my brain is a bit broken. But entertaining. Goes through the history of copyright (from books in the 1700s) through to cleanrooming in the 1970s and then strongly makes the point that cleanrooming is "almost free" now.

              True to the talk title, the talk offers no solutions, ending with "this is the end of open source as we know it" 😕

              douginamug@mastodon.xyz
              wrote last edited by
              #62

              @Foxboron the presenters have a live demo: https://malus.sh/

                gooba42@mastodon.social
                wrote last edited by
                #63

                @Foxboron Except the output can't be copyrighted and so the result is public domain. It can't even be licensed anymore.

                  blogdiva@mastodon.social
                  wrote last edited by
                  #64

                  YUP

                  copyright is for humans, not automata, hard or soft.

                  so, ironically, the prompts are copyrightable but not the output.

                  so anything you want to copyright should not be prompted into a corporate regurgitation machine, including so-called grammar checkers.

                  @thomasjwebb @Foxboron @scy

                    gooba42@mastodon.social
                    wrote last edited by
                    #65

                    @Foxboron Went ahead and added an issue since you can't apply an MIT license to public domain LLM output.

                      zkat@toot.cat
                      wrote last edited by
                      #66

                      @Foxboron that's... not copyrightable, therefore not licensable?

                      • foxboron@chaos.social

                        @joshbressers @scy

                        Sure, but in this case we are not really looking at, nor discussing, an LLM spitting out something verbatim from another project.

                        glyph@mastodon.social
                        wrote last edited by
                        #67

                        @Foxboron @joshbressers @scy verbatim isn’t the question here, the question is infringement. is the output here substantially derivative of previous versions of chardet to the point that it could be considered infringing? US copyright precedent is a muddled mess and I think this could implicate at least one unresolved circuit split. I don’t know what the answer will be but I know I wouldn’t want to be standing in the blast radius of that decision

                          slightlyoff@toot.cafe
                          wrote last edited by
                          #68

                          @Foxboron If you can't copyright it, you can't license the copyright. Interesting times.

                            xgranade@wandering.shop
                            wrote last edited by
                            #69

                            @Foxboron It looks like this was the PR?

                            chardet 7.0: ground-up MIT-licensed rewrite by dan-blanchard · Pull Request #322 · chardet/chardet

                            Python character encoding detector. Contribute to chardet/chardet development by creating an account on GitHub.

                            GitHub (github.com)

                            Even aside from the ethical and moral issues with LLMs, it doesn't seem optimal that a 15k line PR affecting almost a million dependent repos (if GitHub's count is to be believed) was up for three days before getting merged in.

                              xgranade@wandering.shop
                              wrote last edited by
                              #70

                              @aud It's at least not systems code, so there's not a lot of potential for buffer overflow and other memory unsafety exploits, but yeah. No. chardet is not a small surface area.

                                wordshaper@weatherishappening.network
                                wrote last edited by
                                #71

                                @thomasjwebb @Foxboron @scy In the US, at least, human authorship is required for copyright, and if you try to copyright something that's a mix of AI-generated and human-generated material, then generally only the human-generated part is copyrightable.

                                https://www.congress.gov/crs-product/LSB10922#:~:text=Granting%20that%20human%20authors%20may,applying%20to%20register%20their%20copyright.

                                This is separate from the LLMs emitting text other people have written, so at *best* this code can't be licensed because it's not copyrightable, and at worst it's license laundering and there's precedent (IIRC) for stomping on that hard.

                                  aud@fire.asta.lgbt
                                  wrote last edited by
                                  #72

                                  @xgranade@wandering.shop There's just no way that's a good idea. I'm pretty sure a human who tried to push a 15K-line rewrite into most libraries would be yelled at forever, and the PR would be rejected or they'd be asked to break it into smaller PRs, because it's just such a large change in one go and no one can possibly fit that entire thing into their head.

                                  It doesn't magically become a good idea just because claude shat it out.

                                    jens@social.finkhaeuser.de
                                    wrote last edited by
                                    #73

                                    @scy @Foxboron It's a bit complicated, actually. IANAL, but this is what I understand:

                                    - The music notation is copyrightable, individual notes are not. A sequence of notes is debatable, and it depends highly on recognizability AFAIK.

                                    - A music recording is copyrightable. Playing that music in a distinctly different arrangement, less of an issue.

                                    - Arguably, a change in digital format is either still the same recording, or sufficiently indistinguishable from it.

                                    - Copyright has an ancient...

                                      jens@social.finkhaeuser.de
                                      wrote last edited by
                                      #74

                                      @scy @Foxboron ... naming and goes back to a time where making copies and distributing them was the hard part.

                                      This is a non-problem in the digital age, which is why it's fine to create backup copies of copyrighted works, so long as the people accessing them are always the people having purchased/licensed an original copy.

                                      So LLMs training on GPL is not itself a copyright violation, and them reproducing similar code isn't either, but then publishing such sufficiently similar code is.

                                        jens@social.finkhaeuser.de
                                        wrote last edited by
                                        #75

                                        @scy @Foxboron TL;DR what others already wrote: if the result is similar enough to inputs, the copyright holder of the inputs could challenge it, yes.

                                          jens@social.finkhaeuser.de
                                          wrote last edited by
                                          #76

                                          @scy @Foxboron If courts decide to throw this out, I would personally *love* for someone to use the exact same argument to produce a minimally altered copy of Avatar, and have Hollywood throw a fit.
