Hypothesis: If the output of LLMs cannot be copyrighted, anything in the training set becomes public domain.

9 Posts 6 Posters 0 Views
    shapr@recurse.social wrote (#1):

    Hypothesis: If the output of LLMs cannot be copyrighted, anything in the training set becomes public domain.

    If the recent attempt to LLM-rewrite chardet [1] holds up, then any copyrighted material can be laundered through an LLM.

    Any inputs sent to an LLM for any purpose become part of the training set.

    Thus, any company using LLMs has put their source code in the public domain.

    But then you also de-copyright research papers, the Windows source code, and fintech?

    [1] https://lucumr.pocoo.org/2026/3/5/theseus/

      ramin_hal9001@fe.disroot.org wrote (#2):

      @jhlagado@jorts.horse here is another example of what you were just talking about: LLM output has no owner, so it can't be copyright-protected or licensed.

      @shapr@recurse.social

        cceckman@hachyderm.io wrote (#3):

        @shapr not how it works. https://ansuz.sooke.bc.ca/entry/23 has a similar example:

          dabeaz@mastodon.social wrote (#4):

          @shapr Saw a discussion about this recently with respect to surveillance tech. For instance, if you asked an LLM to create something similar, would it launder some facsimile of the actual code?

            shapr@recurse.social wrote (#5):

            @cceckman I'll read this, thanks for the link

              shapr@recurse.social wrote (#6):

              @dabeaz I had the idea to run tree-sitter on outputs like the "claude C compiler" and check the similarity at the AST level, but probably won't actually do this.
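
A rough sketch of what an AST-level similarity check could look like. This is not the poster's method: it swaps in Python's built-in `ast` module for tree-sitter, and it only handles Python source. The function names `node_types` and `ast_similarity` are hypothetical. The idea is to reduce each program to its sequence of syntax node types, dropping identifiers and literals, and then compare the two sequences.

```python
import ast
from difflib import SequenceMatcher


def node_types(source: str) -> list[str]:
    """Return the AST node type names of `source` in pre-order.

    Identifiers and literal values are discarded, so renaming
    variables or functions does not change the result.
    """
    def visit(node):
        yield type(node).__name__
        for child in ast.iter_child_nodes(node):
            yield from visit(child)

    return list(visit(ast.parse(source)))


def ast_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the node-type sequences of two programs."""
    matcher = SequenceMatcher(None, node_types(a), node_types(b), autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    original = "def add(x, y):\n    return x + y\n"
    renamed = "def plus(a, b):\n    return a + b\n"
    different = "for i in range(3):\n    print(i)\n"

    # Same structure with different names: the sequences are identical.
    print(ast_similarity(original, renamed))
    # Different structure: the score drops below 1.0.
    print(ast_similarity(original, different))
```

A high score here only says two programs share syntactic shape. It says nothing on its own about whether one is a derived work of the other, and small idiomatic functions will often look alike by this measure.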

                cceckman@hachyderm.io wrote (#7):

                @shapr also, the chardet case, according to this article, appears to be: "the LLM-generated thing may be a derived work of the original". In my understanding,* the fact that the LLM-derived thing may not be copyrightable is irrelevant; it can still infringe.

                If I make an audiobook of _Demon Queen_ and say I "release it into the public domain"... that doesn't make it so; the work still infringes.

                * (I am not a lawyer, this is not legal advice)

                  opticron@eat.fruits.social wrote (#8):
                  @shapr "If the output of LLMs cannot be copyrighted" has load-bearing implications here, namely "and it's perfectly legal to rights-wash literally anything". That implication is unlikely and has yet to be proven. On top of that, rights-washing anything makes it unusable as training data due to LLM backfeeding problems.
                    jhlagado@jorts.horse wrote (#9):

                    @ramin_hal9001 @shapr

                    Yes, it is definitely public domain and therefore has no owner.

                    I suppose if you say "take this program as your blueprint" and let it vibe-copy a clone, then you've effectively converted the licence from restrictive to public domain.

                    This breaks even the most permissive open source licence.
