Yesterday, I boosted something that maybe I shouldn't have, which made at least one OSS maintainer feel personally attacked.

28 Posts 7 Posters 0 Views
  • glyph@mastodon.social

    Yesterday, I boosted something that maybe I shouldn't have, which made at least one OSS maintainer feel personally attacked. I want to make it clear in case anyone else felt the same:

    I think that LLM policies should be short & clear: "Do not use LLMs when working on this project. Don't submit code or comments using them."

    I am not angry with people struggling to navigate this topic differently than I am.

    aeva@mastodon.gamedev.place wrote (#10):

    @glyph I'm torn between wanting to put up a sign that says "If you do this I will hurt your feelings, block you, and report your account as spam" and knowing that stuff like that invites adversarial behavior from people and also I'd rather just start with something to the effect of "remember the golden rule: do not make me have to make new rules" and only add rules to projects as needed (I don't think I've ever gotten an unsolicited pull request from a stranger on any of my projects yet)

    • glyph@mastodon.social

      @miss_rodent I didn't even really see it as criticizing their specific choices, but if you've sweated blood to write a policy which attempts to deal with a coalition of very angry and diametrically opposed factions I can understand that if somebody calls one of the choices you made in that process stupid, you're going to have a strong reaction.

      miss_rodent@girlcock.club wrote (#11):

      @glyph Yeah, it makes sense. I'm def on your side of thinking the correct policy is just "No". Even if you don't care about the code quality or maintainability concerns, and don't care about the environment being burned down over it, and don't care about the plagiarism, the potential legal/license concerns alone make it seem like a bad idea to allow uncopyrightable code into a project that depends on protections from licensing the copyright.

        glyph@mastodon.social wrote (#12, in reply to #10):

        @aeva one irony here is that most of MY projects do not have any such policy in place because I am (for whatever reason) just not on the receiving end of much spam; the offended party here has written what I would otherwise consider very good and detailed policies and thus done more work than I have. Perhaps people just know my feelings from external channels such as this one.

          snoopj@hachyderm.io wrote (#13):

          @glyph @aeva I would guess that the bar for "having an idea of what might be even a hypothetically useful change" is higher in your projects than the average, filtering out a lot of people who aren't going to bother to understand them in the first place.

          (okay, I'm mostly thinking about Twisted, but)

          • glyph@mastodon.social

            @Rataunderground To rephrase the thing I boosted but without calling anyone names this time: a policy can acknowledge the limits of its own enforceability without allowing for those violations.

            For example, even the most lawyer-tested submission-licensing policy has language like "By submitting this code, you certify that you have all the relevant rights to license it to the project under our terms." There's never been (and can never be) a way to enforce that mechanically.

            glyph@mastodon.social wrote (#14):

            @Rataunderground A policy could simply require similar self-certification that they haven't used any LLM tools in the process. It is simultaneously true that:

            1. There's a large swathe of coding work that is rote and tedious and could probably be done with LLM tools totally undetectably if someone were motivated to violate this policy.

            2. The sort of person who would actively seek to do that would almost certainly, eventually, leave *very* obvious evidence of their violations.

              glyph@mastodon.social wrote (#15, in reply to #13):

              @SnoopJ @aeva there's also, like, a popularity threshold thing. Twisted is close to the top of the popularity pile for the stuff I maintain and it's already relatively obscure. People standing directly in the "core infrastructure" line of fire had to deal with a lot more inbound even in the Before Times, let alone now.

                snoopj@hachyderm.io wrote (#16):

                @glyph @aeva yea, very true as well

                  miss_rodent@girlcock.club wrote (#17, in reply to #11):

                  @glyph (not that a policy of "no" can stop all of it -- obviously plagiarism and license violations are already illegal and have already gotten into projects and caused problems.
                  But at least a policy of "No" means a lot fewer LLM things will be submitted, and you can point to the policy to reject or remove the ones that get through, to blacklist violators of that policy from further contributions, etc. etc. - it gives you a filter on one side, and options to handle policy violations on the other.)

                    glyph@mastodon.social wrote (#18, in reply to #11):

                    @miss_rodent I tend to think that the copyright concerns are both
                    A) real, and
                    B) overblown.

                    There's a fair amount of case law at this point that indicates that you can mix in a small amount of human creativity to create something copyrightable. In the context of coding, particularly of open source, the "raw" LLM outputs are generally not even accessible, given that a bunch of human creative choices go into what to submit, and the project as a whole has an umbrella of human creativity generally

                      glyph@mastodon.social wrote (#19):

                      @miss_rodent But, to take the recent example of chardet:

                      While I do not like Bruce Perens's opinion on the copyright status of v7, I think he is *probably* correct on the merits under current jurisprudence. The "clean room" implementation is *probably* going to be ruled either uncopyrightable (public domain) or copyrighted (MIT license) by the "new author" by dint of a few minor choices around its submission and structure.

                      HOWEVER…

                        miss_rodent@girlcock.club wrote (#20, in reply to #18):

                        @glyph I think the more pressing concern is their tendency to produce exact or near-exact copies of training code (I linked a study a few days ago about it ... here https://girlcock.club/@miss_rodent/116190673809741664 ) which ... it's kind of an open question still, how much of a legal liability it is, but, there is enough uncertainty about it that if I were running a donation-funded project, I wouldn't want to risk the legal fees over it until someone with money sets stronger precedents about it.

                          glyph@mastodon.social wrote (#21, in reply to #19):

                          @miss_rodent

                          1. That's only in the US, and

                          2. In this case, we already have a motivated and highly pissed-off litigant in the form of Mark Pilgrim who broke a *decade-long silence* to tell people that if they FA they might FO. A years-long lawsuit where you *win* is still a pretty catastrophic form of "finding out" in this case. There's still a pretty good chance, even if <50%, that he wins if he decides to sue, and even if he doesn't, you don't want to be standing in the blast radius

                            glyph@mastodon.social wrote (#22):

                            @miss_rodent I am a graduate of the University of Debian-Legal myself (go fightin' sea-lions!), so I realize it is from my tenuous perch on the parapet of a glass house that I am hurling this particular stone, but what a lot of open source programmer / amateur legal analysts get wrong is that the MAIN risk of any copyright issue is the presence of a MOTIVATED COUNTERPARTY WITH A CAUSE OF ACTION, way more than any specific legal risk that you might be able to anticipate

                              glyph@mastodon.social wrote (#23):

                              @miss_rodent maybe Pilgrim sues in the Chardet case and absolutely eats dirt on the copyright claims, but it turns out he has some kind of uno-reverse trademark thing that makes them liable for that. Or a patent. There are a zillion ways that such a fight could go sideways for unpredictable reasons.

                                miss_rodent@girlcock.club wrote (#24, in reply to #20):

                                @glyph mitigatable by going over it, making changes by hand, involving a human? Sure, probably.
                                ... How many LLM slop code submissions are actually doing that sufficiently to legally qualify for copyright and as distinct from the original? ... where is the bar for that even going to land? ... per country?
                                Seems to be up in the air still.

                                  miss_rodent@girlcock.club wrote (#25):

                                  @glyph And even if you can win the legal claim... how expensive is it going to be to win it, compared to the cost of just doing the code by hand in the already-established methodology in the first place?

                                    miss_rodent@girlcock.club wrote (#26):

                                    @glyph Like, even if it doesn't win, if the FSF decides to open GPL cases against LLM-generated code (hypothetically), how many will just settle, or remove the code after a cease-and-desist or something like that, vs actually deal with the legal battle over it to find out?
                                    I haven't seen any indication the FSF would do this, exactly, but, they're an activist group holding rights on a lot of GPL software, it's not entirely out of the question that they might try it.

                                      miss_rodent@girlcock.club wrote (#27):

                                      @glyph (I somewhat doubt they will, but, who knows. If they think they can put a strong case together I could see them at least going after non-free projects over it.)

                                      • miss_rodent@girlcock.club

                                        @glyph Honestly, I'm surprised more projects haven't banned it over just the licensing concerns at this point.
                                        No copyright -> LLM generated code not eligible for licensing, unless your project is public domain/public domain-equivalent licensed, then LLM code creates a looming legal clusterfuck. And LLMs trained on GPL code might be *generating* license violations (and legal liability) any time you run them.

                                        moses_izumi@fe.disroot.org wrote (#28):
                                        @miss_rodent @glyph
                                        The long-term implications of outsourcing software development to a proprietary-ass database (roleplaying as a bullshitting chatbot) are honestly dire enough to overshadow the ethical concerns of how said database was compiled.

                                        (I don't give a crap about machine learning or its uses (beyond OCR maybe), but feel free to bring up the more open models)