Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that

194 Posts 82 Posters 15 Views
  • ignaloidas@not.acu.lt

    @mnl@hachyderm.io @mjg59@nondeterministic.computer @david_chisnall@infosec.exchange @newhinton@troet.cafe the training objective is not "be correct", so that's not what the models are trained on. They aren't trained on such an objective because there's no way to score it - if you had a system that could determine whether a statement was correct, then you could just use that. No, what the models are trained on are globs of existing text, with the target being continuations that match that text. Notably, most (all?) LLM makers don't even care whether most of the text is "correct" (in any sense of the word), and "solve" it by training on some more carefully selected globs of text. And in the end, what the model itself outputs are probabilities of a specific token (not even a sentence or something) being next. The text you get is all just dice rolls on those probabilities, again and again.

    It is a text prediction machine. A very powerful one, but it's just a prediction. It just picks whatever is likely, with no regard for what is correct.
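
For readers who want the "probabilities, then dice rolls" step spelled out, here is a minimal sketch in Python. The vocabulary and scores are made up for illustration, and the scores are fixed, whereas a real model recomputes them from the context at every step. It only shows the sampling mechanics described in the post, not how any particular model is trained.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Roll the dice once: pick one token according to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores (assumptions, not real model output).
VOCAB = ["cat", "dog", "the", "."]
LOGITS = [2.0, 1.0, 0.5, -1.0]

def generate(n_tokens, seed=0):
    """Sample n_tokens in a row; a fixed seed makes the rolls repeatable."""
    rng = random.Random(seed)
    return [sample_next_token(VOCAB, LOGITS, rng=rng) for _ in range(n_tokens)]
```

Calling `generate(5)` with different seeds gives different sequences from the same probabilities, while lowering `temperature` makes the most likely token win more often.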

    mnl@hachyderm.io
    wrote last edited by
    #150

    @ignaloidas @mjg59 @david_chisnall @newhinton that’s also not how current llms work, there is a significant amount of post-training using RL being done, and that too is a whole field of research.

    Furthermore, current LLM-based tools usually do multiple rounds of inference interspersed with more traditional “tool calls” (or, as I prefer to call it, interpreting sampled tokens in a deterministic/formal manner).
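
A rough sketch of what "rounds of inference interspersed with tool calls" can look like, in Python. Everything here is hypothetical: `fake_model` stands in for an LLM, the JSON message shape and the `TOOLS` registry are assumptions for illustration, and no real vendor API is being described. The point is only that the sampled text is parsed and acted on deterministically, and the result is appended to the context for the next round.

```python
import json

def add(a, b):
    return a + b

# Hypothetical tool registry; real tools would be a compiler, a search, etc.
TOOLS = {"add": add}

def fake_model(context):
    """Stand-in for an LLM: asks for a tool first, then gives a final answer."""
    if not any(line.startswith("TOOL_RESULT") for line in context):
        return json.dumps({"tool": "add", "args": [2, 3]})
    result = context[-1].split(" ", 1)[1]
    return json.dumps({"final": f"The sum is {result}"})

def run(prompt, model=fake_model, max_rounds=5):
    """Alternate between model output and deterministic tool execution."""
    context = [prompt]
    for _ in range(max_rounds):
        message = json.loads(model(context))  # formal parsing of sampled text
        if "final" in message:
            return message["final"]
        result = TOOLS[message["tool"]](*message["args"])
        context.append(f"TOOL_RESULT {result}")  # fed back in for the next round
    raise RuntimeError("no final answer within max_rounds")
```

With the stub above, `run("What is 2 + 3?")` takes two rounds: one that requests the `add` tool and one that reads its result back from the context.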

    • mjg59@nondeterministic.computerM mjg59@nondeterministic.computer

      Personally I'm not going to literally copy code from a codebase under an incompatible license because that is what the law says, but have I read proprietary code and learned the underlying creative aspect and then written new code that embodies it? Yes! Anyone claiming otherwise is lying!

      mutesplash@uncontrollablegas.com
      wrote last edited by
      #151

      @mjg59 Learning from and adapting ideas from unlicensed code into new code is an accommodation under law for humans. If you built a machine to do this at scale, however, that's a choice to leverage a humane decision into a profitable hack.

      • mnl@hachyderm.ioM mnl@hachyderm.io

        @ced @david_chisnall @mjg59 @ignaloidas which search engine do you use? I use @kagihq and it’s always a pleasure.

        Llms can provide information about sources. If they tell me that Shannon said x in his thesis on p.463 I can look it up. If they tell me that variable foo is on line X in file Y, I can easily verify it. If they think that Z compiles, I don’t even need to cross check that, the computer can do it for me. In fact verifying certain assumptions about code might be the easiest of them all, which is why llms are quite effective at writing code.
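
As an illustration of "the computer can do it for me", here is a small Python sketch that checks a candidate function mechanically: first whether it compiles, then whether it returns the expected values on a few cases. The helper name `check_candidate` and the test cases are assumptions made up for this example, and passing such checks only shows the code agrees with those cases, not that it is correct in general.

```python
def check_candidate(source, func_name, cases):
    """Return (ok, reason) for a candidate function supplied as source text."""
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return False, f"does not compile: {exc.msg}"
    namespace = {}
    exec(code, namespace)  # only do this with code you are willing to run
    func = namespace.get(func_name)
    if not callable(func):
        return False, f"{func_name} is not defined"
    for args, expected in cases:
        got = func(*args)
        if got != expected:
            return False, f"{func_name}{args} returned {got!r}, expected {expected!r}"
    return True, "all checks passed"
```

For example, `check_candidate("def inc(x):\n    return x + 1\n", "inc", [((1,), 2)])` passes, while a version that fails to parse or returns `x + 2` is rejected with a reason.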

        mnl@hachyderm.io
        wrote last edited by
        #152

        @ced @david_chisnall @mjg59 @ignaloidas @kagihq to the search engine thing, one reason I think that they’re usually more problematic to use is that there’s actually incentives to make results worse. I switched to Kagi from google/duckduckgo before ChatGPT because the results were already complete trash.

        Sure, I have to pay by the search, but that’s the only business model that at least enables non-gameable results.

          ced@mapstodon.space
          wrote last edited by
          #153

          @mnl @david_chisnall @mjg59 @ignaloidas @kagihq
          sure, but if I have to check every sentence, because even if 99 of them are correct I can't trust that the 100th will be, doesn't that rather defeat the point? If I'm not reading a primary source, I have to be sure that I can trust the synthesis (at least to a point). With LLMs I can't.

          • mjg59@nondeterministic.computerM mjg59@nondeterministic.computer

            Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
            LLMs: (enable that)
            Free software people: Oh no not like that

            tglman@techdon.dev
            wrote last edited by
            #154

            @mjg59
            LLMs able to produce software are, as of today, free neither in cost nor in freedom. That would be OK as a temporary step, but not as a long-term solution. A free LLM whose source data is free and which an individual could retrain independently could be a solution, but as of today there is no technical solution available to individuals who aren't millionaires.

              ignaloidas@not.acu.lt
              wrote last edited by
              #155

              @mnl@hachyderm.io @mjg59@nondeterministic.computer @david_chisnall@infosec.exchange @newhinton@troet.cafe all of that training is still continuation-based, because that is what the models predict. Yes, there is a bunch of research, and honestly, most of it is banging its head against fundamental issues of the model, but it is still being funded because LLMs are, at the end of it all, quite useless if they just spit out nonsense from time to time and it's indistinguishable from sensible stuff without carefully cross-checking it all.

              Tool calls are just that - tools to add stuff into the context for further prediction, but they in no way do anything to make sure that the LLM output is correct, because once again - everything is treated as a continuation after the tool call, and it's just predicting what's the most likely thing to do, not what's the correct thing to do.

              • mjg59@nondeterministic.computerM mjg59@nondeterministic.computer

                When I write code I am turning a creative idea into a mechanical embodiment of that idea. I am not creating beauty. Every line of code I write is a copy of another line of code I've read somewhere before, lightly modified to meet my needs. My code is not intended to evoke emotion. It does not change how people think about the world. The idea→code pipeline in my head is not obviously distinguishable from the prompt→code process in an LLM

                boydstephensmithjr@hachyderm.io
                wrote last edited by
                #156

                @mjg59

                > When I write code I am turning a creative idea into a mechanical embodiment of that idea. I am not creating beauty

                When *I* code, I am creating beauty, or at least trying to.

                I hope each proof/program I write is as close to the proof from "the book" as possible. At a Pareto optimum of simplicity and elegance.

                  mnl@hachyderm.io
                  wrote last edited by
                  #157

                  @ignaloidas @mjg59 @david_chisnall @newhinton do you blindly trust code just because it’s been written by a human? Or your own code for that matter? I don’t, and yet I am able to produce hopefully useful software. In fact I have to trust an immense amount of software without verifying it, based on vibes. For llms at least I can benchmark the vibes, or at least more easily gather empirical observations than with humans.

                  • mjg59@nondeterministic.computerM mjg59@nondeterministic.computer

                    Look, coders, we are not writers. There's no way to turn "increment this variable" into life changing prose. The creativity exists outside the code. It always has done and it always will do. Let it go.

                    bsandro@bsd.network
                    wrote last edited by
                    #158

                    @mjg59

                    The pragmatic standpoint is completely valid, but don't forget why we have writing systems: to convey information. That's the basic need. So, taking the same pragmatic approach, we don't need writers or poets or prose or anything of the sort: language exists to transfer data from human to human, and don't you dare find any of that serialization into English (or anything else) beautiful. Is that it?

                      ignaloidas@not.acu.lt
                      wrote last edited by
                      #159

                      @mnl@hachyderm.io @mjg59@nondeterministic.computer @david_chisnall@infosec.exchange @newhinton@troet.cafe Not blindly, of course, but I build up trust relationships with people I work with. And I do trust my own code to a certain extent. I can't trust a bunch of dice. The fact that you don't trust your own code at all honestly tells me all I ever need to know about you.

                        mnl@hachyderm.io
                        wrote last edited by
                        #160

                        @ignaloidas @mjg59 @david_chisnall @newhinton how did you gain your confidence? How can you call machine learning a bunch of dice? I try to study and build things every day, and yes, I don’t trust my code at all, which I think is a healthy attitude to have? I am definitely not able to produce perfect code on the first try.

                        • kyle@mastodon.kylerank.inK kyle@mastodon.kylerank.in

                          @mjg59 You will get backlash, but you are right.

                          Free software folks will have to decide whether what they really wanted was *everyone* to have the freedom to use and modify software, or only that subset of everyone who had the privilege of learning software development.

                          There has always been this elitist dividing line in the community between people who contribute code, and people who contribute all the other things FOSS needs to thrive. Now those people can contribute code too.

                          zachdecook@social.librem.one
                          wrote last edited by
                          #161

                          @kyle @mjg59 Proprietary tooling is the reason "Stallman was right" about Bitkeeper, but "everyone was better off for having not listened to him" is the pragmatic side.
                          Yes, I want people to benefit from the freedom to modify code, but they will never truly be free if they are using a proprietary LLM to make their modifications.

                          • mnl@hachyderm.ioM mnl@hachyderm.io

                            @david_chisnall @mjg59 @ignaloidas I have encountered plenty of people and books that were wrong, so I still have to engage my brain and double check, though.

                            engideer@tech.lgbt
                            wrote last edited by
                            #162

                            @mnl @david_chisnall @mjg59 @ignaloidas "Because people can be wrong, there's zero difference between asking an expert and a rando about a subject."

                            That's essentially your position. I assume you also support RFK Jr. leading the HHS? After all, medical doctors can be wrong too!

                            • chris_evelyn@fedi.chris-evelyn.deC chris_evelyn@fedi.chris-evelyn.de

                              @mjg59 Yeah, as soon as there’s an ethically sourced and trained free LLM that’s not controlled by very shitty companies I’m totally on board with you.

                              Until then we shouldn’t let that shit near our projects.

                              light@noc.social
                              wrote last edited by
                              #163

                              @chris_evelyn
                              What do you mean by "ethically sourced and trained"?
                              @mjg59

                                mnl@hachyderm.io
                                wrote last edited by
                                #164

                                @engideer @david_chisnall @mjg59 @ignaloidas I don’t think LLMs are “rando”. They have randomized elements during training and inference, but they’re not a random number generator. I also would trust a “rando” less than an expert in real life. I wouldn’t trust either of them blindly, though.

                                  chris_evelyn@fedi.chris-evelyn.de
                                  wrote last edited by
                                  #165

                                  @light At minimum that:

                                  • all input material is legit - either public domain or fairly paid for
                                  • all labeling/curating is done under good labor conditions

                                  @mjg59

                                    bazkie@beige.party
                                    wrote last edited by
                                    #166

                                    @mjg59 LLMs do not enable that at all tho? an LLM enables people to make software behave as they wish similarly to a crowbar enabling people to open a door

                                    • promovicz@chaos.socialP promovicz@chaos.social

                                      @mjg59 What you propose is actually illegal, even if the law doesn’t make much sense. I wonder if you ever had the cops sent after you on a corp-run IP case… maybe it would make you feel different?

                                      light@noc.social
                                      wrote last edited by
                                      #167

                                      @promovicz
                                      Let's hope the AI lobby will (in any combination of purposely and inadvertently) make that law obsolete.
                                      @mjg59

                                        jordan@mastodon.subj.am
                                        wrote last edited by
                                        #168

                                        @mjg59 I think the issue is more on the forcing of LLMs/AI in *everything* right now, not specifically F/OSS projects. It reeks of dot-com bubble era marketing and in many cases is completely unnecessary.

                                          mnl@hachyderm.io
                                          wrote last edited by
                                          #169

                                          @engideer @david_chisnall @mjg59 @ignaloidas also I didn’t say anything of what you quoted, and I don’t know where you got it from.
