Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (Cyborg)
  • No Skin
Collapse
Brand Logo

CIRCLE WITH A DOT

  1. Home
  2. Uncategorized
  3. now that i am... writing my own agentic LLM framework thing... because if you're going to have a shitposting IRC bot you may as well go completely overkill, i have Opinions on the state of the world.

now that i am... writing my own agentic LLM framework thing... because if you're going to have a shitposting IRC bot you may as well go completely overkill, i have Opinions on the state of the world.

Scheduled Pinned Locked Moved Uncategorized
51 Posts 18 Posters 0 Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • mirth@mastodon.sdf.orgM mirth@mastodon.sdf.org

    @ariadne I should say by "catch up" I mean to get to parity, my impression is the model research is kind of like drug development where a lot of the cost is paying for all the experiments that don't work, as a result it's much easier to catch up than to get out "ahead" whatever that means. Setting aside the ethical issues, the functional issue of how to effectively use plausible-sounding crap generators as part of reliable software systems remains unsolved.

    P This user is from outside of this forum
    P This user is from outside of this forum
    pinskia@hachyderm.io
    wrote last edited by
    #39

    @mirth @ariadne This here explains why the US companies are so upset with China here.

    ariadne@social.treehouse.systemsA 1 Reply Last reply
    0
    • P pinskia@hachyderm.io

      @mirth @ariadne This here explains why the US companies are so upset with China here.

      ariadne@social.treehouse.systemsA This user is from outside of this forum
      ariadne@social.treehouse.systemsA This user is from outside of this forum
      ariadne@social.treehouse.systems
      wrote last edited by
      #40

      @pinskia @mirth yep they broke the illusion.

      IMO the real reason OpenAI reserved all of this RAM and shit is to prevent competitors from buying it

      jannem@fosstodon.orgJ 1 Reply Last reply
      0
      • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

        @pinskia @mirth yep they broke the illusion.

        IMO the real reason OpenAI reserved all of this RAM and shit is to prevent competitors from buying it

        jannem@fosstodon.orgJ This user is from outside of this forum
        jannem@fosstodon.orgJ This user is from outside of this forum
        jannem@fosstodon.org
        wrote last edited by
        #41

        @ariadne @pinskia @mirth
        What they are doing is forcing competitors to do more with less. Smaller models with a clever architecture, not huge monoliths trained by brute force. Might come back to bite them sooner or later.

        I'd like to see more hybrid models, where the LLM largely sticks to being the language module, and other models (possibly not even NN) specialize in other functions.

        ariadne@social.treehouse.systemsA 1 Reply Last reply
        0
        • jannem@fosstodon.orgJ jannem@fosstodon.org

          @ariadne @pinskia @mirth
          What they are doing is forcing competitors to do more with less. Smaller models with a clever architecture, not huge monoliths trained by brute force. Might come back to bite them sooner or later.

          I'd like to see more hybrid models, where the LLM largely sticks to being the language module, and other models (possibly not even NN) specialize in other functions.

          ariadne@social.treehouse.systemsA This user is from outside of this forum
          ariadne@social.treehouse.systemsA This user is from outside of this forum
          ariadne@social.treehouse.systems
          wrote last edited by
          #42

          @jannem @pinskia @mirth yes, this is what i eventually want to build. a set of libre building blocks for building ethical, libre and personal agentic systems that are self-contained.

          the shit Big AI is doing is not interesting to me, but SLMs and other specialized neural models legitimately provide a useful set of tools to have in the toolbox.

          today, however, I just want to prove the ideas out by shitposting in IRC 😉

          ariadne@social.treehouse.systemsA 1 Reply Last reply
          0
          • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

            @jannem @pinskia @mirth yes, this is what i eventually want to build. a set of libre building blocks for building ethical, libre and personal agentic systems that are self-contained.

            the shit Big AI is doing is not interesting to me, but SLMs and other specialized neural models legitimately provide a useful set of tools to have in the toolbox.

            today, however, I just want to prove the ideas out by shitposting in IRC 😉

            ariadne@social.treehouse.systemsA This user is from outside of this forum
            ariadne@social.treehouse.systemsA This user is from outside of this forum
            ariadne@social.treehouse.systems
            wrote last edited by
            #43

            @jannem @pinskia @mirth that said, i think that OpenAI and other hardware/resource hoarders need to be called out on the fact that they don't need all of this to ship product

            there really is no need to destroy the climate or make professional GPUs cost as much as a recent vintage used car

            1 Reply Last reply
            0
            • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

              @mirth i mean, i don't think that necessarily holds *if* you have the ability to build whatever you need with legos.

              in many cases simply translating natural language to a specification for an expert system is enough

              pixx@merveilles.townP This user is from outside of this forum
              pixx@merveilles.townP This user is from outside of this forum
              pixx@merveilles.town
              wrote last edited by
              #44

              @ariadne
              Yeah, one thing I've wondered is how much simpler a system that, instead of processing code, took the plain english "refactor this to blah blah" and just processed the language and figured out what to tell the IDE and etc for everything else, could be.

              Run a calculator instead of being one - and you have a much simpler problem to solve.

              Could the reliability and ethical problems all be solved -- maybe, i dunno, but - yet another case of "tech could be cool if the harmful parts go away..."

              @mirth

              ariadne@social.treehouse.systemsA 1 Reply Last reply
              0
              • pixx@merveilles.townP pixx@merveilles.town

                @ariadne
                Yeah, one thing I've wondered is how much simpler a system that, instead of processing code, took the plain english "refactor this to blah blah" and just processed the language and figured out what to tell the IDE and etc for everything else, could be.

                Run a calculator instead of being one - and you have a much simpler problem to solve.

                Could the reliability and ethical problems all be solved -- maybe, i dunno, but - yet another case of "tech could be cool if the harmful parts go away..."

                @mirth

                ariadne@social.treehouse.systemsA This user is from outside of this forum
                ariadne@social.treehouse.systemsA This user is from outside of this forum
                ariadne@social.treehouse.systems
                wrote last edited by
                #45

                @pixx @mirth i think small LLMs do not really have an ethical problem: i trained a 1.3B parameter LLM off of my own personal data in my apartment by simply being patient enough to wait. no copyright violations, no boiling oceans, just patience and a professional workstation GPU with 96GB RAM.

                the ethical problem is with the Big AI companies who feel that the only path forward is to make bigger and bigger and bigger monolithic prediction models rather than properly engineer the damn thing.

                that same ethical problem is driving the hoarding, because companies are buying the hardware to prevent their competitors from having it IMO.

                pixx@merveilles.townP 2 Replies Last reply
                0
                • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

                  first of all, when i began i was quite skeptical on commercial AI.

                  this exercise has only made me more skeptical, for a few reasons:

                  first: you actually can hit the "good enough" point for text prediction with very little data. 80GB of low-quality (but ethically sourced from $HOME/logs) training data yielded a bot that can compose english and french prose reasonably well. if i additionally trained it on a creative commons licensed source like a wikipedia dump, it would probably be *way* more than enough. i don't have the compute power to do that though.

                  second: reasoning models seem to largely be "mixture of experts" which are just more LLMs bolted on to each other. there's some cool consensus stuff going on, but that's all there is. this could possibly be considered a form of "thinking" in the framing of minsky's society of mind, but i don't think there is enough here that i would want to invest in companies doing this long term.

                  third: from my own experiences teaching my LLM how to use tools, i can tell you that claude code and openai codex are just chatbots with a really well-written system prompt backed by a "mixture of experts" model. it is like that one scene where neo unlocks god mode in the matrix, i see how all this bullshit works now. (there is still a lot i do not know about the specifics, but i'm a person who works on the fuzzy side of things so it does not matter).

                  fourth: i built my own LLM with a threadripper, some IRC logs gathered from various hard drives, a $10k GPU, a look at the qwen3 training scripts (i have Opinions on py3-transformers) and few days of training. it is pretty capable of generating plausible text. what is the big intellectual property asset that OpenAI has that the little guys can't duplicate? if i can do it in my condo, a startup can certainly compete with OpenAI.

                  given these things, I really just don't understand how it is justifiable for all of this AI stuff to be some double-digit % of global GDP.

                  if anything, i just have stronger conviction in that now.

                  goakam@mastodon.socialG This user is from outside of this forum
                  goakam@mastodon.socialG This user is from outside of this forum
                  goakam@mastodon.social
                  wrote last edited by
                  #46

                  ngl this matches what ive seen running small ops. the hype is way disconnected from whats actually useful day to day. the real value isnt some magic in the model, its finding what problem it actually solves for your specific situation. most companies just buying in because theyre afraid of missing out.

                  1 Reply Last reply
                  0
                  • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

                    first of all, when i began i was quite skeptical on commercial AI.

                    this exercise has only made me more skeptical, for a few reasons:

                    first: you actually can hit the "good enough" point for text prediction with very little data. 80GB of low-quality (but ethically sourced from $HOME/logs) training data yielded a bot that can compose english and french prose reasonably well. if i additionally trained it on a creative commons licensed source like a wikipedia dump, it would probably be *way* more than enough. i don't have the compute power to do that though.

                    second: reasoning models seem to largely be "mixture of experts" which are just more LLMs bolted on to each other. there's some cool consensus stuff going on, but that's all there is. this could possibly be considered a form of "thinking" in the framing of minsky's society of mind, but i don't think there is enough here that i would want to invest in companies doing this long term.

                    third: from my own experiences teaching my LLM how to use tools, i can tell you that claude code and openai codex are just chatbots with a really well-written system prompt backed by a "mixture of experts" model. it is like that one scene where neo unlocks god mode in the matrix, i see how all this bullshit works now. (there is still a lot i do not know about the specifics, but i'm a person who works on the fuzzy side of things so it does not matter).

                    fourth: i built my own LLM with a threadripper, some IRC logs gathered from various hard drives, a $10k GPU, a look at the qwen3 training scripts (i have Opinions on py3-transformers) and few days of training. it is pretty capable of generating plausible text. what is the big intellectual property asset that OpenAI has that the little guys can't duplicate? if i can do it in my condo, a startup can certainly compete with OpenAI.

                    given these things, I really just don't understand how it is justifiable for all of this AI stuff to be some double-digit % of global GDP.

                    if anything, i just have stronger conviction in that now.

                    iswyrm@mastodon.unoI This user is from outside of this forum
                    iswyrm@mastodon.unoI This user is from outside of this forum
                    iswyrm@mastodon.uno
                    wrote last edited by
                    #47

                    @ariadne I do not talk as an educated in the field, but my wild guess, the AI craze is like the evolution of cloud computing business model that some corporations are running from a decade or more.
                    A way to move workflow into their services even when this workflow could be done offline.

                    1 Reply Last reply
                    0
                    • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

                      @pixx @mirth i think small LLMs do not really have an ethical problem: i trained a 1.3B parameter LLM off of my own personal data in my apartment by simply being patient enough to wait. no copyright violations, no boiling oceans, just patience and a professional workstation GPU with 96GB RAM.

                      the ethical problem is with the Big AI companies who feel that the only path forward is to make bigger and bigger and bigger monolithic prediction models rather than properly engineer the damn thing.

                      that same ethical problem is driving the hoarding, because companies are buying the hardware to prevent their competitors from having it IMO.

                      pixx@merveilles.townP This user is from outside of this forum
                      pixx@merveilles.townP This user is from outside of this forum
                      pixx@merveilles.town
                      wrote last edited by
                      #48

                      @ariadne
                      Yeah the hoarding one seems pretty obvious

                      I wonder whether openai can affkrd to hold onto so many chips for more than a year orntwo...
                      @mirth

                      1 Reply Last reply
                      0
                      • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

                        first of all, when i began i was quite skeptical on commercial AI.

                        this exercise has only made me more skeptical, for a few reasons:

                        first: you actually can hit the "good enough" point for text prediction with very little data. 80GB of low-quality (but ethically sourced from $HOME/logs) training data yielded a bot that can compose english and french prose reasonably well. if i additionally trained it on a creative commons licensed source like a wikipedia dump, it would probably be *way* more than enough. i don't have the compute power to do that though.

                        second: reasoning models seem to largely be "mixture of experts" which are just more LLMs bolted on to each other. there's some cool consensus stuff going on, but that's all there is. this could possibly be considered a form of "thinking" in the framing of minsky's society of mind, but i don't think there is enough here that i would want to invest in companies doing this long term.

                        third: from my own experiences teaching my LLM how to use tools, i can tell you that claude code and openai codex are just chatbots with a really well-written system prompt backed by a "mixture of experts" model. it is like that one scene where neo unlocks god mode in the matrix, i see how all this bullshit works now. (there is still a lot i do not know about the specifics, but i'm a person who works on the fuzzy side of things so it does not matter).

                        fourth: i built my own LLM with a threadripper, some IRC logs gathered from various hard drives, a $10k GPU, a look at the qwen3 training scripts (i have Opinions on py3-transformers) and few days of training. it is pretty capable of generating plausible text. what is the big intellectual property asset that OpenAI has that the little guys can't duplicate? if i can do it in my condo, a startup can certainly compete with OpenAI.

                        given these things, I really just don't understand how it is justifiable for all of this AI stuff to be some double-digit % of global GDP.

                        if anything, i just have stronger conviction in that now.

                        wombatpandaa@mastodon.socialW This user is from outside of this forum
                        wombatpandaa@mastodon.socialW This user is from outside of this forum
                        wombatpandaa@mastodon.social
                        wrote last edited by
                        #49

                        @ariadne I've been skeptical of it from the beginning as well - in part because of a delightfully weird project called Neuro. She's an AI virtual YouTuber who can autonomously stream, sing karaoke, play Minecraft, interact with guests, call and message friends on discord, talk to her chat, and more, all before the recent LLM boom. Which corporation was responsible for this marvel of modern engineering? None of them. A single British dude made her out of an osu! bot because he felt like it.

                        1 Reply Last reply
                        0
                        • ariadne@social.treehouse.systemsA ariadne@social.treehouse.systems

                          @pixx @mirth i think small LLMs do not really have an ethical problem: i trained a 1.3B parameter LLM off of my own personal data in my apartment by simply being patient enough to wait. no copyright violations, no boiling oceans, just patience and a professional workstation GPU with 96GB RAM.

                          the ethical problem is with the Big AI companies who feel that the only path forward is to make bigger and bigger and bigger monolithic prediction models rather than properly engineer the damn thing.

                          that same ethical problem is driving the hoarding, because companies are buying the hardware to prevent their competitors from having it IMO.

                          pixx@merveilles.townP This user is from outside of this forum
                          pixx@merveilles.townP This user is from outside of this forum
                          pixx@merveilles.town
                          wrote last edited by
                          #50

                          @ariadne
                          Mostly agree, but mosy purposes for automated text generation that I've seen are either toys or evil
                          @mirth

                          ariadne@social.treehouse.systemsA 1 Reply Last reply
                          0
                          • pixx@merveilles.townP pixx@merveilles.town

                            @ariadne
                            Mostly agree, but mosy purposes for automated text generation that I've seen are either toys or evil
                            @mirth

                            ariadne@social.treehouse.systemsA This user is from outside of this forum
                            ariadne@social.treehouse.systemsA This user is from outside of this forum
                            ariadne@social.treehouse.systems
                            wrote last edited by
                            #51

                            @pixx @mirth yes, i agree that the main usecase for automated text generation is antisocial stuff like spam. what i am pursuing is more "language as I/O" than text generation. think Siri.

                            1 Reply Last reply
                            0
                            • R relay@relay.infosec.exchange shared this topic
                            Reply
                            • Reply as topic
                            Log in to reply
                            • Oldest to Newest
                            • Newest to Oldest
                            • Most Votes


                            • Login

                            • Login or register to search.
                            • First post
                              Last post
                            0
                            • Categories
                            • Recent
                            • Tags
                            • Popular
                            • World
                            • Users
                            • Groups