gemma 4 e4b isn't half shabby, but i didn't think it would run in llama.cpp-vulkan in ubuntu on this lenovo yoga laptop with an AMD Radeon 860M GPU.

s0up
56 Posts 7 Posters 0 Views
  • lritter@mastodon.gamedev.place

    @neo you should know it's not the size that matters. 😏

    neo@soc.psynet.me
    wrote last edited by
    #24

    @lritter Yeah yeah, just use your VRAM smarter. That's what Nvidia said when they released another 8 GB card. 😜

    • lritter@mastodon.gamedev.place

      @allo

      - i'm aware. this is all new. new llama, new files. i use the exact temperature, top k etc. config as suggested by the vendor. examples in this thread were all 26b based. 34b is too slow for tools.

      - i would rather have my fingernails pulled out than put this in an IDE and compromise integrity & copyright. this is strictly entertainment.

      - i doubt the speed is the same. i'm going to try a qwen 3.5 35B A3B, let's see if it can understand my work. i doubt it.

      - agree on e4b.

      lritter@mastodon.gamedev.place
      wrote last edited by
      #25

      @allo i set up the qwen model i mentioned with the settings recommended for coding work. it is slower but not impossibly slow. 12t/s

      i had it examine the nudl directory, read the sx docs, etc.

      tutorial is also full-blown wrong.

      (fun fact: when i scolded gemma for the bad quality of it earlier, it wrote it again, and this time, more things were correct.)

      but this is a joke. i expect one shot perfection.

        lritter@mastodon.gamedev.place
        wrote last edited by
        #26

        @allo i also told qwen it did a bad job and now it wants to know what it did wrong? if i could only explain, it would understand.

        goes to show: these models can only help you when you're not doing anything interesting.

          lritter@mastodon.gamedev.place
          wrote last edited by
          #27

          @allo qwen 3.5

            allo@chaos.social
            wrote last edited by
            #28

            @lritter I am not sure what frontend you are using there. I think one of the advantages of kilocode (or roo) is that it provides good tools for dissecting the source and well-thought-out system prompts. A one-shot in the web interface doesn't do the same as a command in kilocode.

            Yeah, 27B/34B dense are too slow for me, too, but the MoE work for me. I need to reevaluate Gemma 4 after the latest fixes, it may now perform better.

            And I guess having AI work with a novel programming language is hard.

              allo@chaos.social
              wrote last edited by
              #29

              @lritter For the rest: I know you are not too fond of LLMs or AI, and I guess we don't need to discuss this in detail. But for me, they do well within the range that one can expect of them, even for one-shotting medium sized scripts.

              My take is that these things won't go away, so one should take what's useful and leave the rest. And don't fall for the hyped things like Openclaw.

                lritter@mastodon.gamedev.place
                wrote last edited by
                #30

                @allo it's because they are not really good at doing mental transfer work themselves. they are not intelligent in any meaningful way. they just know what fits best. for many tasks, that is exactly what you want. but when it comes to what *feels* best... they're just like high functioning autists doing a hell of a masking job.

                  allo@chaos.social
                  wrote last edited by
                  #31

                  @lritter I've once read they are a multiplier. Making the dumb people dumber and the clever people more clever.

                  Like you can outsource things and blindly believe the output and fail hard, or you know exactly how to use them and speed up your work a lot.

                  Another interesting aspect: First people reported burnout from using LLMs, because they are much more productive, and that led to doing much more in a day than they would when doing things themselves, while the work is still mentally straining.

                    lritter@mastodon.gamedev.place
                    wrote last edited by
                    #32

                    @allo i know of that aspect.

                    > Making the dumb people dumber and the clever people more clever.

                    yes but which of the two am i!

                      allo@chaos.social
                      wrote last edited by
                      #33

                      @lritter
                      The AI assisted 10x engineer, I guess.

                        lritter@mastodon.gamedev.place
                        wrote last edited by
                        #34

                        @allo all this sounds more like mythbuilding to me than truth.

                          allo@chaos.social
                          wrote last edited by
                          #35

                          @lritter
                          Be the zero, it's not affected by multipliers! 🙂

                            allo@chaos.social
                            wrote last edited by
                            #36

                            @lritter
                            No idea, but I think it is plausible that doing more even with a tool is more stressful than doing less by hand. I think it was particularly about coding work.

                              lritter@mastodon.gamedev.place
                              wrote last edited by
                              #37

                              @allo well it turns you into a bit of a CEO. so it would be logical that you get the same problems as one. which predicts an eventual coke habit 😉

                              • lritter@mastodon.gamedev.place

                                jarvis, err i mean gemma can now do the original example i proposed.

                                i added tools to:
                                * get date and time
                                * write to file in a special bucket dir
                                * append to file in the bucket dir
                                * read files (completely)
                                * change directory
                                * list directory
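
For readers wondering what such a tool set might look like, here is a minimal sketch in Python. It is not the poster's code: the harness language, the `make_tools` factory, the tool names and the way the bucket dir is enforced are all assumptions made for illustration.

```python
import os
from datetime import datetime
from pathlib import Path


def make_tools(bucket_dir: str) -> dict:
    """Build six tools; writes and appends are confined to bucket_dir (hypothetical design)."""
    bucket = Path(bucket_dir).resolve()

    def in_bucket(name: str) -> Path:
        # Resolve the name and refuse anything that escapes the bucket dir.
        path = (bucket / name).resolve()
        if not path.is_relative_to(bucket):
            raise ValueError(f"{name!r} is outside the bucket dir")
        path.parent.mkdir(parents=True, exist_ok=True)
        return path

    def get_datetime() -> str:
        # Current local date and time in ISO 8601 form.
        return datetime.now().isoformat(timespec="seconds")

    def write_file(name: str, text: str) -> int:
        # Overwrites the file; returns the number of characters written.
        return in_bucket(name).write_text(text, encoding="utf-8")

    def append_file(name: str, text: str) -> int:
        with in_bucket(name).open("a", encoding="utf-8") as handle:
            return handle.write(text)

    def read_file(path: str) -> str:
        # Reads the whole file; not limited to the bucket dir.
        return Path(path).read_text(encoding="utf-8")

    def change_directory(path: str) -> str:
        os.chdir(path)
        return os.getcwd()

    def list_directory(path: str = ".") -> list:
        return sorted(entry.name for entry in Path(path).iterdir())

    tools = (get_datetime, write_file, append_file,
             read_file, change_directory, list_directory)
    return {tool.__name__: tool for tool in tools}
```

A harness would call `make_tools("some/bucket")` once and dispatch the model's tool calls by name through the returned dict. Only the two writing tools are sandboxed here, which matches the wording of the list but is still a guess about the actual setup.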

                                it was pretty useless in understanding my language projects. i asked it to write a tutorial for nudl and despite seeing several examples, it used tokens from C++ and python.

                                the future - today!

                                #s0up

                                stompyrobot@mastodon.gamedev.place
                                wrote last edited by
                                #38

                                @lritter Gemma is a very small model.
                                Did you try asking opus to write a tutorial in the same repository?
                                And, because it's computers, then ask it to verify and correct itself?
                                (That's currently the state of the art in how to get useful stuff out. Why can't it do it automatically? IDK!)
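
The "generate, then ask it to verify and correct itself" loop described above could be sketched roughly like this. It is only an illustration: `ask` stands for any function that sends a prompt to a model and returns its reply, and the prompt wording, the literal `OK` convention and the round limit are made-up assumptions, not something from this thread or from any particular tool.

```python
from typing import Callable


def generate_with_review(ask: Callable[[str], str], task: str, max_rounds: int = 3) -> str:
    """Draft an answer, then let the model review and rewrite it up to max_rounds times."""
    draft = ask(task)
    for _ in range(max_rounds):
        review = ask(
            "Review the answer to the task below. Reply with exactly OK if it is "
            "correct, otherwise list the errors.\n\n"
            f"Task: {task}\n\nAnswer:\n{draft}"
        )
        if review.strip() == "OK":
            break  # the reviewer pass found nothing to fix
        draft = ask(
            f"Task: {task}\n\nPrevious answer:\n{draft}\n\n"
            f"Reviewer notes:\n{review}\n\nRewrite the answer, fixing these errors."
        )
    return draft
```

Note that the reviewer is the same model as the writer, so this only helps to the extent that the review prompt surfaces errors the first prompt did not.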

                                  lritter@mastodon.gamedev.place
                                  wrote last edited by
                                  #39

                                  @StompyRobot but you see the problem in asking a politician to investigate their own dealings yes?

                                    stompyrobot@mastodon.gamedev.place
                                    wrote last edited by
                                    #40

                                    @lritter
                                    Models aren't conscious, don't have volition, and aren't trained to have self preservation behavior. They are surprisingly OK at diagnosing their own output when given specific instructions!

                                    Programming them is a whole new way of thinking, but they *can* be made into a useful part of a useful system.

                                    As you note, we're still very much in a "batteries not included" early stage, despite boosters claiming it's all done.

                                      lritter@mastodon.gamedev.place
                                      wrote last edited by
                                      #41

                                      @StompyRobot did you just lazily outsource your rebuttal to the machine? 😉

                                      you know what i mean. if the machine makes mistakes generating, it will make mistakes verifying (whose output is also generation)

                                        stompyrobot@mastodon.gamedev.place
                                        wrote last edited by
                                        #42

                                        @lritter
                                        What I'm saying is that that's not at all as certain as with people.
                                        Or, to put another way, the prompt is a hash function into one of billions of possible programs stored in the model, and you'll get different bugs with a different prompt.
                                        Getting the same model to work on the same problem in three different ways absolutely increases the rate of correctness, especially if you make a "best two of three" kind of setup.
                                        It's really quite counterintuitive that it should work!
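
A "best two of three" setup of this kind can be sketched as a plain majority vote. This is a minimal sketch under assumptions: `ask` is a hypothetical function that sends one prompt to the model and returns its answer, and answers are compared by exact string equality, which only makes sense for short answers such as a number or a name. For generated code one would more likely compare test results instead.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def best_two_of_three(ask: Callable[[str], str], prompts: Sequence[str]) -> Optional[str]:
    """Ask the same question phrased in different ways and keep the majority answer.

    Returns None when no answer is given by more than half of the prompts.
    """
    answers = [ask(prompt).strip() for prompt in prompts]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes * 2 > len(answers) else None
```

With three prompts the condition reduces to "at least two answers agree"; a `None` result is the signal to fall back to a human or to another strategy.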

                                          lritter@mastodon.gamedev.place
                                          wrote last edited by
                                          #43

                                          @StompyRobot and this is supposed to be good?
