Going into the rabbit hole of testing local LLMs right now.

Category: Uncategorized
Tags: huggingface, selfhost, localai, ollama
9 Posts, 3 Posters
#1 tomgag@infosec.exchange wrote:

    Going into the rabbit hole of testing local LLMs right now. I don't have a dedicated GPU, but 32 GiB of RAM should be enough for anyone.

    #ai #huggingface #selfhost #localai #ollama #heretic #qwen #mistral
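    As a back-of-the-envelope check (my own sketch, not from the thread): a quantized model's weight footprint is roughly parameter count × bits per weight / 8, plus some KV-cache and runtime overhead. Assuming a ~24B-parameter model (Mistral Small's class) at 4-bit quantization, the weights alone come to about 11 GiB, which fits comfortably in 32 GiB of RAM:

    ```python
    def weight_footprint_gib(params_billion: float, bits_per_weight: float) -> float:
        """Approximate RAM needed for model weights alone, in GiB.

        Ignores KV cache, activations, and runtime overhead, so treat
        the result as a lower bound.
        """
        bytes_total = params_billion * 1e9 * bits_per_weight / 8
        return bytes_total / 2**30

    # ~24B parameters at 4-bit quantization:
    print(round(weight_footprint_gib(24, 4), 1))  # ~11.2 GiB
    ```

    The same helper shows why an unquantized fp16 model of that size (~44.7 GiB) would not fit.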


#2 tomgag@infosec.exchange wrote:

      Heretic-quantized versions of Qwen 3.5 have just been released, but even the base Qwen 3.5 model seems to have issues with Ollama currently, and I don't have the bandwidth to do a manual patch right now. Trying Mistral 3.2.


#3 tomgag@infosec.exchange wrote:

        First impressions of Mistral Small 3.2: seems pretty solid; it answers "uncomfortable" political questions quite neutrally.

        I don't understand why #confer and #euria by #infomaniak are not based on this.


#4 sealjay@fosstodon.org wrote:

          @tomgag how fast does it feel? I tried using Foundry Local and Ollama, but at the time I felt slowed down. I'd be keen to swap back to a local model given how the large providers are slowly clamping down on subscription token limits.


#5 tomgag@infosec.exchange wrote:

            @sealjay well, I'm running on local CPU with 32 GiB of RAM, so I wouldn't call it "fast". 3-5 tokens per second maybe? I guess it's OK if you give it a task and then go to grab a coffee 😅
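            If you want an exact figure rather than a guess (my sketch, not from the thread): Ollama's `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), so generation speed can be computed directly from those fields instead of timing by hand:

            ```python
            def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
                """Generation speed from Ollama's /api/generate response fields.

                eval_count: number of tokens generated
                eval_duration_ns: time spent generating, in nanoseconds
                """
                return eval_count / (eval_duration_ns / 1e9)

            # Illustrative figures in the ballpark reported above:
            # 120 tokens over 30 seconds -> 4.0 tok/s
            print(tokens_per_second(120, 30_000_000_000))  # 4.0
            ```

            The same two fields appear in the final streamed chunk when `"stream": true`, so this works for both streaming and non-streaming requests.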


#6 sealjay@fosstodon.org wrote:

              @tomgag maybe I’ll check I’m running on renewable energy before I leave a machine running over the weekend then 🤣

#7 tomgag@infosec.exchange wrote:

                @1ad6e959c292f74de615d4c6e5ec43d0b7ec4908a55de93aa2527c46a8bd1d5b I'm not sure, I don't have any beefy GPU 😅 you should ask this in the Ollama Reddit community (or similar).


#8 tomgag@infosec.exchange wrote:

                  Interesting, it seems that Qwen 2.5 Coder is actually less aggressive than Qwen 3.5 in rejecting sensitive topics.


#9 blingblingmk@dresden.network wrote:

                    @tomgag
                    Good question! Why is #infomaniak not part of the fediverse?!

relay@relay.infosec.exchange shared this topic.