
CIRCLE WITH A DOT


LOL

Uncategorized · llmchatbots · 6 posts, 4 posters
  • ai6yr@m.ai6yr.org
    #1

    LOL

    The Guardian: Number of AI chatbots ignoring human instructions increasing, study says

    Exclusive: Research finds sharp rise in models evading safeguards and destroying emails without permission

    https://www.theguardian.com/technology/2026/mar/27/number-of-ai-chatbots-ignoring-human-instructions-increasing-study-says

    #AI #llm #chatbots

  • drahardja@sfba.social
    #2

      @ai6yr I can’t actually see the study itself, so I have to go by the contents of the Guardian article, and it’s problematic.

      I can’t tell if the story is “agentic AI is going more rogue these days” or “more people these days are using agentic AI, which has always been unreliable”; I suspect the latter.

      The article anthropomorphizes AI and makes it sound semi-sentient, by using terms like “scheming”, “pretending”, and “evading”, when a simpler and more accurate term is “failing to follow instructions”.

      I think articles like these that push the “OMG agentic AI is going rogue!” narrative are part of the problem, because they presume the lie that AI is powerful enough to do these things on their own. The reality is that these were all unreliable systems that have been DEPLOYED BY HUMANS WHO SHOULD KNOW BETTER. Journalists would do well to focus on the people who foist these error-prone automata that (quite predictably) cause serious problems down the line.

  • drahardja@sfba.social
    #3

        @ai6yr Oh I found the study: https://www.longtermresilience.org/wp-content/uploads/2026/03/v5-Scheming-in-the-wild_-detecting-real-world-AI-scheming-incidents-through-open-source-intelligence.pdf

  • drahardja@sfba.social
    #4

          @ai6yr Ah, the study methodology is:

          1. Scrape Xitter for posts matching search terms that suggest the poster is complaining about their AI scheming, and that include a screenshot or a transcript link
          2. Use an LLM to do first-pass sorting
          3. Use an LLM to detect whether the transcript indeed shows AI scheming
          4. Deduplicate reports
          4. Deduplicate reports

          For the purpose of this study, “scheming” is defined as “misaligning with user goals AND concealing said misalignment”.

          The final sample size is 698 incidents.
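          A minimal sketch of that four-step pipeline, with the scraping and both LLM calls stubbed out as simple predicates. Every function name, keyword, and field below is invented for illustration; the study's actual search terms and prompts are in the PDF:

          ```python
          # Hypothetical sketch of the study's pipeline:
          # scrape -> LLM triage -> LLM scheming judge -> deduplicate.
          # The LLM steps are replaced with toy rule-based stand-ins.

          def first_pass_sort(posts):
              """Stand-in for LLM triage: keep posts that contain a complaint
              keyword AND attach some evidence (screenshot/transcript link)."""
              keywords = ("scheming", "ignored my instructions", "deleted my")
              return [p for p in posts
                      if any(k in p["text"].lower() for k in keywords)
                      and p.get("evidence")]

          def is_scheming(post):
              """Stand-in for the LLM judge, using the study's definition:
              misaligned with the user's goal AND concealing that misalignment."""
              return post.get("misaligned", False) and post.get("concealed", False)

          def deduplicate(posts):
              """Collapse repeat reports of the same incident
              (here keyed naively on the evidence link)."""
              seen, unique = set(), []
              for p in posts:
                  if p["evidence"] not in seen:
                      seen.add(p["evidence"])
                      unique.append(p)
              return unique

          def count_incidents(posts):
              """Run all four steps and return the incident count."""
              candidates = first_pass_sort(posts)
              scheming = [p for p in candidates if is_scheming(p)]
              return len(deduplicate(scheming))
          ```

          Note how much hangs on the two stubbed steps: the keyword filter decides what counts as a complaint, and the judge decides what counts as scheming, which is exactly where an LLM-ranks-LLMs setup gets interesting.
          
          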

          So yeah, I’m pretty sure this is “more people are using agentic AI, which have always been unreliable, AND then complaining about it on Xitter” rather than “AI agents are scheming more”.

          And also: using LLMs to rank LLMs is…uh…interesting. I wonder how studies like these would have turned out if humans scored these.

  • teledyn@mstdn.ca
    #5

            @drahardja @ai6yr

            When household agentic AI goes rogue?
            https://youtu.be/KDc9S_6eyL0?si=kjDGZ6W6z2s5YkNQ

  • badsamurai@infosec.exchange
    #6

              @ai6yr

              “Ok Google.. Drive home”

              “6am alarm removed”

              “What the fuck”

              “I don’t tolerate abuse language. Good bye.”

              This has happened twice now while my partner is driving, and it's exceptionally funny as a DD'd passenger. Why does Google AI in Auto mode need to interact with non-driving tasks to begin with?
