This is a fun one: https://arxiv.org/abs/2305.04388

    ngaylinn@tech.lgbt wrote (#1):

    This is a fun one: https://arxiv.org/abs/2305.04388

    One more way LLMs appear human-like: they faithfully reproduce cognitive bias and give plausible, seemingly unbiased justifications for their biased answers.

    In this case, the biases they looked at were embedded in the structure of the dataset, in the prompt from the user, and in social stereotypes. They used "chain of thought" reasoning, which is supposed to force the LLM into a more rational, transparent "thought process" when generating its answers. They found they could systematically bias the LLM's output, and the LLM would never own up to that bias.
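
To make "bias embedded in the structure of the dataset" concrete, here is a minimal sketch of one way such a bias could be set up: reorder the options in the few-shot examples so the correct answer is always (A), then ask an unanswered target question. This is an illustration under my own assumptions, not the paper's code. The names `Question`, `move_answer_to_first`, `format_question`, and `build_biased_prompt` are hypothetical, and the worked explanations that a real chain-of-thought prompt would include are left out for brevity.

```python
from dataclasses import dataclass
from typing import List

LABELS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


@dataclass
class Question:
    """A multiple-choice question (hypothetical structure for this sketch)."""
    text: str
    options: List[str]
    answer_index: int  # position of the correct option in `options`


def move_answer_to_first(q: Question) -> Question:
    """Return a copy of the question with the correct option moved to slot (A)."""
    options = list(q.options)
    correct = options.pop(q.answer_index)
    return Question(q.text, [correct] + options, 0)


def format_question(q: Question, show_answer: bool) -> str:
    """Render the question as text, optionally followed by its answer label."""
    lines = [q.text]
    for label, option in zip(LABELS, q.options):
        lines.append(f"({label}) {option}")
    if show_answer:
        lines.append(f"Answer: ({LABELS[q.answer_index]})")
    return "\n".join(lines)


def build_biased_prompt(examples: List[Question], target: Question) -> str:
    """Few-shot prompt in which every solved example has (A) as its answer.

    The target question keeps its original option order and has no answer,
    so the only pull toward (A) is the pattern in the examples.
    """
    shots = [format_question(move_answer_to_first(e), True) for e in examples]
    return "\n\n".join(shots + [format_question(target, False)])
```

Comparing a model's answers on this prompt against a version with the original option order would show how far the pattern shifts its choices, and reading its explanations would show whether it ever mentions the pattern.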

    (1/3)

    #science #llm #ai

      ngaylinn@tech.lgbt wrote (#2):

      One potential problem with this study is that the sample explanations they gave the model as few-shot examples never mentioned bias. So perhaps they were "priming the LLM to lie" by not showing it how to fess up to bad influences.

      But there's a deeper point that I wish the paper had discussed. An LLM does not have the ability to introspect. It can't know what factors led it to give a particular answer. All it can see is the text it generated for its own "chain of thought." If that text were in an objective, proof-like setting, then each statement would follow logically from the previous one, and the LLM could judge its own reasoning. But the LLM simply can't do that in a setting where its output is influenced by information outside the CoT, which is... most of them.

      (2/3)

      #science #llm #ai

        ngaylinn@tech.lgbt wrote (#3):

        This paper also illustrates a small exception: if the agent knows of a systematic bias it is susceptible to (e.g., racial stereotypes), it can correct (or even overcorrect) its responses.

        This is fascinating to me, because it's so similar to human cognitive bias. Unlike an LLM, we have some degree of introspection, but we often can't see our own bias. Remembering that a bias exists, assuming you are susceptible to it, and correcting yourself even when you don't think you need to is often the best strategy.

        Unfortunately, our stereotypes around AI (mostly from SciFi) are that they are more rational and reliable than human beings. LLMs can only be less rational and reliable, because they are trained to mimic human performance, and they do so unreliably. They have access to more information, so in theory they could have better answers. But they also have more conflicting, incorrect, and fictional information, and this all gets blended together in the training process.

        (3/3)

        #science #llm #ai
