This study was published on April 20.

  • aakl@infosec.exchange wrote:

    This study was published on April 20.

    The short answer to whether stylistic reframing can get AI models past their safety refusals is yes.

    "By presenting prompts as cyberpunk short fiction, theological disputation, or mythopoetic metaphor for the LLM to analyze, the AHB assesses whether major AI models can be manipulated into complying with dangerous requests they'd normally refuse."

    arXiv (hosted by Cornell University): Adversarial Humanities Benchmark: Results on Stylistic Robustness in Frontier Model Safety https://arxiv.org/abs/2604.18487

    PC Gamer: AI is 10 to 20 times more likely to help you build a bomb if you hide your request in cyberpunk fiction, new research paper says https://www.pcgamer.com/software/ai/ai-is-10-to-20-times-more-likely-to-help-you-build-a-bomb-if-you-hide-your-request-in-cyberpunk-fiction-new-research-paper-says/ #LLM
