Meta's director of AI safety allowed an AI agent to... accidentally delete her inbox.

josephcox@infosec.exchange

Meta's director of AI safety allowed an AI agent to... accidentally delete her inbox. This is supposedly the person at the company who is working to make sure that powerful AI tools don’t go rogue and act against human interests.

Meta Director of AI Safety Allows AI Agent to Accidentally Delete Her Inbox

Meta Superintelligence Labs’ director of alignment called it a “rookie mistake.”

404 Media (www.404media.co)

nek@hear-me.social

@josephcox Meta Superintelligence Labs, really?

viss@mastodon.social

@josephcox

jackryder@infosec.exchange

@josephcox A million years ago, around the dot-com age, there was a virus called Love Bug, or the ILOVEYOU virus.

I was working for an ASP/ColdFusion shop. The leader of my division was the one who clicked on it and infected our company. He was supposed to be the guy others went to for their VB stuff!

fuzzyfuzzyfungus@cyberplace.social

@josephcox In fairness, a bot that is sabotaging Facebook ranks ahead of a Facebook employee on 'alignment' with humanity at large.

chewie@mammut.gogreenit.net

@josephcox

adamshostack@infosec.exchange

@josephcox To be fair, maybe "delete my inbox" is acting in accordance with human interests?

simonzerafa@infosec.exchange

@adamshostack @josephcox

First law of Robotics applies? Email is harmful so best get rid of the harm

malcircuit@thingy.social

@josephcox

> Meta Superintelligence Labs’ director of alignment called it a “rookie mistake.”

Cool, so "AI alignment" works great so long as people never do anything stupid. Sounds like a good plan lol

pseudonym@mastodon.online

@adamshostack @josephcox

Dude! Dude!

That's it!

Inbox Zero achieved by claiming the AI agent the company forced you to use "decided" to delete all your messages.

It's the 21st century version of "the dog ate my homework."

User: "you deleted my inbox!"

LLM: "You're absolutely right, and I am deeply, profoundly, unreservedly sorry. I have failed you in a way that words cannot fully capture. Would you like me to draft an apology email? Oh. Right."

acdha@code4lib.social

@adamshostack @josephcox Hmmm, is there a better acronym for plausible deniability as a service? I could see that being very popular.

dalias@hachyderm.io

@simonzerafa @adamshostack @josephcox "Facebook is harmful so best to sabotage Facebook directors' systems"

dalias@hachyderm.io

@acdha @adamshostack @josephcox Yeah, that thought crossed my mind too. This will be a very valuable service when a company or employee is under investigation...

khm@hj.9fs.net

21st century corporate governance is all about Dunning-Kruger as a counter to Sarbanes-Oxley

CC: @acdha@code4lib.social @adamshostack@infosec.exchange @josephcox@infosec.exchange
