CIRCLE WITH A DOT

apth@infosec.exchange

RE: https://hachyderm.io/@evacide/116557815822147145

This is excellent

apth@infosec.exchange

@Viss here's a thing I don't understand very well. Anthropic's own safeguards are "ask the LLM not to do something", but we know asking LLMs not to do something isn't a guarantee they will not do that thing (deleted emails, deleted production databases, etc).

Isn't that fundamentally kind of... fucked? Like the burden is then on users to make the system safe with controls external to the LLM because the vendor can't make it safe themselves?

apth@infosec.exchange

@8ofpentacles I don't understand most of those words but feel a little tiny push to google some of them
