There used to be a time when building out a botnet required *some* work – writing exploits, taking over devices, obscuring the purpose of the executable, etc.

wakame@tech.lgbt

I guess you have to ask really, really nicely, to counteract the other instruction. Or simply add a "system reminder".

From a great and very enjoyable thread (for certain subcategories of "enjoyable"):

jonny (good kind) (@jonny@neuromatch.social)

Attached: 3 images i love this. there's a mechanism to slip secret messages to the LLM that it is told to interpret as system messages. there is no validation around these of any kind on the client, and there doesn't seem to be any differentiation about location or where these things happen, so that seems like a nice prompt injection vector. this is how claude code reminds the LLM to not do a malware, and it's applied by just string concatenation. i can't find any place that gets stripped aside from when displaying output. it actually looks like all the system reminders get catted together before being send to the API. neat!

neurospace.live (neuromatch.social)

rysiek@mstdn.social

@wakame @GreatBigTable ah yes, I've seen that in fact

greatbigtable@mastodon.social

@rysiek @wakame yeah. That one. So Anthropic's clutching of pearls over this happening is performative at best. They knew that this is possible because it is baked directly into the code. "You want to bypass these safe guards? Just say these magic words."

rysiek@mstdn.social

@GreatBigTable @wakame indeed, somehow I missed that initially. Thanks!

sloanlance@mastodon.social

@rysiek
If I were ever interested in experimenting with that kind of thing (I'm not), I would only do it in a virtual machine. To do otherwise is foolish.

purple@tech.lgbt

@rysiek wow, they are so casual about authentication just not existing, I mean wow

rysiek@mstdn.social

@sloanlance I really want to center OpenClaw's irresponsibility and negligence here though. They are actively promoting this to regular, non-techie people. And then when shit happens they blame the victim.

marcink@stolat.town

@rysiek But between this being openclaw and the insufferably LLM-ish tone of the blog post (pictured below) we can at least rest assured that there is a chance that no human being had to be involved in writing, editing, or reviewing these.

rysiek@mstdn.social

@marcink what a fantastic scene in that film.

marcink@stolat.town

@rysiek If there is any silver lining to this LLM bubble is that it will provide way more than enough material for a sequel.

fds@mastodon.social

@rysiek it’s a shame we still act like people are doing great things when they publish stuff like this.

rysiek@mstdn.social

@fds

(assuming "stuff like this" is OpenClaw, not the openClawCVEs repo)

CIRCLE WITH A DOT

There used to be a time when building out a botnet required some work – writing exploits, taking over devices, obscuring the purpose of the executable, etc.

jonny (good kind) (@jonny@neuromatch.social)

jonny (good kind) (@jonny@neuromatch.social)