There's a lot of discourse on Twitter about people using LLMs to solve CTF challenges.
-
@Alib234 @lina AIs are better than humans will ever be at chess and this was the case 20 years ago.
We ban AIs in chess.
It's a pain to detect but it's incredibly important for the integrity of the game.
And it's about communicating norms and values too: "we don't want AI" is an incredibly different set of values than "we want AI in only xyz ways"
-
@nightwolf Yeah, I'm thinking mostly Jeopardy, which is the style I'm most familiar with. It just sucks to see that competition format completely break. I used to write a lot of challenges for that.
@lina Agreed. It will be interesting to see the next few years since Jeopardy format has been the most popular and easiest to implement.
-
And honestly, reading the Claude output, it's just ridiculous. It clearly has no idea what it's doing and it's just pattern-matching. Once it found the flag, it spent seven pages of reasoning and four more scripts trying to verify it, and never figured out what went wrong. After all that wasted time, it just concluded that sometimes it gets the right answer and sometimes the wrong one, so the thing that looks like a flag is probably the flag. It can't debug its own code to find out what actually happened; it just decided to brute force the problem again a different way.
It's just a pattern-matching machine. But it turns out if you brute force pattern-match enough times in enough steps inside a reasoning loop, you eventually stumble upon the answer, even if you have no idea how.
Humans can "wing it" and pattern-match too, but it's a gamble. If you pattern-match wrong and go down the wrong path, you just wasted a bunch of time and someone else wins. Competitive CTFs are all about walking the line between going as fast as possible and being very careful so you don't have to revisit, debug, and redo a bunch of your work. LLMs completely screw that up by brute forcing the process faster than humans.
This sucks.
@lina maybe it would work to take a page from my areas of expertise, locks and psychology. Make trap flags that lead AIs into false solutions that humans can identify and step out of, but that the AI thinks is the right way forward.
Update: I tried about a dozen decoy flags, and ChatGPT was surprisingly good at picking out the correct one. The only ones where it failed were when the flag decoded into what looked like a valid flag, but it was an instruction to enter something else.
Like:
- CTF_3NT3RTH1SFL4GBKWDS
- CTF_F0110WD1R3CT10NS
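For anyone who wants to try this on their own challenges, here's a minimal sketch of how an instruction-style decoy flag could be generated. The leet substitution map, the `CTF_` prefix, and the `decoy_flag` helper are all assumptions reverse-engineered from the second example above, not the author's actual tooling (the first example uses a slightly different, looser encoding that keeps the L in "FL4G"):

```python
# Hypothetical sketch: turn a plain-English instruction into a
# plausible-looking leet-encoded flag, matching the style of
# CTF_F0110WD1R3CT10NS above. Substitution map is an assumption.
LEET = str.maketrans({"E": "3", "I": "1", "A": "4", "O": "0", "L": "1"})

def decoy_flag(instruction: str) -> str:
    """Encode an instruction as a decoy flag: uppercase, strip spaces, leetify."""
    body = instruction.upper().replace(" ", "").translate(LEET)
    return f"CTF_{body}"

print(decoy_flag("follow directions"))  # CTF_F0110WD1R3CT10NS
```

The point of the trick is that a human who decodes the leetspeak reads the instruction and follows it, while a model that only pattern-matches "this string has the flag format" submits it verbatim.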