There's a lot of discourse on Twitter about people using LLMs to solve CTF challenges.

nathan@mastodon.e4b4.eu

@lina Ah I didn't consider that there would be a culture of hiding tools/methods. Yeah that's definitely incompatible with a post-LLM world.

This is a general trend with GenAI: the only way to earn legitimacy is either in person, or by publicizing the creative process. For a while already visual/music artists have had to either rely on their existing credibility, or share their creative process to establish their art's legitimacy. New anonymous art has sadly been made nearly worthless.

natty@astolfo.social

@lina@vt.social To be fair I'd argue this is strictly a people problem

I feel like this is the inherent nature of competition in places where cooperation would make much more sense

And this issue permeates so many areas that the world is more preoccupied with catching the people cheating the system instead of going "hey maybe this system could incentivize actually getting invested into the thing instead of being a pure so-called meritocracy"

shansterable@ohai.social

@lina
CTF = Capture the Flag, in case that helps anyone besides me

I try to do for initialisms and acronyms what alt text does for images.

Wikipedia: In computer security, Capture the Flag (CTF) is an exercise in which participants attempt to find text strings, called "flags", which are secretly hidden in purposefully vulnerable programs or websites

doragasu@mastodon.sdf.org

@lina I wonder if you can still design a challenge to be "LLM unfriendly" by changing the wording, just like those papers showing how an LLM aces problems like "river crossing", but if you change wording a bit, they just fail in weird and spectacular ways.

lina@vt.social

@doragasu Possibly? I might try removing all "hints" from one and trying again and seeing if it's any different. But that also affects human solvers... the hints are there to point you towards a website that explains the fundamentals of what's going on. The LLM didn't even read that, it just guessed from a filename and a comment and hulk smashed its way to guessing the general concept right with multiple attempts...

doragasu@mastodon.sdf.org

@lina In those papers trying to confuse LLMs, what was very effective IIRC was adding data you don't need to the problem statement. The LLM tried to use all the data you gave it to solve the problem, and failed. Just like when a child is solving maths problems from a textbook: all the problems look similar, so the child internalizes that you have to add two numbers and divide by the third one. Then you change the problem and the child fails because they apply the same "formula".

doragasu@mastodon.sdf.org

@lina Like in here: https://arxiv.org/abs/2305.04388

doragasu@mastodon.sdf.org

@lina Or better this one: https://arxiv.org/abs/2410.05229
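
A minimal sketch of the "add data you don't need" idea described above, assuming a problem is stored as a list of fact sentences plus a question. The function name `add_distractor`, the example problem, and the distractor sentences are all hypothetical; none of this is taken from the linked papers.

```python
import random


def add_distractor(facts, question, distractors, seed=0):
    """Return the problem text with one irrelevant sentence mixed into the facts.

    facts:       list of sentences the solver actually needs
    question:    the final question sentence
    distractors: list of sentences that are true-sounding but irrelevant
    seed:        makes the choice reproducible
    """
    rng = random.Random(seed)
    extra = rng.choice(distractors)            # pick one irrelevant sentence
    position = rng.randrange(len(facts) + 1)   # anywhere before the question
    padded = facts[:position] + [extra] + facts[position:]
    return " ".join(padded + [question])


if __name__ == "__main__":
    # Hypothetical example problem, made up for illustration.
    facts = [
        "Mia buys 3 notebooks for 2 euros each.",
        "She also buys a pen for 1 euro.",
    ]
    question = "How much does Mia spend in total?"
    distractors = [
        "The shop opened in 1998.",
        "Two of the notebooks have a blue cover.",
    ]
    print(add_distractor(facts, question, distractors, seed=1))
```

The correct answer is unchanged by the extra sentence, so a human and a model can be compared on the original and padded versions of the same problem.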

lina@vt.social

@grishka FYI your instance seems to have a very old display name cached for me (that it is using for mentions) ^^;;

lina@vt.social

@nathan I don't think there's necessarily a culture of hiding methods outright (though some of the more competitive teams might), but more like people build their own personal stash of scripts and things to build off of, and don't necessarily just outright post it on GitHub or whatever.

So like, "fucky stuff with QR codes" having shown up in CTF challenges more than once, I have a personal "do low level analysis and extended recovery of damaged QR codes" script.

lina@vt.social

@abacabadabacaba The thing is the solution isn't "the code". The solution is the process. You can have an LLM "solve" it for you, then rewrite the process and cheat that way. Yes the solution will often involve some bespoke scripts and tooling, but that's just part of it. The "aha moments", that you can't provide proof of.

grishka@friends.grishka.me

Hoshino Lina (星乃リナ) 🩵 3D Yuri Wedding 2026!!!, yeah I only automatically reload actors when I receive activities from them and more than 24 hours have passed since the previous reload. Now that you've sent me a reply, it did trigger that. Maybe I should do the same when fetching things like a post that someone boosted.

lina@vt.social

@grishka Yeah I think that name was possibly a year+ old ^^;;

sonic2k@oldbytes.space

@lina

AI is fast eradicating any learning activity.
In my current job, learning anything new is actively discouraged.

As was said to us "they only care about numbers on a dashboard".

I got to the position I am in, at the level I am at, by being curious and very interested in taking things apart and figuring out how they work.

An LLM, which in the eyes of a CEO means he can get rid of people like me, is the end of the road. We are all doomed.

lina@vt.social

@natty But the whole point of a for-fun(/prize) competition is to use the gamification to motivate people... that's kind of what games are?

You don't strictly need it, you can publish challenges to be solved for no points and no prize... but that demonstrably does not get as many people interested. Between people for whom that works, and the "I just want to win" people who would use LLMs, there are people who would be motivated to compete but not just self-study, and you lose those when the LLM cheaters come in.

lina@vt.social

@ahasty But at least a calculator is always right. I have no problem with people using tools that can be understood and are reliable/engineered.

The problem is LLMs are not that. They cannot be understood, they are black boxes that just brute force their way through things. So they are particularly and uniquely toxic in the harm they cause, compared to the tools we've had until now as part of the general industrial/technology revolution.

arclight@oldbytes.space

@shansterable @lina I had to look it up. The next most popular definition of CTF was Children's Tumor Foundation...

ahasty@techhub.social

@lina yes, they are a black box. If used as a way to help educate yourself there is value. When used as a means to an end, you kill the pipeline of problem solving. Unfortunately the unwavering force of capitalism is almost always short-sighted.

curtmack@floss.social

@lina that's the worst part IMO. We get Claude through work and, all environmental and ethical issues aside, I just hate using it. Curating mounds of garbage output from the Screw It Up Faster Machine sucks. But it looks *great* in artificial evaluations with a concrete, machine-verifiable goal. And too many managers don't understand that real world programming isn't just passing a succession of concrete, machine-verifiable goals.

jce@infosec.exchange

@lina Already in 2022, at the "European Cyber Cup" CTF, at least one of the top 3 teams had ChatGPT open before even checking what some of the challenges were about 🫠
