I’m seeing a lot of denial and logical fallacies on Mastodon about LLM capability to find security bugs.
-
I get it that when folks have concluded that LLMs are harmful, they want to believe that LLMs fail at everything. But a list of correctly-identified bad things about LLMs does not logically imply that LLMs can’t find security bugs.
-
And, yes, the Anthropic Mythos post fits a previously-seen pattern of “AI” companies marketing by danger, but saying that it’s marketing does not refute what generally available models can already do.
And people act like their own conjecture is more informative than what people from multiple projects that deal with security bug reports say. See e.g. https://mastodon.social/@bagder/116363034479757682 .
-
Then there’s the dismissal that, yes, LLMs now find security bugs, but the bugs could have been found by other methods. But evidently defenders hadn’t actually found them by other methods. (Unknown what attackers had already found.)
Or folks find it objectionable that the new capability has been made available to attackers and that the proposed cure is to pay for access to the same LLM. But that objection does not make the capability any less real.
-
Or folks go LOL at security incidents or code quality at an LLM company. That’s irrelevant to whether their model can find security bugs. The way this works is that you have a non-LLM oracle, such as ASAN (AddressSanitizer). If the model found a way to trigger the oracle, then it’s not really productive to argue that it didn’t.
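For concreteness, a minimal sketch of what “triggering the oracle” means; parse() is a hypothetical buggy function, but the oracle behavior is exactly what AddressSanitizer does:

    // Build with: clang++ -fsanitize=address -g oracle_demo.cpp
    // AddressSanitizer is the non-LLM oracle: if any input makes parse()
    // write past the buffer, ASAN aborts with a stack-buffer-overflow
    // report. Whether a human, a fuzzer, or an LLM produced that input
    // has no bearing on the validity of the finding.
    #include <cstring>
    #include <string>

    void parse(const std::string& input) {
        char buf[16];
        // Bug: no length check before copying attacker-controlled data.
        std::memcpy(buf, input.data(), input.size());
    }

    int main(int argc, char** argv) {
        // Feed the candidate input (e.g. one proposed by an LLM) as argv[1].
        if (argc > 1) parse(argv[1]);
        return 0;
    }

Any argument longer than 16 bytes produces an ASAN report on the spot; the report stands regardless of who or what crafted the input.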
Why even post this considering the predictable hate? Because denial about the situation does not make users safer from attacks.
-
@hsivonen Well. When those companies have touted and pushed their AI thingies at a thousand things they're unsuited for, that kinda sets the expectations.
Most of us are just so bloody fucken tired of hearing AI AI AI AI everywhere. You tune it out or go crazy. And so even the one thing it might actually be good at gets missed because folks are no longer listening. It's all so fantastically stupid.
-
@hsivonen If you haven't run it on your own code, you're missing out. Once you do that, it's hard to argue against it.
-
@hsivonen Isn't fuzzing a numbers game, though? LLMs are fuzzers backed by billions; they'll absolutely find something, but so would anything else given the same resources and no restraint on how to spend them, no matter how wasteful.
-
@gabrielesvelto @hsivonen Not really. Some bugs are truly hard to find with fuzzing and are more easily identified by spotting a code smell and tracing it back to user-controlled input. Reading and remembering code is limited by brain power / willpower. As sad as it is: LLMs scale better here.
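A contrived sketch of that point (all names made up): the bug below is obvious to anyone reading the code, but a mutation-based fuzzer has to stumble on one exact 64-bit magic value before the vulnerable path even runs:

    #include <cstdint>
    #include <cstring>

    // Obvious on inspection: attacker-controlled length copied into a
    // fixed buffer. Nearly invisible to random mutation: the path is
    // gated behind an exact 64-bit magic number (~1 in 2^64 per try,
    // unless the fuzzer can solve the comparison).
    void handle_message(const uint8_t* data, size_t size) {
        if (size < 12) return;
        uint64_t magic;
        std::memcpy(&magic, data, 8);
        if (magic != 0x4f42534355524521ULL) return;  // the rare gate
        uint32_t len;
        std::memcpy(&len, data + 8, 4);
        char buf[64];
        if (len <= size - 12) {
            std::memcpy(buf, data + 12, len);  // len never checked against sizeof(buf)
        }
    }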
-
@freddy @gabrielesvelto Also, it looks to me like fuzzing requires more human setup: deciding which part of the code to fuzz and how to deal with stuff like checksums. Reportedly, LLMs can deal with less specific harnesses and figure out on their own how to fill in checksums. A sketch of what that human setup looks like is below.
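To illustrate the setup difference, a sketch of a conventional libFuzzer harness; parse_packet and the CRC-at-the-end packet layout are hypothetical stand-ins for real code under test:

    // Build with: clang++ -fsanitize=fuzzer,address checksum_harness.cpp -lz
    #include <cstdint>
    #include <cstring>
    #include <zlib.h>  // crc32()

    // Stub target: a packet whose last 4 bytes must be the CRC-32 of the
    // payload, else it is rejected before any interesting parsing runs.
    static void parse_packet(const uint8_t* data, size_t size) {
        if (size < 4) return;
        uint32_t expected;
        std::memcpy(&expected, data + size - 4, 4);
        if ((uint32_t)crc32(0L, data, size - 4) != expected) return;
        // ... real parsing (and any bugs) would live here ...
    }

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
        if (size < 4 || size > 4096) return 0;
        // The human-written part: choose parse_packet as the entry point
        // and fix up the trailing checksum so the fuzzer's mutated inputs
        // survive validation and reach the code behind the crc32 check.
        uint8_t buf[4096];
        std::memcpy(buf, data, size);
        uint32_t crc = (uint32_t)crc32(0L, buf, size - 4);
        std::memcpy(buf + size - 4, &crc, 4);
        parse_packet(buf, size);
        return 0;
    }

A human had to choose the entry point and know about the checksum; the reported LLM workflows skip much of that per-target scaffolding.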
-
@gabrielesvelto @hsivonen Yep, this is still largely subsidized by cheap inference and essentially free training (for the consumer). I wouldn't bet on it staying this cheap.
-
@hsivonen “Quick, get the torches and pitchforks!
Someone suggested that LLMs could in some way be useful.”