#Mythos finds a #curl vulnerability
-
My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.
@bagder This suggests a fun exercise for someone interested in messing around with LLMs:
1. Put back all the curl security issues previously found by LLM tools by dropping the fix commits from history or otherwise obfuscating the revert.
2. Feed the re-vulnerabilized repo to a selection of models and see what are the cheapest ones (by memory, time and/or monetary cost) that can find, say, 50%/75%/100% of the issues found by the warehouse-scale "foundation models".
Feels like a large part of the current results should be doable with significantly smaller resources, because being trained on every tweet and reddit post and libgen book ever is not obviously related to the task.
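A rough sketch of how the scoring half of that exercise could look, assuming you already have, for each model, the set of re-introduced issues it reported and what the run cost. Everything here is hypothetical: the model labels, issue IDs, and dollar figures are made-up placeholders and not measurements of any real model, and `ModelRun`, `recall`, and `cheapest_at` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRun:
    name: str          # hypothetical model label
    cost_usd: float    # total cost of scanning the repo (placeholder value)
    found: frozenset   # IDs of re-introduced issues this model reported


def recall(run, known_issues):
    """Fraction of the known, re-introduced issues that this run reported."""
    return len(run.found & known_issues) / len(known_issues)


def cheapest_at(runs, known_issues, threshold):
    """Cheapest run whose recall meets the threshold, or None if none does."""
    good_enough = [r for r in runs if recall(r, known_issues) >= threshold]
    return min(good_enough, key=lambda r: r.cost_usd, default=None)


# Placeholder data only: four re-introduced issues and three imaginary runs.
known = frozenset({"issue-1", "issue-2", "issue-3", "issue-4"})
runs = [
    ModelRun("small-local", 5.0, frozenset({"issue-1", "issue-2"})),
    ModelRun("mid-hosted", 80.0, frozenset({"issue-1", "issue-2", "issue-3"})),
    ModelRun("warehouse-scale", 5000.0, known),
]

for threshold in (0.5, 0.75, 1.0):
    best = cheapest_at(runs, known, threshold)
    print(f"{threshold:.0%}: {best.name if best else 'none'}")
```

Ranking by cost at a fixed recall threshold keeps the comparison about "what is the least you can spend to find this much", which is the question posed above. Swapping `cost_usd` for memory or wall-clock time would work the same way.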
-
#Mythos finds a #curl vulnerability
yes, as in singular one.
Mythos finds a curl vulnerability
yes, as in singular one. Back in April 2026 Anthropic caused a lot of media noise when they concluded that their new AI model Mythos is dangerously good at finding security flaws in source code. Apparently Mythos was so good at this that Anthropic would not release this model to the public yet but instead … Continue reading Mythos finds a curl vulnerability →
daniel.haxx.se (daniel.haxx.se)
@bagder great, so even the Linux Foundation are naming things after the ultimate evil of a famous franchise? (Final Fantasy in this instance.)
-
@bagder “On average, every single production source code line of curl has been written (and then rewritten) 4.14 times.”
curl is the ship of Theseus not once, not twice, but four times

-
@bagder How do you explain that Mythos found 271 bugs in Firefox, and counting, but only 1 in cURL? Is the Firefox code base 271 times larger?
-
@bagder from my talks with people who had been given access to mythos in their org, they say it does find things which current tools miss, but also overlooks cases which current tools catch. so, yeah, to me it is "mostly marketing" combined with general FUD
@km As far as I can tell:
- No one who has worked with raw Mythos output has ever written about it.
- No one who has written about it has ever used it.
They would much rather have @bagder writing about it because his opinion carries weight. That means he can’t have direct access. To give him access, they’d demand to gag him with an NDA, like everyone else who has access.
This technique of making readers mentally fill in the gaps between what is verifiable and what is claimed is genius marketing and really dishonest. But we have come to expect systematic and casual dishonesty from these companies.
-
-
@km Yeah. I didn’t mean it personally. I wasn’t criticising what you said, I’m sorry if I sounded that way.
I was just pointing out this constant theme. The only thing that is ever made public is the fully polished, human-vetted final result. They carefully hide all other details, and the press don’t care.
-
-
@das_robin @oots @bagder maybe @firefoxnightly can comment on that
-
My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.
@bagder this
-
@bagder it's all marketing. And any improvements are completely moot, as the actual *costs* to find that single bug were in the tens of thousands of dollars minimum. That's the MINIMUM known cost.
It would not surprise me if finding that one bug cost $75k, $100k, $200k of compute time. It's a pile of shit, hilariously inefficient slop that sometimes behaves as a fuzzer that occasionally finds a crumb.
-
@bagder
At least it works. It would have been quite a disaster if it found zero.
@alterelefant@mastodontech.de @bagder@mastodon.social Are you a machine?
Classifying finding a single vulnerability (1) as success and 0 as failure sure seems like it
The world is not black and white and the usefulness of LLMs for finding vulnerabilities IMO isn't either
-
@bagder This suggests a fun exercise for someone interested in messing around with LLMs:
1. Put back all the curl security issues previously found by LLM tools by dropping the fix commits from history or otherwise obfuscating the revert.
2. Feed the re-vulnerabilized repo to a selection of models and see what are the cheapest ones (by memory, time and/or monetary cost) that can find, say, 50%/75%/100% of the issues found by the warehouse-scale "foundation models".
Feels like a large part of the current results should be doable with significantly smaller resources, because being trained on every tweet and reddit post and libgen book ever is not obviously related to the task.
llm tools found security issues in curl? doubt
-
@normis Normis, you do know that this is the author of curl, right?
-
@das_robin @oots @bagder there was this blog post dismissing lots of the myth https://www.flyingpenguin.com/the-boy-that-cried-mythos-verification-is-collapsing-trust-in-anthropic/
-
@bagder This closely matches the experience Homebrew has also had with Mythos. Also one vulnerability found and in our case it was a pretty irrelevant one.
-
@das_robin @bagder
Yes, #Firefox is probably a few orders of magnitude more complex than #curl and definitely much bigger. Still, the blog post explicitly mentions "In addition to fixing the 271 bugs identified by Claude Mythos Preview in the 150 release, we’ve shipped more of these fixes in 149.0.2, 150.0.1, and 150.0.2.", so >270 attributed to #Mythos *alone*.