I'm curious to know what people think about Anthropic's claim that Claude found 500 high-severity vulnerabilities in open-source packages. Has anyone confirmed that these vulns were indeed high-severity and hadn't been discovered before? Is this development as big a deal as Anthropic says? Any other critiques?
@dangoodin 500 is such a nice, round number. Very much like a number a human would pick at random. That alone makes it rather suspect.
-
@dangoodin it would help if they included things like CVE numbers, GitHub pull requests to fix the issues, etc. There are some specific examples in the post, but they include no information to actually find the vulns and/or validate what they're claiming.
-
@dangoodin zero question it's pure fantasy bullshit. They refuse to show their work, as usual. All they've got is a middling CGIF vulnerability that isn't, and claiming credit for "finding" a vulnerability in Ghostscript because "hey this commit did a thing so they must have had a vulnerability!"
-
@dangoodin if "this commit changed a thing to fix a bug" is the metric, well fuck, I've found over 100,000 'vulnerabilities' in the past year.
-
@GossiTheDog @dangoodin
This looks like the first one.
-
@dangoodin Daniel Stenberg mentioned at FOSDEM 2026 that a fully covered test suite is the wall none of the "AI" could climb. I guess npm may provide even more vulnerable packages.
-
@GossiTheDog @dangoodin
For #3 there are a bunch of recent commits to the lzw code. These really seem like bugs that existing scanners should have found, especially the strcat use (#2).
-
@dangoodin I said it elsewhere, but what's missing in my view is the false positive rate. Ok, it found 500. Did it flag 500? 5,000? 5,000,000? That's an important data point.
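To put rough numbers on that: a minimal sketch of how precision (confirmed findings divided by everything flagged) would shift across those three scenarios. The 500 is the figure from the claim; the flagged totals are the hypothetical ones floated above, not anything Anthropic has reported, and the function name is made up for illustration.

```python
def precision(confirmed: int, flagged: int) -> float:
    """Share of flagged candidates that turned out to be real findings."""
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    if confirmed < 0 or confirmed > flagged:
        raise ValueError("confirmed must be between 0 and flagged")
    return confirmed / flagged


CONFIRMED = 500  # figure from the claim under discussion

# Hypothetical flagged totals; the real number has not been published.
for flagged in (500, 5_000, 5_000_000):
    rate = precision(CONFIRMED, flagged)
    print(f"flagged={flagged:>9,}  precision={rate:.4%}")
```

The same 500 reads very differently depending on the denominator, which is why the missing figure matters for anyone who has to triage the reports.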
-
That's not what Anthropic said. Anthropic said the vulns were high-severity.
-
@dangoodin that is EXACTLY what Anthropic said. LITERALLY it is the FIRST "vulnerability" they bogusly claim to have found.
> Neither of these methods yielded any significant findings. Eventually, however, Claude took a different approach: reading the Git commit history. Claude quickly found a security-relevant commit, and commented:
-
Thanks for all the responses. So far, projects I understand to have received reports include: Ghostscript, OpenSC, lzw, and CGIF. Are others known? Links to commits that fix the vulns also appreciated.
-
Right, but the post doesn't say merely that the reports of the 500 vulns resulted in commits. It says all 500 were high-severity. If true, that would be significant, no?
-
@dangoodin to which I said "hang the fuck on" and read a bit more. And hey look, it's in fonts... bounds checking...
Snyk Vulnerability Database | Snyk
Medium severity (7.8) Out-of-bounds Read in ghostscript-tools-fonts | CVE-2024-46956
Learn more about centos:10 with Snyk Open Source Vulnerability Database (security.snyk.io)
-
CVSS is 7.8, which is high, no? That would seem to support Anthropic's claim. What's the significance of the vulns being in fonts . . . bounds checking?
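For reference, a small sketch of the CVSS v3.x qualitative rating scale as published by FIRST (None 0.0, Low 0.1 to 3.9, Medium 4.0 to 6.9, High 7.0 to 8.9, Critical 9.0 to 10.0). The function name is made up for illustration. Note that the Snyk card above labels the same 7.8 as "Medium", so vendor labels do not always follow this scale.

```python
def cvss_v3_rating(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Score quoted in the thread for CVE-2024-46956.
print(cvss_v3_rating(7.8))
```

The rating only describes severity. It says nothing about who found the bug or when, which is a separate question.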
-
(on the flip side, curl ending their bug bounty program because of the flood of slop reports)
-
@cerement @dangoodin Exactly what I was going to point out.
-
@dangoodin the significance is that by their own words, they didn't discover shit. Check the date on that CVE. But they're trying to claim dishonestly that their magical almost-to-AGI stochastic parrot totally discovered it.
It did not. Period.
-
I'm not arguing with you. Sorry if it sounds like I am. I don't have the same technical background you do and am asking how the 7.8-severity vuln shouldn't be considered high severity because it involves fonts . . . bounds checking? I'm asking you to explain the reasoning behind your assessment as if I was a student in a security 101 class.
-
@dangoodin the tl;dr is basically that they are making the completely bogus claim that they 'discovered' a vulnerability, because they found the commit, which was specifically to fix the already disclosed vulnerability.
This is as insane as claiming to have shockingly discovered someone has a dog after they texted you pictures of them holding a puppy, asked you for name suggestions, set up IG and YT accounts for the puppy you subscribe to, and you hosted a puppy party at your house.