@bms48 @downey

bsdphk@fosstodon.org

I have not been impressed by Coverity in Varnish/Vinyl context.

lapo@f.lapo.it

@downey @david_chisnall @bms48 I have the same fear. On the other hand, Firefox 150 apparently fixes 251 bugs, and I wonder how they did it.

For a hardened target, just one such bug would have been red-alert in 2025, and so many at once makes you stop to wonder whether it’s even possible to keep up.

david_chisnall@infosec.exchange

@bsdphk @bms48 @downey

I did some triaging of libc reports in FreeBSD from Coverity about ten years ago. The false positive rate was very high, but lower than I’ve seen for Claude.

We use the clang analyser in CI for CHERIoT RTOS (and our clang has a growing number of CHERI-specific analyses). It isn’t as good as Coverity in general but it has found real bugs in my code prior to merging PRs.

It’s much easier to use for a new project than an established one. Each time we turn on a new analysis there is a period of checking each report and adding comments to silence it if it’s a false positive, but it’s fairly short. A project that’s already millions of lines of code, going from nothing to all of the analyses, just has a huge pile of things to wade through.
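
As a rough illustration of silencing one false positive with a comment: a minimal sketch, assuming the analyzer checks are run through clang-tidy, which honours `// NOLINT(...)` comments. The function `sum` and the choice of check are hypothetical and not taken from CHERIoT RTOS.

```cpp
#include <cstddef>

// Hypothetical helper: sums the first n elements of values.
// Callers guarantee that values is non-null whenever n > 0.
int sum(const int *values, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Assumed false positive: the analyzer cannot see the caller-side
        // guarantee, so the report is silenced on this line only.
        total += values[i]; // NOLINT(clang-analyzer-core.NullDereference)
    }
    return total;
}
```

Keeping the suppression on a single line, with the reason written next to it, means the next person triaging can see why the report was dismissed.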

lapo@f.lapo.it

@downey @david_chisnall @bms48

a code review of one of my projects done with Claude 4.6 (which, apparently, is as good as Mythos at finding bugs but less good at producing PoC exploits)

There is a huge difference there, though: a pipeline producing actual PoC exploits implicitly filters out all reports that are not actionable, so it produces far fewer false positives (if any at all, depending on the internal validation done on the exploits).
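
The filtering effect can be sketched roughly as follows. This is only an illustration under stated assumptions: `Report`, `run_poc` and `keep_reproducible` are hypothetical names, and nothing here reflects how any real pipeline validates its exploits.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical report: a description plus an optional PoC callback that
// returns true when the exploit actually reproduces against the target.
struct Report {
    std::string description;
    std::function<bool()> run_poc; // empty when no PoC was produced
};

// Keep only reports whose PoC exists and reproduces; everything else is
// dropped, which is where the implicit false-positive filtering comes from.
std::vector<Report> keep_reproducible(const std::vector<Report> &reports) {
    std::vector<Report> kept;
    for (const auto &r : reports) {
        if (r.run_poc && r.run_poc()) {
            kept.push_back(r);
        }
    }
    return kept;
}
```

How few false positives survive depends entirely on how strict `run_poc` is, which matches the caveat about internal validation.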

bms48@mastodon.social

@david_chisnall @bsdphk @downey clang-cfi made it onto my C++ tools list yesterday AM when digging further already on hardware capabilities enforcement SoTA

david_chisnall@infosec.exchange

@lapo @downey @bms48

It’s not clear how many of those were serious and how much they were triaged before being handed to the Firefox team.

To put that number in perspective, there was a paper on FFI bugs a few years ago that found around 300 vulnerabilities in Chromium’s DOM to JavaScript bindings. That’s a single bug class in a single subsystem of a browser. Chromium has since moved to more machine-generated code for this boundary and eliminated most of that bug class by construction.

bms48@mastodon.social

@david_chisnall @lapo @downey The claim persists that many of the bugs Mythos reportedly found in Firefox required taking down the sandbox

david_chisnall@infosec.exchange

@lapo @downey @bms48

Yes, that’s kind-of true, but context matters. The reports I saw were for a library, so the context is callers of the library. One report, for example, was in a function that is called by compiler-generated code. It would crash if passed a null pointer, but the compiler will never pass it a null pointer. A fuzzing harness could easily generate a test case that passed it a null pointer. Adding a null check there would have impacted performance of the hottest code path in the entire library.
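
A minimal sketch of that trade-off, assuming a hypothetical `Object` type and hypothetical function names (the actual library and function are not named here): the hot-path function documents its precondition and checks it only in debug builds, while a separate checked entry point serves untrusted callers such as a fuzzing harness.

```cpp
#include <cassert>

// Hypothetical object type, for illustration only.
struct Object {
    int refcount;
};

// Hot path. Precondition: obj is non-null. The assert documents this and
// compiles away when NDEBUG is defined, so release builds pay no cost.
inline void retain_fast(Object *obj) noexcept {
    assert(obj != nullptr && "precondition: callers never pass null");
    ++obj->refcount;
}

// Checked entry point for callers that cannot guarantee the precondition.
// Returns false instead of crashing when given a null pointer.
inline bool retain_checked(Object *obj) noexcept {
    if (obj == nullptr) {
        return false;
    }
    retain_fast(obj);
    return true;
}
```

A harness that calls `retain_fast` directly with null is testing a state the real callers never produce, which is why such a report can be technically correct and still not actionable.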

The Firefox reports that Anthropic made public weren’t in Firefox; they were in SpiderMonkey running in a test harness. How many bugs were reachable by the test harness but not by Firefox? Especially since SpiderMonkey runs in the sandboxed child process in Firefox, which the threat model assumes to be compromised.

bms48@mastodon.social

@david_chisnall @lapo @downey Looks like we just synchronously converged on the same latter point... as for the former I'm looking at defensive use of modern C++ nullptr in concepts, contracts and other mechanisms. I still have the last High Integrity C++ draft to review but it is looking rather dated just now.

david_chisnall@infosec.exchange

@bms48 @lapo @downey

Be careful of contracts. They leave so much implementation defined (with soundness issues in the presence of multiple compilation units) that they are unsuitable for anything security related. We have explicitly banned their use in CHERIoT RTOS.

bms48@mastodon.social

@david_chisnall @lapo @downey Sweedack. Modules still strike me as knitting yoghurt. Kenton Varda's standing advice to eliminate singletons on sight (for capability reasons) is very sound. https://kentonshouse.com/singletons

bms48@mastodon.social

@david_chisnall @lapo @downey @ludicity "Sweedack" is an oblique reference to John Brunner's seminal science fiction novel "The Shockwave Rider", which reads very differently now. It clearly inspired someone (swr) who worked at Entercept on their form of Domain & Type Enforcement (DTE), just before I interviewed there in 2001 as the dot-com crash was about to happen. At the time I'd had the mess of eTrust to deal with inside JPMorganChase as 3rd line security, adding net promisc logs to Solaris.

bms48@mastodon.social

@david_chisnall @lapo @downey @ludicity That would be the individual who assumed swr as his nom-de-plume: https://phrack.org/issues/56/4
