@bms48 @downey

14 Posts 4 Posters 0 Views
  • bsdphk@fosstodon.org wrote:

    @david_chisnall @bms48 @downey

    I have not been impressed by Coverity in the Varnish/Vinyl context.

    david_chisnall@infosec.exchange wrote (#4):

    @bsdphk @bms48 @downey

    I did some triaging of libc reports in FreeBSD from Coverity about ten years ago. The false positive rate was very high, but lower than I’ve seen for Claude.

    We use the clang analyser in CI for CHERIoT RTOS (and our clang has a growing number of CHERI-specific analyses). It isn’t as good as Coverity in general but it has found real bugs in my code prior to merging PRs.

    It’s much easier to use for a new project than an established one. Each time we turn on a new analysis there is a period of checking each report and adding comments to silence it if it’s a false positive, but it’s fairly short. A project that’s already millions of lines of code, going from nothing to all of the analyses, just has a huge pile of things to wade through.
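One way such a silencing comment can look, as a minimal sketch. This assumes the analyser is run through clang-tidy, which exposes the static analyser checks under the `clang-analyzer-*` prefix and honours `NOLINTNEXTLINE` comments. The thread does not say which mechanism CHERIoT RTOS actually uses, and `first_or_zero` is a hypothetical helper invented for illustration.

```cpp
#include <cstddef>

// Hypothetical helper, used only to show where a suppression comment goes.
// Assumed contract: `values` is non-null whenever `count` is greater than 0.
inline int first_or_zero(const int *values, std::size_t count)
{
    if (count == 0)
    {
        return 0;
    }
    // If the analyser reported a possible null dereference here and triage
    // showed it to be a false positive, the report can be silenced for this
    // one line, with the reason recorded next to the suppression.
    // NOLINTNEXTLINE(clang-analyzer-core.NullDereference)
    return values[0];
}
```

Keeping the justification beside the suppression means the next reader can re-check the reasoning without repeating the triage.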

    • david_chisnall@infosec.exchange wrote:

      @bms48 @downey

      The thing that’s hidden when projects get reports from Anthropic is how much human triage is needed.

      I had someone send me a code review of one of my projects done with Claude 4.6 (which, apparently, is as good as Mythos at finding bugs but less good at producing PoC exploits). Of the top ten bugs, most were not bugs (e.g. missing null checks on things where the API contract requires non-null arguments). Two were intentional design choices and the proposed changes would have made things slower. One was a bug that needed fixing, but there was already an open PR to fix it before Claude looked at the project.

      The signal-to-noise ratio is worse than Coverity's, and FreeBSD hasn’t had the resources to triage / fix all of the issues the free Coverity scans found in 15 or so years of having access to it.

      lapo@f.lapo.it wrote (#5):

      @downey @david_chisnall @bms48

      a code review of one of my projects done with Claude 4.6 (which, apparently, is as good as Mythos at finding bugs but less good at producing PoC exploits)


      There is a huge difference there, though: a pipeline producing actual PoC exploits implicitly filters out all reports that are not actionable, so it produces far fewer false positives (if any at all, depending on the internal validation done on the exploits).

        bms48@mastodon.social wrote (#6):

        @david_chisnall @bsdphk @downey clang-cfi made it onto my C++ tools list yesterday morning, while I was digging further into the state of the art in hardware capability enforcement.

        • lapo@f.lapo.it wrote:

          @downey @david_chisnall @bms48 I have the same fear. On the other hand, Firefox 150 apparently fixes 251 bugs, and I wonder how they did it.

          For a hardened target, just one such bug would have been red-alert in 2025, and so many at once makes you stop to wonder whether it’s even possible to keep up.
          david_chisnall@infosec.exchange wrote (#7):

          @lapo @downey @bms48

          It’s not clear how many of those were serious and how much they were triaged before being handed to the Firefox team.

          To put that number in perspective, there was a paper on FFI bugs a few years ago that found around 300 vulnerabilities in Chromium’s DOM-to-JavaScript bindings. That’s a single bug class in a single subsystem of a browser. Chromium has since moved to more machine-generated code for this boundary and eliminated most of that bug class by construction.

            bms48@mastodon.social wrote (#8):

            @david_chisnall @lapo @downey The claim persists that many of the bugs Mythos reportedly found in Firefox required taking down the sandbox.

              david_chisnall@infosec.exchange wrote (#9):

              @lapo @downey @bms48

              Yes, that’s kind of true, but context matters. The reports I saw were for a library, so the context is callers of the library. One report, for example, was in a function that is called by compiler-generated code. It would crash if passed a null pointer, but the compiler will never pass it a null pointer. A fuzzing harness could easily generate a test case that passed it a null pointer. Adding a null check there would have hurt the performance of the hottest code path in the entire library.
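A minimal sketch of the trade-off described above, assuming a hypothetical hot-path function `retain` (not taken from the library being discussed). The non-null requirement is stated as an API contract and checked only in debug builds via `assert`, so a release build compiled with `NDEBUG` has no extra branch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hot-path helper. API contract: `refcount` must not be null.
// Callers (for example compiler-generated code) are responsible for that.
inline void retain(std::size_t *refcount)
{
    // Debug-only check: compiled out when NDEBUG is defined, so the
    // release build pays nothing for it on this path.
    assert(refcount != nullptr && "contract violation: null refcount");
    ++*refcount;
}
```

A fuzzer or reviewer that ignores the contract can still pass a null pointer and crash this function, and that is the kind of report which needs human triage to dismiss.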

              The Firefox reports that Anthropic made public weren’t in Firefox; they were in SpiderMonkey running in a test harness. How many bugs were reachable by the test harness but not by Firefox? That matters especially because SpiderMonkey runs in the sandboxed child process in Firefox, which the threat model assumes to be compromised.

                bms48@mastodon.social wrote (#10):

                @david_chisnall @lapo @downey Looks like we just synchronously converged on the same latter point... as for the former I'm looking at defensive use of modern C++ nullptr in concepts, contracts and other mechanisms. I still have the last High Integrity C++ draft to review but it is looking rather dated just now.

                  david_chisnall@infosec.exchange wrote (#11):

                  @bms48 @lapo @downey

                  Be careful of contracts. They leave so much implementation-defined (with soundness issues in the presence of multiple compilation units) that they are unsuitable for anything security-related. We have explicitly banned their use in CHERIoT RTOS.

                    bms48@mastodon.social wrote (#12):

                    @david_chisnall @lapo @downey Sweedack. Modules still strike me as knitting yoghurt. Kenton Varda's standing advice to eliminate singletons on sight (for capability reasons) is very sound. https://kentonshouse.com/singletons

                      bms48@mastodon.social wrote (#13):

                      @david_chisnall @lapo @downey @ludicity "Sweedack" is an oblique reference to John Brunner's seminal science fiction novel "The Shockwave Rider" which reads very differently in the now, and clearly inspired someone (swr) who worked at Entercept on their form of Domain & Type Enforcement (DTE), just before I interviewed there in 2001 as the dot-com crash was about to happen, when I'd had the mess of eTrust to deal with inside JPMorganChase as 3rd line security, adding net promisc logs to Solaris.

                        bms48@mastodon.social wrote (#14):

                        @david_chisnall @lapo @downey @ludicity That would be the individual who assumed swr as his nom-de-plume: https://phrack.org/issues/56/4
