Hey cybersecurity geeks-- so it seems like anthropic now has really good exploit detection ability.

Uncategorized · 19 posts, 16 posters
zachweinersmith@mastodon.social
Hey cybersecurity geeks-- so it seems like anthropic now has really good exploit detection ability. Do you think this makes offense or defense harder? Like, seems like everyone might have to go through a battery of automated checks before deploying stuff into the world.

fivetonsflax@tilde.zone (#9)

@ZachWeinersmith All we've seen so far is a press release. With "AI" those should be taken cum grano salis.

max@smeap.com (#10)

@ZachWeinersmith I think it is going to make defense harder in the long run. A lot of these automated checks existed before, but they were well known to be hindsight-oriented (they won't find truly novel things, just the simplest variations of past mistakes) and imprecise (they make mistakes in both directions). These new ones give more of an illusion of novelty and precision. Devs will rely on them as their only investigation, not as one tool in a full stable.

max@smeap.com (#11)

@ZachWeinersmith But these new tools are also going to be just expensive enough that not only are devs going to want to rely solely on them, business people are going to want to empty the rest of the stables, including the jobs of the people who are good at finding and solving the novel problems.

max@smeap.com (#12)

@ZachWeinersmith Meanwhile, in open source, this is just going to further tax, and then kill, bug bounty programs, under a flood of reports from amateurs willing to pay for these LLM tools in the hopes of easy cash. PRs get harder to prioritize without real people investing time in them. Any PRs that get ignored hard enough become free attack write-ups for the less scrupulous.

orb2069@mastodon.online (#13)

@ZachWeinersmith

Statistical models are inherently unreliable, but "...remember we have only to be lucky once; you will have to be lucky always."(*)

But Anthropic will happily sell you all the attempts you want, to see if it can find the bug somebody else will use to rock your shit before they do the same. Better buy some tokens!

(* The IRA reminding Thatcher of a fundamental advantage of asymmetrical warfare: https://en.wikipedia.org/wiki/Brighton_hotel_bombing )

schmidt_fu@mstdn.social (#14)

@ZachWeinersmith
First of all, it's going to make Anthropic richer, because both sides will use it more.
But hear me out: if their detection ability really were so good, why did the #ClaudeCodeLeak immediately result in several high-profile vulnerabilities being found? E.g.
https://phoenix.security/claude-code-leak-to-vulnerability-three-cves-in-claude-code-cli-and-the-chain-that-connects-them/
#InsecureAI #Infosec #ClaudeCode

david_chisnall@infosec.exchange (#15)

@ZachWeinersmith

The original Coverity paper found over 300 bugs, most of which had security implications. Static analysis has been great at finding exploitable vulnerabilities for a long time. This is a new approach to doing static analysis.

The biggest problem is always the false positive rate. If you run a tool and it finds a load of vulnerabilities, that's great. Except you run the same tool and it also finds a load of things that look like vulnerabilities, but aren't. So now you have to triage them, and that takes effort. You also need to add annotations to silence the ones that aren't real. With deterministic analysers, you can often provide some extra information (e.g. parameter attributes) that allows it to be tracked across an analysis boundary. CBMC has a lot of these. But with a probabilistic tool, these may or may not work. So you're left with just slapping on an annotation that says 'ignore the warning here'. The bug I found a little while ago in some MISRA C code was of that form: their analyser had found it, someone had determined it was not a bug, and they were wrong.

For a defender, if you spend too much time looking at and discounting false positives, you could have improved code quality more by spending that effort on something else. I've only looked at a few of the bugs Claude reported, but one was a missing bounds check that wasn't actually a vulnerability because the bounds were checked in the caller. Its fix made things slower, but not less exploitable. A good static analyser would have a way of annotating the function parameter to say 'this is always at least n bytes' and would then check that callers did this check. Claude has nothing like this because it doesn't actually have a model of how code executes; it just has a set of probabilities for what exploitable code looks like. Unfortunately (and this is one of the problems with C), correct and vulnerable code can look exactly the same with different call stacks.

The second problem is the asymmetry. To be secure, you need to investigate and fix all of the vulnerabilities that tools can find. For an attacker, you just need one vulnerability. The ROI for attackers is much higher. Imagine a tool with a 90% false positive rate that finds 1,000 vulnerability-shaped objects. An attacker who triages 6-7 of them has around a 50% chance of finding an attack that they can use. A defender who does the same amount of work has a 50% chance of reducing the number of vulnerabilities discoverable by attackers using this or similar tools by 1%.

This is why I build things that deterministically prevent classes of vulnerabilities from being exploitable.

paulf@mastodon.bsd.cafe (#16)

@david_chisnall @ZachWeinersmith

"The bug I found a little while ago in some MISRA C code was of that form: their analyser had found it, someone had determined it was not a bug, and they were wrong."

It's not just static analysis. Valgrind memcheck has a low false positive rate. For some reason many people seem to believe that if their program does not crash every time on their machine then it must be infallibly and absolutely correct. They might then report a "bug" or seek confirmation of the "false positive" that they have found.

david_chisnall@infosec.exchange (#17)

@paulf @ZachWeinersmith

I didn't talk about dynamic analysis, but it has a bunch of different tradeoffs.

In general, dynamic analysis (valgrind, sanitisers, and so on) has a very low false positive rate because every code path that it sees really is a code path that is reachable in a program run. At the same time, it also has a higher false negative rate. Most security vulnerabilities come from a case where an attacker provides some unusual input. Dynamic analysis tools will often only ever see the behaviour of the program with expected (not necessarily correct) inputs.

Combining this with fuzzing (provide a load of different unexpected inputs, with some feedback to try to find corner cases in execution) works nicely, but also hits combinatorial problems. Even if you have 100% line coverage, some bugs manifest only if lines are hit in a specific order, or only if two threads do the same thing at the same time. These approaches can never tell you that bugs aren't present, only that they are.

TL;DR: Dynamic analysis can be sound but not complete. Static analysis can be complete but not sound.

malwareminigun@infosec.exchange (#18)

@david_chisnall I just squint and say "Gödel's incompleteness theorems" whenever thinking about these.

trademark@fosstodon.org (#19)

@ZachWeinersmith Depends; it will favour defense in the case of high-value targets with a well-resourced, competent security team. E.g. Apple defending their iPhones should be able to address issues better than today. Targets which today are secure only because nobody has yet gotten around to looking are going to be in trouble...

relay@relay.infosec.exchange shared this topic