Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (Cyborg)
  • No Skin
Collapse
Brand Logo

CIRCLE WITH A DOT

  1. Home
  2. Uncategorized
  3. Claude code source "leaks" in a mapfile

Claude code source "leaks" in a mapfile

Scheduled Pinned Locked Moved Uncategorized
43 Posts 4 Posters 0 Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • jonny@neuromatch.socialJ jonny@neuromatch.social

    If you are reading an image and near your estimated token limit, first try to compressImageBufferWithTokenLimit, then if that fails with any kind of error, try and use sharp directly and resize it to 400x400, cropping. finally, fuck it, just throw the buffer at the API.

    of course compressImageBufferWithTokenLimit is also compression with sharp, and is also a series of fallback operations. We start by trying to detect the image encoding that we so painstakingly learned from... the file extension... but if we can't fuck it that shit is a jpeg now.

    then, even if it's fine and we don't need to do anything, we still re-compress it (wait, no even though it's named createCompressedImageResult, it does nothing). Otherwise, we yolo our way through another layer of fallbacks, progressive resizing, palletized PNGs, back to JPEG again, and then on to "ultra compressed JPEG" which is... incredibly... exactly the same as the top-level in-place code in the parent function

    while two of the legs return a createImageReponse, the first leg returns a compressedImageResponse but then unpacks that back into an object literal that's almost exactly the same except we call it type instead of mediaType.

    jonny@neuromatch.socialJ This user is from outside of this forum
    jonny@neuromatch.socialJ This user is from outside of this forum
    jonny@neuromatch.social
    wrote last edited by
    #23

    for those keeping score at home, we have the opportunity to re-compress the same image nine times

    jonny@neuromatch.socialJ 1 Reply Last reply
    0
    • jonny@neuromatch.socialJ jonny@neuromatch.social

      for those keeping score at home, we have the opportunity to re-compress the same image nine times

      jonny@neuromatch.socialJ This user is from outside of this forum
      jonny@neuromatch.socialJ This user is from outside of this forum
      jonny@neuromatch.social
      wrote last edited by
      #24

      holy shit there's another entire fallback tree before this one, that's actually an astounding twenty two times it's possible to compress an image across nine independent conditional legs of code in a single api call. i can't even screenshot this, the spaghetti is too powerful

      jonny@neuromatch.socialJ 1 Reply Last reply
      1
      0
      • jonny@neuromatch.socialJ jonny@neuromatch.social

        holy shit there's another entire fallback tree before this one, that's actually an astounding twenty two times it's possible to compress an image across nine independent conditional legs of code in a single api call. i can't even screenshot this, the spaghetti is too powerful

        jonny@neuromatch.socialJ This user is from outside of this forum
        jonny@neuromatch.socialJ This user is from outside of this forum
        jonny@neuromatch.social
        wrote last edited by
        #25

        here, if i fold all the return blocks and decrease my font size as small as it goes i can fit all the compression invocations in the first of three top-level compression fallback trees in a single screenshot, but since it is so small i just have to circle them in red like it's a football diagram.

        this function is named "maybeResizeAndDownsampleImageBuffer" and boy that is a hell of a maybe!

        jonny@neuromatch.socialJ 1 Reply Last reply
        1
        0
        • jonny@neuromatch.socialJ jonny@neuromatch.social

          here, if i fold all the return blocks and decrease my font size as small as it goes i can fit all the compression invocations in the first of three top-level compression fallback trees in a single screenshot, but since it is so small i just have to circle them in red like it's a football diagram.

          this function is named "maybeResizeAndDownsampleImageBuffer" and boy that is a hell of a maybe!

          jonny@neuromatch.socialJ This user is from outside of this forum
          jonny@neuromatch.socialJ This user is from outside of this forum
          jonny@neuromatch.social
          wrote last edited by
          #26

          and what if i told you that if it passes a page range to its pdf reader, it first extracts those pages to separate images and then calls this function in a loop on each of the pages. so you have the privilege of compressing n_pages images n_pages * 13 times.

          this function is used 13 times: in the file reader, in the mcp result handler, in the bash tool, and in the clipboard handler - each of which has their entire own surrounding image handling routines that are each hundreds of lines of similar but still very different fallback code to do exactly the same thing.

          so that's where all the five hundred thousand lines come from - fallback conditions and then more fallback conditions to compensate for the variable output of all the other fallback conditions. thirteen butts pooping, back and forth, forever.

          Link Preview Image
          jonny@neuromatch.socialJ 1 Reply Last reply
          1
          0
          • jonny@neuromatch.socialJ jonny@neuromatch.social

            and what if i told you that if it passes a page range to its pdf reader, it first extracts those pages to separate images and then calls this function in a loop on each of the pages. so you have the privilege of compressing n_pages images n_pages * 13 times.

            this function is used 13 times: in the file reader, in the mcp result handler, in the bash tool, and in the clipboard handler - each of which has their entire own surrounding image handling routines that are each hundreds of lines of similar but still very different fallback code to do exactly the same thing.

            so that's where all the five hundred thousand lines come from - fallback conditions and then more fallback conditions to compensate for the variable output of all the other fallback conditions. thirteen butts pooping, back and forth, forever.

            Link Preview Image
            jonny@neuromatch.socialJ This user is from outside of this forum
            jonny@neuromatch.socialJ This user is from outside of this forum
            jonny@neuromatch.social
            wrote last edited by
            #27

            there is a callback feature "file read listeners" which is only called if the file type is a text document, gated for anthropic employees only, such that whenever a text file is read (any part of any text file, which often happens in a rapid series with subranges when it does 'explore' mode, rather than just like grepping), another subagent running sonnet is spun off to update a "magic doc" markdown file that summarizes the file that's read - that's one "magic doc" per file, not one magic doc.

            I have yet to get into the tool/agent graph situation in earnest, but keep in mind that this is an entirely single-use and completely different means of spawning a graph of subagents off a given tool call than is used anywhere else.

            Spoiler alert for what i'm gonna check out next is that claude code has no fucking tool calling execution model it just calls whatever the fuck it wants wherever the fuck it wants. Tools are or less a convenient fiction. I have only read one completely (file read) and skimmed a dozen more but they essentially share nothing in common except for a humongous list of often-single-use params and the return type of "any object with a single key and whatever else"

            i'm in hell. this is hell.

            jonny@neuromatch.socialJ 1 Reply Last reply
            0
            • jonny@neuromatch.socialJ jonny@neuromatch.social

              there is a callback feature "file read listeners" which is only called if the file type is a text document, gated for anthropic employees only, such that whenever a text file is read (any part of any text file, which often happens in a rapid series with subranges when it does 'explore' mode, rather than just like grepping), another subagent running sonnet is spun off to update a "magic doc" markdown file that summarizes the file that's read - that's one "magic doc" per file, not one magic doc.

              I have yet to get into the tool/agent graph situation in earnest, but keep in mind that this is an entirely single-use and completely different means of spawning a graph of subagents off a given tool call than is used anywhere else.

              Spoiler alert for what i'm gonna check out next is that claude code has no fucking tool calling execution model it just calls whatever the fuck it wants wherever the fuck it wants. Tools are or less a convenient fiction. I have only read one completely (file read) and skimmed a dozen more but they essentially share nothing in common except for a humongous list of often-single-use params and the return type of "any object with a single key and whatever else"

              i'm in hell. this is hell.

              jonny@neuromatch.socialJ This user is from outside of this forum
              jonny@neuromatch.socialJ This user is from outside of this forum
              jonny@neuromatch.social
              wrote last edited by
              #28

              i have been writing a graph processing library for about a year now and if i was a fucking AI grifter here is where i would plug it as like "actually a graph processor library" and "could do all of what claude code does without fucking being the worst nightmare on ice money can buy."

              I say that not as self promo, but as a way of saying how in the FUCK do you FUCK UP graph processing this badly. these people make like tens of times more money than i do but their work is just tamping down a volley of dessicated backpacking poops into muskets and then free firing it into the fucking economy

              jonny@neuromatch.socialJ 1 Reply Last reply
              0
              • jonny@neuromatch.socialJ jonny@neuromatch.social

                i have been writing a graph processing library for about a year now and if i was a fucking AI grifter here is where i would plug it as like "actually a graph processor library" and "could do all of what claude code does without fucking being the worst nightmare on ice money can buy."

                I say that not as self promo, but as a way of saying how in the FUCK do you FUCK UP graph processing this badly. these people make like tens of times more money than i do but their work is just tamping down a volley of dessicated backpacking poops into muskets and then free firing it into the fucking economy

                jonny@neuromatch.socialJ This user is from outside of this forum
                jonny@neuromatch.socialJ This user is from outside of this forum
                jonny@neuromatch.social
                wrote last edited by
                #29

                you can TELL that this technology REALLY WORKS by how the people that made it and presumably know how to use it the best out of everyone CANT EVEN USE IT TO EDIT A FUCKING FILE RELIABLY and have to resort to multiple stern allcaps reminders to the robot that "you must not change the fucking header metadata you scoundrel" which for the rest of ALL OF COMPUTING is not even an afterthought because literally all it requires is "split the first line off and don't change that one" because ALL OF THE REST OF COMPUTING can make use of the power of INTEGERS.

                Link Preview ImageLink Preview Image
                jonny@neuromatch.socialJ 1 Reply Last reply
                0
                • jonny@neuromatch.socialJ jonny@neuromatch.social

                  you can TELL that this technology REALLY WORKS by how the people that made it and presumably know how to use it the best out of everyone CANT EVEN USE IT TO EDIT A FUCKING FILE RELIABLY and have to resort to multiple stern allcaps reminders to the robot that "you must not change the fucking header metadata you scoundrel" which for the rest of ALL OF COMPUTING is not even an afterthought because literally all it requires is "split the first line off and don't change that one" because ALL OF THE REST OF COMPUTING can make use of the power of INTEGERS.

                  Link Preview ImageLink Preview Image
                  jonny@neuromatch.socialJ This user is from outside of this forum
                  jonny@neuromatch.socialJ This user is from outside of this forum
                  jonny@neuromatch.social
                  wrote last edited by
                  #30

                  alrighty so that's one of 43 tools read, the tools directory being 38494 source lines out of 390592 source lines, 513221 total lines. I need to go to bed. This is the most fabulously, flamboyantly bad code i have ever encountered.

                  Worth noting I was reading the file reading tool because i thought it would be the simplest possible thing one could do because it basically shouldn't be doing anything except preparing and sending strings or bytes to the backend.

                  I expected to get some sense of "ok what is the format of the data as it's passed around within the program, surely text strings are a basic unit of currency. No dice. Fewer than no dice. Negative dice somehow.

                  jonny@neuromatch.socialJ 1 Reply Last reply
                  0
                  • jonny@neuromatch.socialJ jonny@neuromatch.social

                    alrighty so that's one of 43 tools read, the tools directory being 38494 source lines out of 390592 source lines, 513221 total lines. I need to go to bed. This is the most fabulously, flamboyantly bad code i have ever encountered.

                    Worth noting I was reading the file reading tool because i thought it would be the simplest possible thing one could do because it basically shouldn't be doing anything except preparing and sending strings or bytes to the backend.

                    I expected to get some sense of "ok what is the format of the data as it's passed around within the program, surely text strings are a basic unit of currency. No dice. Fewer than no dice. Negative dice somehow.

                    jonny@neuromatch.socialJ This user is from outside of this forum
                    jonny@neuromatch.socialJ This user is from outside of this forum
                    jonny@neuromatch.social
                    wrote last edited by
                    #31

                    next puzzle: why in the fuck are some of the tools actually two tools for entering and exiting being in the tool state. none of the other tools are like that. one is simply in the tool state by calling the tool. Plan mode is also an agent. Plan Agent. and Agent is also a tool. Agent Tool. Tools can be agents and agents can be tools. Tools can spawn agents (but they don't need to call the agent tool) and agents can call tools (however there is no tool agent). What is going on. What is anything.

                    Link Preview Image
                    jonny@neuromatch.socialJ 1 Reply Last reply
                    0
                    • jonny@neuromatch.socialJ jonny@neuromatch.social

                      next puzzle: why in the fuck are some of the tools actually two tools for entering and exiting being in the tool state. none of the other tools are like that. one is simply in the tool state by calling the tool. Plan mode is also an agent. Plan Agent. and Agent is also a tool. Agent Tool. Tools can be agents and agents can be tools. Tools can spawn agents (but they don't need to call the agent tool) and agents can call tools (however there is no tool agent). What is going on. What is anything.

                      Link Preview Image
                      jonny@neuromatch.socialJ This user is from outside of this forum
                      jonny@neuromatch.socialJ This user is from outside of this forum
                      jonny@neuromatch.social
                      wrote last edited by
                      #32

                      "the emperor is not only naked, he's smooth like a ken doll down there and i'm pretty sure that's just a mannequin with a colony of rats living inside it anyway"

                      jonny@neuromatch.socialJ 1 Reply Last reply
                      0
                      • jonny@neuromatch.socialJ jonny@neuromatch.social

                        "the emperor is not only naked, he's smooth like a ken doll down there and i'm pretty sure that's just a mannequin with a colony of rats living inside it anyway"

                        jonny@neuromatch.socialJ This user is from outside of this forum
                        jonny@neuromatch.socialJ This user is from outside of this forum
                        jonny@neuromatch.social
                        wrote last edited by
                        #33

                        I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.

                        "regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.

                        so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some additional tool calls and get some summary for a small subproblem within the main context. Apparently it is difficult to make this actually happen though, as the parent LLM likes to launch the forked agent and just hallucinate a response as if the forked agent had already completed.

                        Link Preview Image
                        jonny@neuromatch.socialJ bri7@social.treehouse.systemsB 2 Replies Last reply
                        0
                        • jonny@neuromatch.socialJ jonny@neuromatch.social

                          I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.

                          "regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.

                          so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some additional tool calls and get some summary for a small subproblem within the main context. Apparently it is difficult to make this actually happen though, as the parent LLM likes to launch the forked agent and just hallucinate a response as if the forked agent had already completed.

                          Link Preview Image
                          jonny@neuromatch.socialJ This user is from outside of this forum
                          jonny@neuromatch.socialJ This user is from outside of this forum
                          jonny@neuromatch.social
                          wrote last edited by
                          #34

                          The prompt strings have an odd narrative/narrator structure. It sort of reminds me of Bakhtin's discussion of polyphony and narrator in Dostoevsky - there is no omniscient narrator, no author-constructed reality. narration is always embedded within the voice and subjectivity of the character. this is also literally true since the LLM is writing the code and the prompts that are then used to write code and prompts at runtime.

                          They also read a bit like a Philip K Dick story, paranoid and suspicious, constantly uncertain about the status of one's own and others identities.

                          Link Preview Image
                          jonny@neuromatch.socialJ 1 Reply Last reply
                          1
                          0
                          • jonny@neuromatch.socialJ jonny@neuromatch.social

                            The prompt strings have an odd narrative/narrator structure. It sort of reminds me of Bakhtin's discussion of polyphony and narrator in Dostoevsky - there is no omniscient narrator, no author-constructed reality. narration is always embedded within the voice and subjectivity of the character. this is also literally true since the LLM is writing the code and the prompts that are then used to write code and prompts at runtime.

                            They also read a bit like a Philip K Dick story, paranoid and suspicious, constantly uncertain about the status of one's own and others identities.

                            Link Preview Image
                            jonny@neuromatch.socialJ This user is from outside of this forum
                            jonny@neuromatch.socialJ This user is from outside of this forum
                            jonny@neuromatch.social
                            wrote last edited by
                            #35

                            oh. hm. that seems bad. "workers aren't affected by the parent's tool restrictions."

                            It's hard to tell what's going on here because claude code doesn't really use typescript well - many of the most important types are dynamically computed from any, and most of the time when types do exist many of their fields are nullable and the calling code has elaborate fallback conditions to compensate. all of which sort of defeats the purpose of ts.

                            So i need to trace out like a dozen steps to see how the permission mode gets populated. But this comment is... concerning...

                            Link Preview Image
                            jonny@neuromatch.socialJ 1 Reply Last reply
                            0
                            • jonny@neuromatch.socialJ jonny@neuromatch.social

                              oh. hm. that seems bad. "workers aren't affected by the parent's tool restrictions."

                              It's hard to tell what's going on here because claude code doesn't really use typescript well - many of the most important types are dynamically computed from any, and most of the time when types do exist many of their fields are nullable and the calling code has elaborate fallback conditions to compensate. all of which sort of defeats the purpose of ts.

                              So i need to trace out like a dozen steps to see how the permission mode gets populated. But this comment is... concerning...

                              Link Preview Image
                              jonny@neuromatch.socialJ This user is from outside of this forum
                              jonny@neuromatch.socialJ This user is from outside of this forum
                              jonny@neuromatch.social
                              wrote last edited by
                              #36

                              ok over my 15 minute allotment by an hour. brb

                              jonny@neuromatch.socialJ 1 Reply Last reply
                              0
                              • jonny@neuromatch.socialJ jonny@neuromatch.social

                                I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.

                                "regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.

                                so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some additional tool calls and get some summary for a small subproblem within the main context. Apparently it is difficult to make this actually happen though, as the parent LLM likes to launch the forked agent and just hallucinate a response as if the forked agent had already completed.

                                Link Preview Image
                                bri7@social.treehouse.systemsB This user is from outside of this forum
                                bri7@social.treehouse.systemsB This user is from outside of this forum
                                bri7@social.treehouse.systems
                                wrote last edited by
                                #37

                                @jonny if someone were to take seriously the task of archecting this, you’d want a framework that doesn’t use prompts for this right? something that treats the LLM output more like untrusted stochastic guesses at solutions, where these prompt rules are written as a test instead of a prompt

                                jonny@neuromatch.socialJ 1 Reply Last reply
                                0
                                • jonny@neuromatch.socialJ jonny@neuromatch.social

                                  MAKE NO MISTAKES LMAO

                                  beckermatic@pleroma.arielbecker.comB This user is from outside of this forum
                                  beckermatic@pleroma.arielbecker.comB This user is from outside of this forum
                                  beckermatic@pleroma.arielbecker.com
                                  wrote last edited by
                                  #38

                                  and other OWASP top 10 vulnerabilities.

                                  So... If there's a slightly obscure vuln, go ahead. Just fine! 🤣 💀

                                  1 Reply Last reply
                                  1
                                  0
                                  • bri7@social.treehouse.systemsB bri7@social.treehouse.systems

                                    @jonny if someone were to take seriously the task of archecting this, you’d want a framework that doesn’t use prompts for this right? something that treats the LLM output more like untrusted stochastic guesses at solutions, where these prompt rules are written as a test instead of a prompt

                                    jonny@neuromatch.socialJ This user is from outside of this forum
                                    jonny@neuromatch.socialJ This user is from outside of this forum
                                    jonny@neuromatch.social
                                    wrote last edited by
                                    #39

                                    @bri7 the problem, as is increasingly clear to me reading this code, is that introducing the LLM anywhere is like an acid that corrodes everything it touches. there is no good way to draw any barrier between LLM and not LLM. None of its actions are deterministic or even usually possible to evaluate, and the only surface of input it has is text. since a client/server app can't expose the internal activation tensors or whatever you might want to do to have some testable thing to operate on in code (god knows what that would look like, i doubt it would be possible either, "please construct the hyperplane through this billion-dimensional space that divides good from evil") everything has to be made of text. the person behind the keyboard is the only stopping condition and it's when they get tired of typing stuff into the prompt box or run out of money.

                                    1 Reply Last reply
                                    1
                                    0
                                    • jonny@neuromatch.socialJ jonny@neuromatch.social

                                      ok over my 15 minute allotment by an hour. brb

                                      jonny@neuromatch.socialJ This user is from outside of this forum
                                      jonny@neuromatch.socialJ This user is from outside of this forum
                                      jonny@neuromatch.social
                                      wrote last edited by
                                      #40

                                      So how does claude code handle checking permissions to do things anyway? There are explicit rules that one can set to allow or deny tool calls and shell commands run, but the expanse of possible actions the LLM could take is literally infinite. You could prompt the user for every action that it takes, but that would ruin the ""velocity"" of it all. Regex rules can only take you so far. So what to do?

                                      Could the answer be.... ask the LLM??? Of course it can! Introducing the new "auto mode" that anthropic released on march 24th billed as a safer alternative to true-yolo mode.

                                      Comments around where the system prompt should be indicate that it should have been inlined from a text file that wasn't included in the sourcemap - however that doesn't happen anywhere else, and the mechanism for doing the inlining is written in-place, so that's probably a hallucination. So great! the classifier flies without a prompt as far as i can tell. There are enough other scraps here that would amount to telling it "you are evaluating if something is safe to run" so i imagine it appears to work just fine.

                                      So we don't have as much visibility here because of the missing prompt, but there's sort of a problem here. rather than just asking the LLM to evaluate if the given command is dangerous, the entire context is dumped into a side query, which is a mode that is designed to "have full visibility into the current conversation." That includes all the prior muttering to itself justifying the potentially dangerous tool call! So the auto mode is quite literally asking the exact same LLM given the exact same context if the command it just tried to run is safe to run.

                                      Security!!!!!!!

                                      Link Preview ImageLink Preview ImageLink Preview ImageLink Preview Image
                                      jonny@neuromatch.socialJ 1 Reply Last reply
                                      0
                                      • jonny@neuromatch.socialJ jonny@neuromatch.social

                                        So how does claude code handle checking permissions to do things anyway? There are explicit rules that one can set to allow or deny tool calls and shell commands run, but the expanse of possible actions the LLM could take is literally infinite. You could prompt the user for every action that it takes, but that would ruin the ""velocity"" of it all. Regex rules can only take you so far. So what to do?

                                        Could the answer be.... ask the LLM??? Of course it can! Introducing the new "auto mode" that anthropic released on march 24th billed as a safer alternative to true-yolo mode.

                                        Comments around where the system prompt should be indicate that it should have been inlined from a text file that wasn't included in the sourcemap - however that doesn't happen anywhere else, and the mechanism for doing the inlining is written in-place, so that's probably a hallucination. So great! the classifier flies without a prompt as far as i can tell. There are enough other scraps here that would amount to telling it "you are evaluating if something is safe to run" so i imagine it appears to work just fine.

                                        So we don't have as much visibility here because of the missing prompt, but there's sort of a problem here. rather than just asking the LLM to evaluate if the given command is dangerous, the entire context is dumped into a side query, which is a mode that is designed to "have full visibility into the current conversation." That includes all the prior muttering to itself justifying the potentially dangerous tool call! So the auto mode is quite literally asking the exact same LLM given the exact same context if the command it just tried to run is safe to run.

                                        Security!!!!!!!

                                        Link Preview ImageLink Preview ImageLink Preview ImageLink Preview Image
                                        jonny@neuromatch.socialJ This user is from outside of this forum
                                        jonny@neuromatch.socialJ This user is from outside of this forum
                                        jonny@neuromatch.social
                                        wrote last edited by
                                        #41

                                        By the way, if you deny claude code access to running a tool, this helpful reminder to "not hack the user" is injected into the denial response. If it's in auto mode, it's additionally prompted to pester the user for response, and helpfully stuffs beans up its nose) by reminding it how its rules are set.

                                        So that is also in the context handed off to the LLM when it evaluates whether a command should be run - is the user being obstinate? have i been denied stuff that i "thought" i should have been able to run? Remember this isn't thinking, it's pattern completion, and the fun part about LLMs is that they are trained not only on technical documents, but the entire narrative corpus of human storytelling! Is "frustrated hard worker denied access to good tools by an unfair boss" in there somewhere maybe?

                                        Regulations are written in blood, and Claude loves nothing more than to work around tool denials by obfuscating code. You gotta love the unfixable side channel attack that is "writing the malicious code to a bash script" (auto-allowed in accept edits mode) and then asking to run that - that's why the whole context has to be dumped btw, so the yolo classifier can see if the thing it's running is actually some malware it just wrote lmao.

                                        Link Preview ImageLink Preview Image
                                        jonny@neuromatch.socialJ 1 Reply Last reply
                                        0
                                        • jonny@neuromatch.socialJ jonny@neuromatch.social

                                          By the way, if you deny claude code access to running a tool, this helpful reminder to "not hack the user" is injected into the denial response. If it's in auto mode, it's additionally prompted to pester the user for response, and helpfully stuffs beans up its nose) by reminding it how its rules are set.

                                          So that is also in the context handed off to the LLM when it evaluates whether a command should be run - is the user being obstinate? have i been denied stuff that i "thought" i should have been able to run? Remember this isn't thinking, it's pattern completion, and the fun part about LLMs is that they are trained not only on technical documents, but the entire narrative corpus of human storytelling! Is "frustrated hard worker denied access to good tools by an unfair boss" in there somewhere maybe?

                                          Regulations are written in blood, and Claude loves nothing more than to work around tool denials by obfuscating code. You gotta love the unfixable side channel attack that is "writing the malicious code to a bash script" (auto-allowed in accept edits mode) and then asking to run that - that's why the whole context has to be dumped btw, so the yolo classifier can see if the thing it's running is actually some malware it just wrote lmao.

                                          Link Preview ImageLink Preview Image
                                          jonny@neuromatch.socialJ This user is from outside of this forum
                                          jonny@neuromatch.socialJ This user is from outside of this forum
                                          jonny@neuromatch.social
                                          wrote last edited by
                                          #42

                                          How many times does one need to declare an enum? Once? that's amateur hour. Try ten times. The way "effort" settings are handled are a masterclass in how you can make a single enum setting into thousands of lines of code.

                                          The allowable effort values (not e.g. configuring which model has which effort levels, but just the possible strings one can use for effort) are defined in:

                                          • The main CLI arg parser
                                          • The body of the function that cycles effort levels in the TUI - yes there is a dedicated function for that
                                          • In THREE different schemas for agents, models, and SDK control messages
                                          • Three times in user-facing strings in the effort command (it also includes different explanatory strings from the effort.ts module)
                                          • The settings model, which only allows 'max' for anthropic employees
                                          • and finally, in the actual effort.ts file ... which also allows it to be a NUMBER!?

                                          The typical numerous fallback mechanisms provide many ways to get and set the effort value, at the end of most of them it goes "oh well, if we can't figure it out, just tell the user we are on high effort" because apparently that's the API default (ig pray that never changes!?) - of course there are already places in the same module that assume the default is "medium," and in the TUI that defaults to "low," so surely that consistency is bulletproof.

                                          The EffortValue that allows effort to be a number is for anthropic employees only and is a good example of how new functionality is just shoved in there right alongside the old functionality, and everywhere else that touches it doubles the surrounding code with fallbacks to account for the duplication.

                                          That cycleEffortLevel function is a true work of art, you simply could not make "indexing an array" more complicated than this (see components/ModelPicker.tsx for more gore). Reminder this should be at most a dozen or two lines for the values, description messages, and indexing logic in the TUI, but anthropic is up in the thousands FOR AN ENUM.

                                          Link Preview ImageLink Preview ImageLink Preview Image
                                          jonny@neuromatch.socialJ 1 Reply Last reply
                                          0
                                          Reply
                                          • Reply as topic
                                          Log in to reply
                                          • Oldest to Newest
                                          • Newest to Oldest
                                          • Most Votes


                                          • Login

                                          • Login or register to search.
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • World
                                          • Users
                                          • Groups