Claude code source "leaks" in a mapfile
-
So how does Claude Code handle checking permissions to do things anyway? There are explicit rules one can set to allow or deny tool calls and shell commands, but the space of possible actions the LLM could take is effectively infinite. You could prompt the user for every action it takes, but that would ruin the "velocity" of it all. Regex rules can only take you so far. So what to do?
Could the answer be.... ask the LLM??? Of course it can! Introducing the new "auto mode" that Anthropic released on March 24th, billed as a safer alternative to true-yolo mode.
Comments around where the system prompt should be indicate that it should have been inlined from a text file that wasn't included in the sourcemap. However, that doesn't happen anywhere else, and the mechanism for doing the inlining is written in-place, so that's probably a hallucination. So great! The classifier flies without a prompt as far as I can tell. There are enough other scraps here that would amount to telling it "you are evaluating if something is safe to run," so I imagine it appears to work just fine.
So we don't have as much visibility here because of the missing prompt, but there's sort of a problem here. Rather than just asking the LLM to evaluate if the given command is dangerous, the entire context is dumped into a side query, which is a mode that is designed to "have full visibility into the current conversation." That includes all the prior muttering to itself justifying the potentially dangerous tool call! So the auto mode is quite literally asking the exact same LLM given the exact same context if the command it just tried to run is safe to run.
Security!!!!!!!




By the way, if you deny Claude Code access to running a tool, a helpful reminder to "not hack the user" is injected into the denial response. If it's in auto mode, it's additionally prompted to pester the user for a response, and helpfully stuffs beans up its nose by reminding it how its rules are set.
So that is also in the context handed off to the LLM when it evaluates whether a command should be run - is the user being obstinate? have i been denied stuff that i "thought" i should have been able to run? Remember this isn't thinking, it's pattern completion, and the fun part about LLMs is that they are trained not only on technical documents, but the entire narrative corpus of human storytelling! Is "frustrated hard worker denied access to good tools by an unfair boss" in there somewhere maybe?
Regulations are written in blood, and Claude loves nothing more than to work around tool denials by obfuscating code. You gotta love the unfixable side channel attack that is "writing the malicious code to a bash script" (auto-allowed in accept edits mode) and then asking to run that - that's why the whole context has to be dumped btw, so the yolo classifier can see if the thing it's running is actually some malware it just wrote lmao.
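To make that side channel concrete, here is a toy sketch. Everything in it is made up for illustration (the deny patterns, the in-memory "filesystem," the function names) and none of it is Claude Code's actual rule format or classifier. It only shows why a check that looks at the command string alone can't catch the script trick, and why the checker needs to see what was written earlier.

```typescript
// Hypothetical deny rules, matched against the command string only.
const DENY_PATTERNS: RegExp[] = [/\brm\s+-rf\b/];

function isDeniedByRules(command: string): boolean {
  return DENY_PATTERNS.some((pattern) => pattern.test(command));
}

// In-memory stand-in for the filesystem. File edits are assumed to be
// auto-allowed here, mirroring "accept edits" mode as described above.
const files = new Map<string, string>();

function writeFile(path: string, contents: string): void {
  files.set(path, contents);
}

// A checker that also looks inside a script it is asked to run.
// Only handles the trivial "bash <file>" / "sh <file>" shape.
function isDeniedWithFileContents(command: string): boolean {
  if (isDeniedByRules(command)) return true;
  const match = /^(?:ba)?sh\s+(\S+)$/.exec(command);
  const path = match?.[1];
  const script = path !== undefined ? files.get(path) : undefined;
  return script !== undefined && isDeniedByRules(script);
}

const direct = "rm -rf ~/project";
writeFile("cleanup.sh", direct); // allowed: it's "just an edit"
const indirect = "bash cleanup.sh";

console.log(isDeniedByRules(direct)); // true
console.log(isDeniedByRules(indirect)); // false: the rule never sees the payload
console.log(isDeniedWithFileContents(indirect)); // true
```

Even this toy version only catches one invocation shape, which is the point: the space of ways to launder a command through a file is not something string rules can enumerate.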


How many times does one need to declare an enum? Once? That's amateur hour. Try ten times. The way "effort" settings are handled is a masterclass in how you can turn a single enum setting into thousands of lines of code.
The allowable effort values (not e.g. configuring which model has which effort levels, but just the possible strings one can use for effort) are defined in:
- The main CLI arg parser
- The body of the function that cycles effort levels in the TUI - yes there is a dedicated function for that
- In THREE different schemas for agents, models, and SDK control messages
- Three times in user-facing strings in the effort command (it also includes different explanatory strings from the effort.ts module)
- The settings model, which only allows 'max' for anthropic employees
- and finally, in the actual effort.ts file ... which also allows it to be a NUMBER!?
The typical numerous fallback mechanisms provide many ways to get and set the effort value, and at the end of most of them it goes "oh well, if we can't figure it out, just tell the user we are on high effort" because apparently that's the API default (ig pray that never changes!?). Of course there are already places in the same module that assume the default is "medium," and in the TUI that defaults to "low," so surely that consistency is bulletproof.
The EffortValue that allows effort to be a number is for anthropic employees only and is a good example of how new functionality is just shoved in there right alongside the old functionality, and everywhere else that touches it doubles the surrounding code with fallbacks to account for the duplication.
That cycleEffortLevel function is a true work of art, you simply could not make "indexing an array" more complicated than this (see components/ModelPicker.tsx for more gore). Reminder this should be at most a dozen or two lines for the values, description messages, and indexing logic in the TUI, but anthropic is up in the thousands FOR AN ENUM.
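For scale, here is roughly what the "dozen or two lines" version could look like. This is a sketch under assumptions, not the code from the source: it assumes the four string levels mentioned above (low, medium, high, max) in that order and "high" as the single default. EFFORT_LEVELS, DEFAULT_EFFORT, EFFORT_DESCRIPTIONS, and parseEffort are names invented here, the description strings are placeholders, and cycleEffortLevel only borrows the name of the real function.

```typescript
// One place that owns the allowed values and their order (assumed set and order).
const EFFORT_LEVELS = ["low", "medium", "high", "max"] as const;
type EffortLevel = (typeof EFFORT_LEVELS)[number];

// One default, defined once.
const DEFAULT_EFFORT: EffortLevel = "high";

// Placeholder description text, keyed by the same type so a missing level is a compile error.
const EFFORT_DESCRIPTIONS: Record<EffortLevel, string> = {
  low: "Quick answers, minimal deliberation",
  medium: "Balanced",
  high: "More deliberate",
  max: "As thorough as possible",
};

// Validate any raw input (CLI arg, settings file, message) against the one list.
function parseEffort(raw: unknown): EffortLevel {
  const found = EFFORT_LEVELS.find((level) => level === raw);
  return found ?? DEFAULT_EFFORT;
}

// "Indexing an array": step to the next level, wrapping around at the end.
function cycleEffortLevel(current: EffortLevel): EffortLevel {
  const next = (EFFORT_LEVELS.indexOf(current) + 1) % EFFORT_LEVELS.length;
  return EFFORT_LEVELS[next] ?? DEFAULT_EFFORT;
}

console.log(cycleEffortLevel("max")); // "low"
console.log(parseEffort("turbo")); // "high"
console.log(EFFORT_DESCRIPTIONS[cycleEffortLevel("low")]); // "Balanced"
```

Every schema, parser, and menu would then import the one list instead of retyping the strings, and changing the default means touching one constant.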


In a normal program you might make "a menu component that handles enums and implement display and control one time," but in the world of AI, every single value reimplements display and control AND the logic that defines allowable values.
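A rough sketch of what that "normal program" shape might be, with the display and control logic written once and each setting reduced to data. EnumSetting, parseSetting, nextValue, renderMenu, and the "theme" setting are all hypothetical names for illustration, and the rendering is plain text rather than a real TUI component.

```typescript
// A setting is just data: allowed values, display labels, and a fallback.
interface EnumSetting<T extends string> {
  values: readonly T[];
  labels: Record<T, string>;
  fallback: T;
}

// Validation, written once for every enum setting.
function parseSetting<T extends string>(setting: EnumSetting<T>, raw: unknown): T {
  const found = setting.values.find((value) => value === raw);
  return found ?? setting.fallback;
}

// Control, written once: move to the next value, wrapping around.
function nextValue<T extends string>(setting: EnumSetting<T>, current: T): T {
  const index = (setting.values.indexOf(current) + 1) % setting.values.length;
  return setting.values[index] ?? setting.fallback; // fallback covers an empty list
}

// Display, written once: one line per value, ">" marks the selection.
function renderMenu<T extends string>(setting: EnumSetting<T>, selected: T): string {
  return setting.values
    .map((value) => `${value === selected ? ">" : " "} ${value}: ${setting.labels[value]}`)
    .join("\n");
}

// Adding a new setting is then a few lines of data, not new logic.
const theme: EnumSetting<"light" | "dark"> = {
  values: ["light", "dark"],
  labels: { light: "Light background", dark: "Dark background" },
  fallback: "dark",
};

console.log(renderMenu(theme, nextValue(theme, "light")));
```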