Reading a claude.md and seeing many directives like this written in it: "Don't make changes until you have 95% confidence in what you need to build."
This reveals such a profound misunderstanding of how this technology works that I'm speechless. And this is literally what people are trying to build fully-automated "software factories" from.
@cammerman @mhoye I don’t suppose I could impose upon you to give mine a quick review? https://github.com/pmonks/wreck/blob/dev/AGENTS.md
-
Look, if you tell an LLM it needs 95% confidence, it doesn't know what either "95%" or "confidence" means. It knows people tend to respond to this kind of direction either by saying "I'm not sure enough because..." or "I'm super confident for these reasons." It has no ability to correctly choose which of those templates it will follow.
Flip a coin. You'll get a reasonable looking sentence back in one of those styles, with a random assortment of reasons that may or may not be rooted in fact.
This is an all-too-common failure mode (of humans). The system is working as designed: it is a next-most-probable-token generator.
What I keep encountering is humans failing to grasp that the LLM has no world model, and no sense whatsoever of "meaning" or "truth" of anything, ever.
Because a lot of world-model, truth-based reasoning is implicitly encoded in language, true things are frequently probable next tokens.
This makes humans think "understanding" happens. It doesn't.
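As a toy illustration of the point both of these posts are making, here is a minimal sketch (invented numbers, not any real model's weights) of what "responding" amounts to for an LLM: sampling one of the learned response templates in proportion to its probability, with no confidence variable anywhere to consult.

```python
import random

# Cartoon version of next-token generation. A prompt like "only proceed
# at 95% confidence" does not flip an internal confidence switch; it
# shifts probability mass between response templates the model has seen
# follow that kind of instruction. Weights below are invented.
TEMPLATES = {
    "I'm not sure enough yet, because...": 0.45,
    "I'm super confident, for these reasons...": 0.55,
}

def sample_response(templates: dict[str, float]) -> str:
    """Flip a (biased) coin between templates. Nothing in this process
    checks whether the reasons attached to the answer are true."""
    choices, weights = zip(*templates.items())
    return random.choices(choices, weights=weights, k=1)[0]

if __name__ == "__main__":
    for _ in range(3):
        print(sample_response(TEMPLATES))
```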
-
No matter how many prompt contexts you stack up and fire off in parallel, the machine cannot find truth, cannot do math, cannot know things, or reason.
It's Massively Multiplayer Online Autocomplete.
The fact that the capital and executive class thinks this is sufficient to replace most of the world's knowledge workers tells you all you need to know about how we should be dealing with them, and all of this.
-
@pseudonym @cammerman If you ask the magic 8-ball about things that are frequently and truthfully described or discussed, or things with a similar structure and a clear mapping to them, it's more likely to produce a correct answer. But you have no idea what's frequently described, or whether it's just randomly wrong. Oh, and if it is wrong, it is optimized to make the wrong answer look right in context. Good luck.
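A toy sketch of that frequency effect, with invented counts standing in for a real training corpus: the most probable continuation is correct exactly when the fact is common in the corpus, and just as fluent when it is wrong.

```python
# Invented continuation counts standing in for a training corpus.
CONTINUATION_COUNTS = {
    "The capital of France is": {"Paris": 9800, "Lyon": 200},
    "The capital of Kiribati is": {"Suva": 60, "South Tarawa": 40},
}

def most_probable_answer(prompt: str) -> str:
    # Greedy decoding: return whatever most often followed this prompt
    # in the corpus. Truth never enters the computation.
    counts = CONTINUATION_COUNTS[prompt]
    return max(counts, key=counts.get)

for prompt in CONTINUATION_COUNTS:
    print(prompt, most_probable_answer(prompt))

# "Paris" comes out right because it dominates the counts. "Suva" comes
# out wrong (it is Fiji's capital, not Kiribati's) because, in this toy
# corpus, the wrong pairing happens to be more frequent. Both answers
# are produced by exactly the same mechanism and look equally assured.
```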
-
@pmonks @cammerman @mhoye I'm curious to see some of the limericks that are generated by these instructions.
-
@cammerman Possibly more important, it doesn't know what "don't" means. I'm not even making a philosophical point about what it is to "know" something; even if we only care about output, it doesn't act in a way that corresponds to following an instruction not to do something. "Don't mention goblins" has no effect on how often goblins get mentioned if the training data was weighted towards mentioning creatures.
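A negation can even backfire, because models tend to upweight tokens that are already present in the context. Here is a cartoon model of that copying bias (all numbers invented):

```python
from collections import Counter

# Base frequencies of creature words in the training data, plus a bonus
# for any word that appears in the prompt itself (a crude stand-in for
# the copying/attention bias of real models). All numbers invented.
BASE_FREQ = Counter({"dragons": 30, "elves": 25, "goblins": 20, "trolls": 25})
COPY_BONUS = 40

def creature_probability(word: str, prompt: str) -> float:
    weights = {w: f + (COPY_BONUS if w in prompt else 0)
               for w, f in BASE_FREQ.items()}
    return weights[word] / sum(weights.values())

print(creature_probability("goblins", "Write a fantasy story."))
# 0.20
print(creature_probability("goblins", "Don't mention goblins. Write a fantasy story."))
# ~0.43: the instruction inserted the very token it was trying to suppress.
```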
-
@individeweal @cammerman
I told ChatGPT "don't apologize" so it apologized for apologizing.
-
@cammerman "Massively Multiplayer Online Autocomplete", lol
I am officially calling it this from now on!
-
No notes.