I keep seeing lots of people saying "LLMs are like compilers/assemblers for prompts"
-
I keep seeing lots of people saying "LLMs are like compilers/assemblers for prompts"
Noooooooooo
Noooooooooooooooooooooooooooo
LLMs are not compilers, and they're not assemblers. Determinism is a key aspect of assemblers and compilers.
And they *certainly* can't be part of a reproducible pipeline
@cwebber This might actually be subject to change though.
Njoy: https://arxiv.org/abs/2510.22954
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
tl;dr: LLMs are coming closer and closer to producing reproducible outputs. One could be under the impression that, if trained on the same data and towards a certain size, asymptotic behaviour would be a reasonable expectation, because that is what happens with large numbers in statistics.
What a ... surprise.
-
@joeyh I'm glad to see that someone else has considered this angle. It's always bugged me a little when I see the "they aren't deterministic" argument, but I've kept it to myself because nobody likes a pedant and of course @cwebber already understands as much.
I just worry that if this critique were to become more popular then the LLM makers would just implement the ability to specify a seed, then sit back and play the game where they say "we heard your criticism and have addressed it".
Most people have no reason to have developed an advanced reasoning capacity about randomness, and I dread having to explain to them how something can be both deterministic and stochastic in nature.
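(A rough sketch of the "both deterministic and stochastic" point, for anyone who wants to see it run. The vocabulary, the probabilities, and the `sample_tokens` helper are all made up for illustration; this is a toy sampler, not how any real LLM API exposes a seed.)

```python
import random

# Hypothetical toy "model": one fixed next-token distribution, not a real LLM.
VOCAB = ["mov", "add", "jmp", "ret"]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def sample_tokens(seed: int, n: int = 8) -> list[str]:
    """Draw n tokens at random, using a private PRNG started from `seed`."""
    rng = random.Random(seed)
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)


# Same seed twice: the draws are random, yet the two runs match exactly.
run_a = sample_tokens(seed=42)
run_b = sample_tokens(seed=42)
print(run_a == run_b)

# Many different seeds: the outputs are spread out according to WEIGHTS.
distinct_outputs = {tuple(sample_tokens(seed)) for seed in range(100)}
print(len(distinct_outputs))
```

Fixing the seed makes each run repeatable, but the output is still a draw from a distribution rather than something derived from the input by rule.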
@ansuz @joeyh And of course there is the question, what is and isn't a compiler? Aren't all functions compilers?
Indeed, Blender's rendering system is in many ways a compiler for images.
But we don't use the word that way, because it's not helpful. People are reaching for "LLMs might be compilers!" because of the thing they want it to *do* rather than how it *acts*, even though Blender and ffmpeg are, under those definitions, by far much more of compilers than LLMs are.
-
@cwebber The metaphor I reach for is processors. They're language coprocessors, and language is messy in a way most things coprocessors have done aren't. We're at "Hello World" in figuring out what to do with them.
-
@cwebber ok i'm going to be very annoying here but
don't some old versions of msvc choose certain optimisations randomly?
-
@cwebber for me, the question isn't determinism but epistemology. the llm "compiles" by chaining predictions based on statistics which are derived from empirical data—i.e. its model of the "compilation" process is "usually when there's x in the input, there's y in the output." a conventional compiler is based on deductive reasoning about how x requires y. the former is totally parasitic on the latter (i.e. if the underlying reasoning didn't exist, empirical data on its operation couldn't exist)
-
@cwebber@social.coop to be fair I don't think determinism is a defining property of compilers
I should make a stochastic compiler (whatever that means)
-
@alina@girldick.gay @cwebber@social.coop @joeyh@sunbeam.city try mewgenics try mewgenics try mewgenics
-
Ah, but even if you can use a specific seed and try to use this to call it a "compiler", your compiler here is the very specific set of weights within that model, and any change breaks its determinism. I don't think there being one and exactly one possible implementation to get the specified set of outputs can count as an actual compiler.
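(A toy illustration of the weights point, under loud assumptions: the two-token "model", the `pick` helper, and the two weight vectors are hypothetical, and the weights are deliberately built to sit on either side of the seeded draw, so this shows that a small change *can* flip the output, not that every change will.)

```python
import random


def pick(weights: list[float], seed: int) -> int:
    """Inverse-CDF sampling: map one seeded uniform draw onto a token index."""
    u = random.Random(seed).random()  # same seed -> same u in [0, 1)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if u < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point rounding


SEED = 7
EPS = 1e-3
u = random.Random(SEED).random()

# Two hypothetical "model versions". The probability of token 0 differs by
# at most 2 * EPS, chosen on purpose to straddle the seeded draw u.
p_high = min(u + EPS, 1.0)
p_low = max(u - EPS, 0.0)
weights_v1 = [p_high, 1.0 - p_high]
weights_v2 = [p_low, 1.0 - p_low]

print(pick(weights_v1, SEED))  # 0
print(pick(weights_v2, SEED))  # 1
```

The seed pins down the random draw, but the output also depends on exactly where the weights put the boundaries, so "same seed" only means "same output" for one specific set of weights.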