I can't stop thinking about the LLM-generated compiler that passes all the unit tests but emits inner loops that benchmark over 150,000x slower than a gcc debug build.
-
I can't stop thinking about the LLM-generated compiler that passes all the unit tests but emits inner loops that benchmark over 150,000x slower than a gcc debug build. I couldn't possibly have intentionally come up with such a funny demonstration of the point of genuine expertise https://harshanu.space/en/tech/ccc-vs-gcc/
-
@0xabad1dea wait what, i missed the 150k slower thing
-
@0xabad1dea amazing
-
@0xabad1dea I have a feeling that this writing relies on LLMs way too much
-
@lesley sometimes I feel like the only person in tech who knows how to write three consecutive paragraphs all by herself
-
@lesley@mastodon.gamedev.place @0xabad1dea@infosec.exchange There's a disclaimer at the bottom of the blog post stating that "The benchmark design, test execution, analysis and writing were done by a human with AI helping where needed."
-
@0xabad1dea makes two of us. The CCC isn't the flex AI proponents think it is, but there aren't enough people who can understand that it should have been a cautionary tale rather than a sensational headline.

-
@0xabad1dea very interesting read
-
@0xabad1dea like, I'll bite; great stuff, an unsupervised agent produced something that can compile some C code and that by a certain definition can be called "working", but it's absolutely not ready for any sort of production usage.
The agent has multiple reference implementations and an extensive test suite, and C is literally based on an extremely well-defined standard. AI proponents claim that we're in an era where all we need is to provide a specification, and the agents will just implement the thing for us. This CCC thing is proof that they quite literally can't; it's difficult to think of a commercial software project that would have a specification better defined than the C standard. And a vanilla C compiler isn't all _that_ complicated, it's literally the kind of thing many undergrad SWE students build as a student project (yes yes lots of caveats and simplifications). You'd think Anthropic could improve on their CCC with the agents until they get the compiler working at least as well as tcc would, but 1/2
-
@0xabad1dea@infosec.exchange claude has a fucking compiler.
what the fuck.
are we vibecompiling alongside vibecoding now