I can't stop thinking about the LLM-generated compiler that passes all the unit tests but emits inner loops that benchmark over 150,000x slower than a gcc debug build.

0xabad1dea@infosec.exchange

I can't stop thinking about the LLM-generated compiler that passes all the unit tests but emits inner loops that benchmark over 150,000x slower than a gcc debug build. I couldn't possibly have intentionally come up with such a funny demonstration of the point of genuine expertise https://harshanu.space/en/tech/ccc-vs-gcc/

dysfun@social.treehouse.systems

@0xabad1dea wait what, i missed the 150k slower thing

dysfun@social.treehouse.systems

@0xabad1dea amazing

lesley@mastodon.gamedev.place

@0xabad1dea I have a feeling that this writing relies on LLMs way too much

0xabad1dea@infosec.exchange

@lesley sometimes I feel like the only person in tech who knows how to write three consecutive paragraphs all by herself

sodiboo@gaysex.cloud

@lesley@mastodon.gamedev.place @0xabad1dea@infosec.exchange There's a disclaimer at the bottom of the blog post stating that "The benchmark design, test execution, analysis and writing were done by a human with AI helping where needed."

nina_kali_nina@tech.lgbt

@0xabad1dea makes two of us. The CCC isn't the flex AI proponents think it is, but there aren't enough people who can understand that it should have been a cautionary tale rather than a sensational headline.

jnpn@mastodon.social

@0xabad1dea very interesting read

nina_kali_nina@tech.lgbt

@0xabad1dea like, I'll bite; great stuff, an unsupervised agent produced something that can compile some C code and that, by a certain definition, can be called "working", but it's absolutely not ready for any sort of production usage.
The agent has multiple reference implementations, an extensive test suite, and C is literally based on an extremely well-defined standard. AI proponents claim that we're in an era where all we need is to provide a specification, and the agents will just implement the thing for us. This CCC thing is proof that they quite literally can't; it's difficult to think of a commercial software project that would have a specification better defined than the C standard. And a vanilla C compiler isn't all _that_ complicated, it's literally the kind of thing many undergrad SWE students build as a student project (yes yes lots of caveats and simplifications). You'd think Anthropic could improve on their CCC with the agents until they get the compiler working at least as well as tcc would, but 1/2

thing@plasmatrap.com

@0xabad1dea@infosec.exchange claude has a fucking compiler.
what the fuck.
are we vibecompiling alongside vibecoding now
