Welp, for the first semester ever, SOTA LLMs can do *every single assignment, from scratch (readmes, etc.), and get 100%*.

52 Posts 17 Posters 86 Views
cross@discuss.systems

    @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek really?! That's depressing.

    I find these things _maddening_ to use. It feels like trying to neatly typeset your ideas by dragging wet toilet paper dipped in ink across a piece of sandpaper.

    Are they capable of some impressive things? Yes. Do I think they're a good tool as an augment for a sophisticated user to go faster? Honestly, not really. The NLP aspect is neat; the multiple round-trips through English to <whatever it does internally> back to English are excruciatingly slow, expensive, and inefficient. It's not a good use of my, or frankly the machine's, time, let alone electrical power or water.

    Case in point: some colleagues the other day were saying something like, "I just can't get it to use `jq` instead of writing little Python scripts to process JSON....Here's what I put in my CLAUDE.md file: <some sentence along the lines of, 'prefer jq for working with json'>." I couldn't help but feel like this is exactly the sort of thing where you want the concise precision of a small DSL for assigning weights to tools (and providing templates for those tools' use) to drive how the agent uses them. But you can't do that, because the agent only trades in text.
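
[Editor's note: for illustration only — no such mechanism exists in today's agents — the small declarative tool-preference DSL gestured at above might look like this. Names, weights, and syntax are entirely hypothetical.]

```toml
# Hypothetical tool-routing config: weights bias the agent's tool
# selection; templates constrain how each tool is invoked.
[tool.jq]
weight = 0.9
applies_to = ["json"]
template = "jq -r '{filter}' {file}"

[tool.python-script]
weight = 0.2          # discourage ad-hoc scripts for JSON munging
applies_to = ["json"]
```

The contrast is the point: a machine-checkable preference with a defined semantics, versus a sentence in CLAUDE.md that the model may or may not honor.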

    Like I said, there's clearly a "there" there. But setting aside the moral and ethical issues for a moment, that doesn't mean that the present model of interaction is _good_, let alone that it can't be substantially _better_.

lindsey@recurse.social (#29)

    @cross @krismicinski @shriramk @jfdm @csgordon @jeremysiek It seems like folks sooner or later notice that this whole "the agent only trades in text" thing is Not Great and proceed to reinvent programming languages on top of it. So, you know, when that happens, we PL educators are here to try to help them not accidentally implement dynamic scope or whatever.

lindsey@recurse.social (#30)

      @krismicinski @cross @shriramk @jfdm @csgordon @jeremysiek Kris, I feel like any time I say anything to you on here, you say, "I agree with you." Are you actually an LLM?

cross@discuss.systems (#31)

        @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek it already _is_ a major thing for practitioners. But the whole interaction model is just painful. That's where it could improve 10x (and probably reduce the cost significantly).

        What I, as a practitioner, actually want is something that augments my abilities to do good work faster. What I've gotten is basically a rubber duck pair programming partner that can type really, really fast. Is that useful? I guess so. Is that what I actually want? No, not really.

        Here's an example of what I mean: I have long been deeply skeptical of these tools, but realized I was being knee-jerk in my criticisms and reaction to them, which isn't a good basis for critique. So I decided to do some experiments.

        I took the "LinBOQ" package, which is sort of a swiss-army-knife program for amateur "packet" radio; it does it all: AX.25 and NET/ROM protocol handling, driving various RF modems over many different transports, human-generated message and "bulletin" handling and forwarding, a "chat" mechanism, a "BBS", TCP/IP over AX.25 over RF; all that good stuff. `wc -l` says it's about 300 KLOC of C, while slightly more sophisticated line counters clock it at ~180 KLOC. There is no automated testing of any kind. It is open source, and has been in more or less continuous development by one primary developer since the 1990s. In this regard, it's probably representative of many such large, mature projects that have evolved more or less continually over many years. The general application domain is something I have a (mild) personal interest in, but it's not a critical use case; amateur radio is just a hobby.

        Anyway, I thought it would be an interesting experiment to point an LLM at it and see if I could make substantive improvements. So I threw Claude at it and said, "convert this to Rust."

        Sure enough, 24 hours later, it had produced a bunch of Rust code. That code even compiled. Impressive: even if I knew exactly how to do that transformation myself, I could never type that fast.

        But it didn't work: the LLM kept getting really confused; I would watch it argue with itself, I'd tell it to generate a test, but it'd say, "nah, that's too hard; let me just extract the algorithm and test that; yup, that works: tests pass!" It kept making the same mistakes over and over again; kept repeating the same anti-patterns over and over again; kept arguing with itself about my intent ... over and over again. It would present the same things over and over again, asking me to tweak small parts each time, and then hit the limit of its context window, "compact" it, and lose all of that state, declare it was done with the current task, and move on; then realize nothing worked and revert the earlier partial work. It would write code, but never hook it up to anything, so it wasn't called...then go and delete it because it was "dead code." I would make an offhand comment to it about something it was doing and it would veer off into some tangent, re-enter plan mode, and forget all about what it was doing and never go back to the original plan. It'll continually run `grep` over and over again, instead of, say, working with an LSP to cache an AST derived from the actual program text.

        Yeah, I've gone down the road of ever-more elaborate CLAUDE.md files (that I then have to simplify in various ways so that the machine can understand them), setting up MCPs so it can try to lean on a tree-sitter grammar instead of running `grep` all the time; telling it to use `rg` and generate JSON (surely it understands structured data formats?! Nope; it just wants text. It leaves notes _for itself_ ... in text) instead of whatever it does. Sometimes that works, sometimes it doesn't; it's mysterious. But after the $n$th time of sitting and prompting with, "tell me how to prevent compaction amnesia..." I find myself despairing for a better way. I think I've finally configured it to save plan steps, but hell if I know exactly what I tweaked to do it, or if introducing a comma in a markdown file sometime down the road will break it.
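
[Editor's note: to make the `rg`-and-JSON point concrete — `rg --json` streams newline-delimited JSON events that a harness could consume structurally instead of re-grepping text. A minimal sketch; the sample events are abridged and the file names invented.]

```python
import json

def match_locations(jsonl: str) -> list[tuple[str, int]]:
    """Pull (path, line_number) pairs out of `rg --json` event lines."""
    hits = []
    for line in jsonl.splitlines():
        event = json.loads(line)
        if event.get("type") == "match":
            data = event["data"]
            hits.append((data["path"]["text"], data["line_number"]))
    return hits

# Abridged sample of the newline-delimited event stream `rg --json` emits:
sample = "\n".join([
    '{"type":"begin","data":{"path":{"text":"src/ax25.rs"}}}',
    '{"type":"match","data":{"path":{"text":"src/ax25.rs"},'
    '"line_number":42,"lines":{"text":"fn parse_frame()"}}}',
    '{"type":"end","data":{"path":{"text":"src/ax25.rs"}}}',
])

print(match_locations(sample))  # [('src/ax25.rs', 42)]
```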

        So yes. It _is_ impressive, but as I keep going through the exercise (and yes, the code _is_ improving), I cannot help but ask, "is this _really_ how I want to do this?" And time and again, the answer is no: I'm not interested in ever more elaborate mechanisms to help the tool overcome its own limitations; I have no desire to have deep, meaningful conversations with a machine about how I can more usefully speak to it to help it do the things I tell it better. Is this _really_ how I have to spend the rest of my career? If so, I'm quitting and moving to a farm. And I can't help thinking that if I didn't have as much experience as I do, the program would be getting far worse over time; as it is, I have to put a _lot_ of energy into guiding the LLM to do what I want.

        What I _want_ is something far more targeted that augments my ability to work with large bodies of software: scenarios like I'm looking at $this specific code, in an editor, and I want the machine to cross-reference the protocol it's implementing against a known spec; I think there's a deviation in this specific area, pop up a simulation that exercises it while varying the following parameters...; or there's a specific code pattern that I want it to identify throughout the code base, analyze the structure of, and apply some transformation to; here's a thought about this, cross-reference it with existing test patterns; here's the intended behavior, generate an end-to-end test for it and identify areas where that's difficult because of the structure of the code. That kind of thing would be useful. And the LLMs _can_ do a simulacrum of it, but wow is it painful to get them to do it. It takes so much of my time and energy I wonder if it's worth it, let alone actual energy, water, etc. And I can't help but have this seriously sinking feeling that once the big players start passing on the cost of these machines to the rest of us, only a few players will actually have the capital resources to play. Those that do will have a serious advantage over those that don't, and everyone else will have to pay to play or lose out. That's not good.

        The kicker is, much of that already exists; modern text editors that integrate with LSPs can already rename symbols, or rewrite segments of code, or "extract method" or whatever and have been able to for years. What's new is the natural speech part, but you can't get away from it.
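
[Editor's note: for concreteness, this is roughly the structured message an LSP-aware editor already sends for "rename symbol" — no natural language involved. The URI, position, and new name are illustrative.]

```python
import json

# JSON-RPC request for LSP "textDocument/rename": the editor supplies a
# cursor position and new name; the server replies with a WorkspaceEdit.
rename_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "textDocument/rename",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.rs"},
        "position": {"line": 120, "character": 8},
        "newName": "route_table",
    },
}

wire = json.dumps(rename_request)  # what actually goes over the pipe
```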

        Anyway, it's still chewing up tokens generating sans-IO state machines for these ancient ham radio protocols; it cost $275 today.

cross@discuss.systems (#32)

          @lindsey @krismicinski @shriramk @jfdm @csgordon @jeremysiek I will sit on the top of the roof of this house as the water rises waiting for you all to come along in the lifeboat and rescue me. Please hurry!

krismicinski@types.pl (#33)

            @cross @shriramk @jfdm @csgordon @lindsey @jeremysiek yes, how stupid it feels to be literally using an extreme number of neurons to infer something that *has a semantics*. I keep telling this to people and the response I get is something like: well, that’s tied up in a ton of other maybe-useful stuff it inferred.

cross@discuss.systems (#34)

              @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek "but it can give you a great hollandaise recipe!"

krismicinski@types.pl

                @shriramk @jfdm @csgordon @lindsey @jeremysiek okay, wow--I did not really expect that. Interesting, I will have to think about that.

shriramk@mastodon.social (#35)

                @krismicinski @jfdm @csgordon @lindsey @jeremysiek
                Literally what we're trying to learn in our course is *how* to do that.
                https://cs.brown.edu/courses/csci1970kf/agentic-spr-2026/

krismicinski@types.pl

                  @shriramk @jfdm @csgordon @lindsey @jeremysiek I think once you trust that the student could in principle write the code (and they're treating it like code the prof gave them, code their coworker wrote, etc.) then what you're saying is right. The concern is: "go through whole college career and just have claude code do every single homework assignment with very little intellectual effort." Of course, many would argue that this is a failure of the curriculum design--but it will inevitably take time to catch up.

shriramk@mastodon.social (#36)

                  @krismicinski @jfdm @csgordon @lindsey @jeremysiek
                  Yes, that would be a failure of curriculum design. And I'm saying we should be redesigning the curriculum to avoid that failure. (Not the only one, of course.)

                  It's 1960. People have invented compilers and the first real languages. Will you emphasize "but people must know how computers REALLY work", or will you say "shit just got real, let's elevate our learning objectives and figure out anew how to teach?"

georgweissenbacher@fediscience.org

                    @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek we have been claiming for decades that we are not just educating coding monkeys, so it shouldn't really matter that LLMs can now do all the coding. As far as I see it, it's still necessary to identify and clearly formulate verifiable requirements and specifications, come up with a modular design, and verify the whole thing, because I still believe the ultimate responsibility lies with the developer. So students still need to understand the fundamentals. But yes, it has become much harder to check *at scale* whether they actually grasped them.

shriramk@mastodon.social (#37)

                    @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek
                    Yes to most of that. I think it's not that hard to assess, if that is what people were always assessing.

                    I actually disagree w/ your opening comment. Most intro CS educators will say (and have said), "I don't teach programming, I teach *problem solving*" (whatever the fuck that is). My response is, "great, this should be your liberation! Programming got easy, what are your «problem solving» ideas?"

shriramk@mastodon.social (#38)

                      @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek
                      It is absolutely an open research question as to what will be the new "source" and "intermediate" languages. I think we'll have a much better shot at the latter (richly-typed, semantic specifications as part of code, etc.); for the former, I think we'll build good ones but the trick will be getting people to use them.

shriramk@mastodon.social (#39)

                        @krismicinski @cross @jfdm @csgordon @lindsey @jeremysiek
                        If people were taking advantage of that wonderful semantics business all along, @regehr would be out of business (indeed, would never be in business). Semantics has always been a bit of a fig leaf for the PL community.

cross@discuss.systems

                          @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek goodness, I hope that prompting an LLM will not be a *huge* part of software engineering going forward. It's an incredibly inefficient way to go about the task. Frankly, I'm amazed at just how shoddy the current set of tools are.

steve@discuss.systems (#40)

                          @cross @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek It’s amazingly depressing to me, not because I worry about AI tools making the work obsolete, but because the remarkable effectiveness of AI tools in the face of their shoddiness drives home to me that the vast majority of programmers are redoing a thing that’s already been done most of the time. What a waste of human capital.

tonyg@pubsub.leastfixedpoint.com (#41)

                            @shriramk @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek ... did programming get easy? Can one be said to be programming if one asks someone else (or an LLM) to write a program for you? Or is some other kind of (not- or not-quite-programming) interaction going on?

gwozniak@discuss.systems (#42)

                              @steve @cross @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek As a professional programmer now almost 20 years out from defending a thesis that was about code generation, I can't upvote this enough.

                              LLMs are an incredibly inefficient way to do code reuse.

cross@discuss.systems (#43)

                                @gwozniak @steve @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek yeah, I was going to comment that you had said something similar a few days or a week ago.

                                It's wild to me that we keep writing the same program over and over again and calling it "progress."

jschuster@hachyderm.io (#44)

                                  @shriramk @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek Do you think that the "no one will look at the generated code anymore" future is inevitable? Given how often the industry has tried to generate programs directly from English-like specs before and failed, I'm quite skeptical, even if we have notably different tech this time around.

Internally at Google, many folks (including high-level ones) are making this claim without evidence, as if it's obvious on its face, and I'm surprised how few people push back on it or at least ask for more proof.

jschuster@hachyderm.io
#45

                                    @shriramk @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek LLM-based code generation reminds me of some of Bret Victor's talks: there are some cool ideas and convincing demos, but also a lot more work to do before one can say "we've solved all of the problems; everyone should be doing this all the time now".

jschuster@hachyderm.io
#46

                                      @shriramk @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek To be fair: The LLM tooling is certainly more capable overall than the Bret Victor stuff. But I'm not yet convinced coding is 100% solved.

• tonyg@pubsub.leastfixedpoint.com

                                        @shriramk @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek ... did programming get easy? Can one be said to be programming if one asks someone else (or an LLM) to write a program for you? Or is some other kind of (not- or not-quite-programming) interaction going on?

shriramk@mastodon.social
#47

                                        @tonyg @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek
                                        I very much think of what I'm doing with Claude Code as a kind of programming — indeed, the kind of programming I always wished I could do! But if it makes you happier to use a different term for it (not "vibecoding", that has too many specific connotations and is definitely not how *I'm* doing things), and it's *useful* to have that other term…that's fine by me. I guess my slogan is: "Philosophy…but not too much".

shriramk@mastodon.social
#48

                                          @tonyg Unrelatedly, I was in the UK last month, and when I gave a talk in London (and later in Cambridge), was pleasantly surprised that Noel Walsh stopped in. I was reminiscing about how I'd attended a Racket meetup he organized in Islington back in 2003 or so, where I believe we met!
