Welp, for the first semester ever, SOTA LLMs can do *every single assignment, from scratch (readmes, etc.), and get 100%*.

52 Posts 17 Posters 86 Views
  • krismicinski@types.pl

    @shriramk @jfdm @csgordon @lindsey @jeremysiek I think once you trust that the student could in principle write the code (and they're treating it like code the prof gave them, code their coworker wrote, etc.) then what you're saying is right. The concern is: "go through whole college career and just have claude code do every single homework assignment with very little intellectual effort." Of course, many would argue that this is a failure of the curriculum design--but it will inevitably take time to catch up.

    shriramk@mastodon.social
    wrote last edited by
    #36

    @krismicinski @jfdm @csgordon @lindsey @jeremysiek
    Yes, that would be a failure of curriculum design. And I'm saying we should be redesigning the curriculum to avoid that failure. (Not the only one, of course.)

    It's 1960. People have invented compilers and the first real languages. Will you emphasize "but people must know how computers REALLY work", or will you say "shit just got real, let's elevate our learning objectives and figure out anew how to teach?"

    • georgweissenbacher@fediscience.org

      @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek we have been claiming for decades that we are not just educating coding monkeys, so it shouldn't really matter that LLMs can now do all the coding. As far as I see it, it's still necessary to identify and clearly formulate verifiable requirements and specifications, come up with a modular design, and verify the whole thing, because I still believe the ultimate responsibility lies with the developer. So students still need to understand the fundamentals. But yes, it has become much harder to check *at scale* whether they actually grasped them.

      shriramk@mastodon.social
      wrote last edited by
      #37

      @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek
      Yes to most of that. I think it's not that hard to assess, if that is what people were always assessing.

      I actually disagree w/ your opening comment. Most intro CS educators will say (and have said), "I don't teach programming, I teach *problem solving*" (whatever the fuck that is). My response is, "great, this should be your liberation! Programming got easy, what are your «problem solving» ideas?"

      • lindsey@recurse.social

        @cross @krismicinski @shriramk @jfdm @csgordon @jeremysiek It seems like folks sooner or later notice that this whole "the agent only trades in text" thing is Not Great and proceed to reinvent programming languages on top of it. So, you know, when that happens, we PL educators are here to try to help them not accidentally implement dynamic scope or whatever.

        shriramk@mastodon.social
        wrote last edited by
        #38

        @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek
        It is absolutely an open research question as to what will be the new "source" and "intermediate" languages. I think we'll have a much better shot at the latter (richly-typed, semantic specifications as part of code, etc.); for the former, I think we'll build good ones but the trick will be getting people to use them.

        • krismicinski@types.pl

          @cross @shriramk @jfdm @csgordon @lindsey @jeremysiek yes, how stupid it feels to be literally using an extreme number of neurons to infer something that *has a semantics*. I keep telling this to people and the response I get is something like: well, that’s tied up in a ton of other maybe-useful stuff it inferred.

          shriramk@mastodon.social
          wrote last edited by
          #39

          @krismicinski @cross @jfdm @csgordon @lindsey @jeremysiek
          If people were taking advantage of that wonderful semantics business all along, @regehr would be out of business (indeed, would never be in business). Semantics has always been a bit of a fig leaf for the PL community.

          • cross@discuss.systems

            @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek goodness, I hope that prompting an LLM will not be a *huge* part of software engineering going forward. It's an incredibly inefficient way to go about the task. Frankly, I'm amazed at just how shoddy the current set of tools is.

            steve@discuss.systems
            wrote last edited by
            #40

            @cross @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek It’s amazingly depressing to me, not because I worry about AI tools making the work obsolete, but because the remarkable effectiveness of AI tools in the face of their shoddiness drives home to me that the vast majority of programmers are redoing a thing that’s already been done most of the time. What a waste of human capital.

              tonyg@pubsub.leastfixedpoint.com
              wrote last edited by
              #41

              @shriramk @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek ... did programming get easy? Can one be said to be programming if one asks someone else (or an LLM) to write a program for you? Or is some other kind of (not- or not-quite-programming) interaction going on?

                gwozniak@discuss.systems
                wrote last edited by
                #42

                @steve @cross @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek As a professional programmer now almost 20 years out from defending a thesis that was about code generation, I can't upvote this enough.

                LLMs are an incredibly inefficient way to do code reuse.

                  cross@discuss.systems
                  wrote last edited by
                  #43

                  @gwozniak @steve @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek yeah, I was going to comment that you had said something similar a few days or a week ago.

                  It's wild to me that we keep writing the same program over and over again and calling it "progress."

                    jschuster@hachyderm.io
                    wrote last edited by
                    #44

                    @shriramk @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek Do you think that the "no one will look at the generated code anymore" future is inevitable? Given how often the industry has tried to generate programs directly from English-like specs before and failed, I'm quite skeptical, even if we have notably different tech this time around.

                    Internally at Google many folks (including high-level ones) are making this claim without evidence, as if it's obvious on its face, and I'm surprised how few people push back on it or at least ask for more proof.

                      jschuster@hachyderm.io
                      wrote last edited by
                      #45

                      @shriramk @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek LLM-based code generation reminds me of some of Bret Victor's talks: there are some cool ideas and convincing demos, but also a lot more work to do before one can say "we've solved all of the problems; everyone should be doing this all the time now".

                        jschuster@hachyderm.io
                        wrote last edited by
                        #46

                        @shriramk @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek To be fair: The LLM tooling is certainly more capable overall than the Bret Victor stuff. But I'm not yet convinced coding is 100% solved.

                          shriramk@mastodon.social
                          wrote last edited by
                          #47

                          @tonyg @GeorgWeissenbacher @krismicinski @jfdm @csgordon @lindsey @jeremysiek
                          I very much think of what I'm doing with Claude Code as a kind of programming — indeed, the kind of programming I always wished I could do! But if it makes you happier to use a different term for it (not "vibecoding", that has too many specific connotations and is definitely not how *I'm* doing things), and it's *useful* to have that other term…that's fine by me. I guess my slogan is: "Philosophy…but not too much".

                            shriramk@mastodon.social
                            wrote last edited by
                            #48

                            @tonyg Unrelatedly, I was in the UK last month, and when I gave a talk in London (and later in Cambridge), was pleasantly surprised that Noel Walsh stopped in. I was reminiscing about how I'd attended a Racket meetup he organized in Islington back in 2003 or so, where I believe we met!

                              shriramk@mastodon.social
                              wrote last edited by
                              #49

                              @cross @gwozniak @steve @krismicinski @jfdm @csgordon @lindsey @jeremysiek
                              We were already doing this. The entire low-code/no-code movement was all about making sure we stopped doing that in some domains, though we built 20+ systems that each tried to do that. (-:

                                shriramk@mastodon.social
                                wrote last edited by
                                #50

                                @jschuster @lindsey @cross @krismicinski @jfdm @csgordon @jeremysiek
                                Of course I don't think we'll never need to ever look at generated code again; that would be a foolish position. The interesting question is how much will people need to, and relative to what? If you have an amazing test suite or rich verified properties, for instance, how much do you need to review code? Most people aren't writing Dan Cross-level code. (The Bret Victor analogy is good.)

                                • shriramk@mastodon.social

                                  @jfdm @csgordon @lindsey @jeremysiek @krismicinski
                                  "we are doomed" is an incredibly disappointing take. You should have come to my "GenAI and CS Ed" talk (-:.

                                  If our only value-add was "my course was gated behind a needlessly difficult thing", that doesn't say much for the value of our courses.

                                  ltratt@mastodon.social
                                  wrote last edited by
                                  #51

                                  @shriramk @jfdm @csgordon @lindsey @jeremysiek @krismicinski A challenge -- which I think we're seeing from many people in this thread! -- is that in recent years we (speaking broadly) have magnified Certification as an outcome relative to Education _and_ conflated the two in our and, often, students' minds. New technology might have undermined how we do Certification right now; perhaps that will also encourage us to change how we do Education?

                                    steve@discuss.systems
                                    wrote last edited by
                                    #52

                                    @cross @krismicinski @shriramk @jfdm @csgordon @lindsey @jeremysiek (with AI tools, they are still redoing a thing that they shouldn’t need to, but faster and sloppier.)
