I realise on the fediverse this is maybe asking for a flaming, but yesterday, out of sheer curiosity, I tried Claude for a simple-ish coding task that I'd been putting off (largely inspired by @hausfath's latest on #theclimatebrink).

Uncategorized · Tags: theclimatebrink, ai, coding · 56 Posts, 19 Posters
benjamingeer@piaille.fr:

@Ruth_Mottram @UlrikeHahn @hausfath But so far nobody has found evidence of productivity gains in a controlled experiment, not even Anthropic? https://www.anthropic.com/research/AI-assistance-coding-skills One experiment found that LLM coding assistants made developers less productive even though the developers believed it made them more productive. https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

arnebab@rollenspiel.social (#19):

@benjamingeer As far as I understand it, the task of @Ruth_Mottram was different from the two examples:

- not trying to learn a skill
- not something that's complex to program, just a time sink (if I understand it correctly)

And there is something in the text by @hausfath that I've also seen from others: a management role, detached from development.

Like many scientists who do their data evaluation in Excel or SAS GUIs (social sciences), and often don't understand why it works.
@UlrikeHahn

ulrikehahn@fediscience.org:

@benjamingeer @Ruth_Mottram @hausfath there are massive “productivity gains” everywhere:

- higher education is demonstrably being undermined by the fact that course work can now be (and is being) completed by AI query
- science is demonstrably buckling under a deluge of submissions
- democracy is demonstrably being harmed by AI-based astroturfing

all of these are well-documented. “Controlled experiments” are not the only form of evidence …

benjamingeer@piaille.fr (#20):

@UlrikeHahn Is slop productivity? LLMs are good at producing fake course work, fake scientific papers, fake political debates, etc., which can look plausible and often pass for the real thing if you don't look too closely. @Ruth_Mottram @hausfath


ruth_mottram@fediscience.org (#21):

@ArneBab @benjamingeer @hausfath @UlrikeHahn yes, that's exactly the kind of task I think the ML models work well on. A lot of science is actually quite boring and repetitive but needs careful monitoring. If a tool can do part of that, then why not? I think Zeke is correct that the human mind needs to come up with the creativity and the experiments, as well as the careful analysis to understand the results.


benjamingeer@piaille.fr (#22):

@ArneBab It's true that scientists use calculators even though many of them probably don't really know how calculators work. But if you bought a calculator that sometimes said 2 + 2 = 5, you'd return it and get a refund. LLMs are like that.

LLMs can certainly generate a lot of code very fast. But is it good code, or a mass of spaghetti? Will you be able to maintain it, considering that you don't know how it works? When it turns out to have bugs, will you be able to fix them?

@Ruth_Mottram @hausfath @UlrikeHahn


arnebab@rollenspiel.social (#23):

@Ruth_Mottram The main risk I see with that is that it can quickly limit creativity.

I experimented with ChatGPT for writing (but didn't make the results public, except for an experiment explicitly done to evaluate its effects -- worrying¹), and I found that it is good at providing a start, but repetitive, so that when I started with it, it limited imagination -- kind of like the effect of advertisements. So it's a bad start.

¹ https://www.draketo.de/software/ai-translation-evaluated#completely-changed
@benjamingeer @hausfath @UlrikeHahn


ulrikehahn@fediscience.org (#24):

@benjamingeer @Ruth_Mottram @hausfath I take “productivity” to refer to the efficiency of production of a good or service.

Readily available AI systems now can (and do) produce essay answers that I would have to assign a passing grade (and actually an increasingly good grade) given our marking criteria, and they can do that in seconds. It's a huge problem for higher education.

How is that not a “productivity gain”?

I find the conflation of questions about what these systems can actually do (an empirical question!) with questions of desirability deeply counter-productive.


benjamingeer@piaille.fr (#25):

@Ruth_Mottram The risk is in the "careful monitoring" part: https://mastodon.online/@pseudonym/116135917950981989 @ArneBab @hausfath @UlrikeHahn


arnebab@rollenspiel.social (#26):

@Ruth_Mottram One experiment I did was to turn a text I wrote years ago into a scientific paper in economics.

It took two hours and reached a quality at which I (a physicist, not from economics) could not have distinguished it from a real paper.

AI makes the form easier to reproduce, so we can no longer trust the form of scientific writing to be a hint that people actually have a scientific education.

And that is a huge risk.
@benjamingeer @hausfath @UlrikeHahn


benjamingeer@piaille.fr (#27):

@UlrikeHahn The real productivity that you're asking your students for is their own thinking and learning, right? LLMs aren't producing that; they're producing fake evidence for it, by parroting sentences that were written by people who had actually done the thinking and learning. The problem for higher education is now to figure out how to measure thinking and learning in other ways. @Ruth_Mottram @hausfath


ulrikehahn@fediscience.org (#28):

@benjamingeer @Ruth_Mottram @hausfath sometimes replies here leave me speechless…


arnebab@rollenspiel.social (#29):

@benjamingeer scientific code is usually a mass of spaghetti.

I once made a colleague's data cleanup program at least 100x faster just by processing the data in one go, instead of reopening the file and seeking to the last position for each single line.

You need to know where you come from to check whether something brings benefits.

That said: if that had been a 10k-line AI code monster, I couldn't have fixed it in the 30 minutes I had.

@Ruth_Mottram @hausfath @UlrikeHahn
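The two access patterns described above are easy to sketch. A minimal, hypothetical Python illustration (not the colleague's actual program; the cleanup step is a stand-in) contrasting per-line reopen-and-seek with a single streaming pass:

```python
def cleanup_reopen_per_line(path):
    """Slow pattern: reopen the file and seek back to the saved
    position for every single line (one open + seek + read each)."""
    results, offset = [], 0
    while True:
        with open(path, "rb") as f:
            f.seek(offset)
            line = f.readline()
        if not line:
            break
        offset += len(line)
        results.append(line.strip().lower())  # stand-in for the real cleanup
    return results


def cleanup_single_pass(path):
    """Fast pattern: process the data in one go -- a single open and
    one buffered sequential read through the file."""
    with open(path, "rb") as f:
        return [line.strip().lower() for line in f]
```

Both functions produce the same cleaned output; only the I/O pattern differs, which is where a speedup of the reported magnitude would come from (per-line open/seek overhead versus one buffered sequential read).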


arnebab@rollenspiel.social (#30):

@benjamingeer But, just to make it clear: that code, which was 100x slower than it could have been, was still correct.

It was slow, but it did very complex tasks correctly.
@Ruth_Mottram @hausfath @UlrikeHahn


arnebab@rollenspiel.social (#31):

@Ruth_Mottram though my main gripe with us as a human society is that we're spending more than 400 billion dollars a year to build error-prone general pattern recognition and reproduction, while finding maybe 100 problems where it brings big benefits -- each of which would require less than 10 million dollars to solve.

Why don't we have solutions for those tasks already?

Why is matplotlib mostly written by some folks in their spare time while it has tons of value?
@benjamingeer @hausfath @UlrikeHahn


benjamingeer@piaille.fr (#32):

@UlrikeHahn What is the "good" that you want your students to produce? The thing that has real value? Is it essays or learning? Perhaps students are using LLMs to write essays because they mistakenly believe that the essay is an end in itself, rather than a means to an end. As somebody said, sometimes it makes sense to have someone cook your meal for you, but it never makes sense to have someone eat your meal for you. @Ruth_Mottram @hausfath

ruth_mottram@fediscience.org:

I realise on the fediverse this is maybe asking for a flaming, but yesterday, out of sheer curiosity, I tried Claude for a simple-ish coding task that I'd been putting off (largely inspired by @hausfath's latest on #theclimatebrink). The performance of Claude was seriously impressive. I am convinced the AI cycle is more than hype (and have been for a while); the chatbots have been a huge attention hog, misleadingly so, while the serious work has been done elsewhere. (We are developing ML tools to supplement parts of our climate model workflows.)

Now I'm wondering if there is any serious EU competition to Anthropic -- Mistral's Codestral perhaps? Because this kind of performance changes everything and we can't afford to lag behind...
#AIcoding #ML

Edit: here is the Climate Brink post I mentioned:

The AI-Augmented Scientist -- The promise and pitfalls of using AI tools to boost my capabilities as a scientist (www.theclimatebrink.com)

karolina@fediscience.org (#33):

Do people actually read the code Claude runs and how it differs from what Claude gives as an output?


ulrikehahn@fediscience.org (#34):

@benjamingeer @Ruth_Mottram @hausfath Benjamin, maybe just reread the previous post of yours and ask yourself “what in this post am I saying that could possibly be new to the person I am addressing?” … and then see where that leads you


benjamingeer@piaille.fr (#35):

@UlrikeHahn It would surprise me if anything I said was new to you. What surprised me was that you described the production of counterfeit goods as productivity. @Ruth_Mottram @hausfath


ulrikehahn@fediscience.org (#36):

@benjamingeer @Ruth_Mottram @hausfath maybe that should be a clue that you are somehow missing the intended point?


benjamingeer@piaille.fr (#37):

@UlrikeHahn The original question was whether LLM coding assistants would make scientists more productive. It sounded like you were arguing that they would, since LLMs are not just hype, as evidenced by their efficiency in producing fake course work, etc. Were you being ironic? @Ruth_Mottram @hausfath


arnebab@rollenspiel.social (#38):

@Ruth_Mottram when you use AI to transform your content from one form to another, content usually associated with the target form creeps into your own.

This can be as bad as turning "agriculture that needs less antibiotics, because animals stay healthier" into "agriculture without antibiotics" (so sick animals suffer needlessly).

Because AI does not differentiate between content and form.
@benjamingeer @hausfath @UlrikeHahn
