
I realise on the fediverse this is maybe asking for a flaming, but yesterday out of sheer curiosity I tried Claude for a simpleish coding task that I'd been putting off (largely inspired by @hausfath 's latest on #theclimatebrink).

Uncategorized
Tags: theclimatebrink, aicoding
56 Posts, 19 Posters
  • arnebab@rollenspiel.social wrote:

    @benjamingeer But, just to make it clear: that code which was 100x slower than it could have been, was still correct.

    It was slow, but it did very complex tasks correctly.
    @Ruth_Mottram @hausfath @UlrikeHahn

    arnebab@rollenspiel.social wrote (#44):

    @benjamingeer Therefore I’d rather compare LLMs to using statistical methods without understanding them.

    That’s already widespread and I expect that with LLMs it will get worse.
    @Ruth_Mottram @hausfath @UlrikeHahn

    • ruth_mottram@fediscience.org wrote:

      I realise on the fediverse this is maybe asking for a flaming, but yesterday out of sheer curiosity I tried Claude for a simple-ish coding task that I'd been putting off (largely inspired by @hausfath's latest on #theclimatebrink). The performance of Claude was seriously impressive. I am convinced the AI cycle is more than hype (and have been for a while); the chatbots have been a huge attention hogger, misleadingly so, while the serious work has been done elsewhere. (We are developing ML tools to supplement parts of our climate model workflows.)

      Now I'm wondering if there is any serious EU competition to Anthropic. Mistral's Codestral, perhaps?
      Because this kind of performance changes everything and we can't afford to lag behind...
      #AIcoding #ML

      Edit: here is the climate brink post I mentioned

      Link: The AI-Augmented Scientist: The promise and pitfalls of using AI tools to boost my capabilities as a scientist (www.theclimatebrink.com)

      1337@techhub.social wrote (#45):

      @Ruth_Mottram @hausfath This seems like a *really* bad idea. I'm a software engineer and not a scientist, but I believe I've heard there's already a fairly big problem in the sciences with software bugs producing misleading results, and I imagine using AI to write code could make this much worse. IMO, the extra time that would've been spent coding everything would not have been wasted. Coding it yourself gives you more time to think about what you're typing and to gain a more complete understanding of your code and the libraries you're using, giving you more time and insight to spot bugs or otherwise wrong or less-than-optimal ways of doing things. If one did a thorough review of the AI-generated code to ensure it was correct, I'd guess it would take at least the same amount of time. Furthermore, seeing the AI-generated code first would create "anchoring bias," possibly still resulting in code with more bugs.

  • padjo@mastodon.ie wrote:

        @Ruth_Mottram @hausfath I had the same experience yesterday. I built a workout tracking app I've been thinking about building for a year. It took about 5 hours of fairly low effort prompting to go from concept to deployed.

        Previously this would have been at least a week of full-time, high-intensity work, and as a result I would probably never have had the time to do it. These models have fundamentally changed the economics of building software; it's just undeniable at this stage.

        arnebab@rollenspiel.social wrote (#46):

        @Padjo the core question is: for which tasks does this work reliably?

        Did you review the code to ensure that it doesn’t have unintended side-effects?

        (that’s the difference between having an auto-complete that works on abstract concepts and negligently releasing potentially dangerous products to the public)

        ⇒ the fast part is only for the prototyping stage.
        @Ruth_Mottram @hausfath

        • (quoting 1337@techhub.social's post #45 above)

          arnebab@rollenspiel.social wrote (#47):

          @1337 "anchoring bias" is the formulation I was searching for.

          Thank you!

          That anchoring bias is why Larian finally decided not to let their concept artists use AI generated props for inspiration.
          @Ruth_Mottram @hausfath

          • (quoting Ruth_Mottram's original post, quoted in full above)

            yvandasilva@hachyderm.io wrote (#48):

            @Ruth_Mottram @hausfath It's okay for one-shot little scripts, which most data science is.

            For long-term projects that grow to thousands or millions of lines and need to be maintained, it's not OK. It adds too much tech debt too quickly.

            Writing code was never the problem, tbh. Again, for scripts and small few-pagers, it's as good as any template generator or dummy drag-and-drop tool.

            • pettter@social.accum.se wrote:

              @Ruth_Mottram I have to admit I find his reasoning about energy use misleading at best - he has Claude running for at least around 10 minutes, and is implying that this is comparable in scope to a single ChatGPT query, which is listed as taking 0.3Wh, which is, uh, not comparable. @hausfath

              yvandasilva@hachyderm.io wrote (#49):

              @pettter @Ruth_Mottram @hausfath
              This is correct. The use of agents (which is what allows sensible scripts that do what they're supposed to do, rather than eyeballing it) will generate hundreds if not thousands of queries for a very simple input.
              Since there will generally be more than one agent, it's not unexpected to produce multiple thousands of queries to an LLM; its own "thinking mode" and tool triggering will also trigger more queries.
              And that's not even going into "multi-agent"/"swarm of agents" territory.
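The query-multiplication point above can be made concrete with a back-of-envelope sketch. All figures here are illustrative assumptions, not measurements; the only number taken from this thread is the ~0.3 Wh per-query estimate mentioned upthread.

```python
# Back-of-envelope sketch of why an agentic session is not comparable
# to a single chat query. All figures are illustrative assumptions.

ENERGY_PER_QUERY_WH = 0.3  # per-query figure cited upthread (assumed)

def session_energy_wh(agents: int, queries_per_agent: int) -> float:
    """Rough energy for one agentic coding session, assuming each internal
    model call (planning, tool use, "thinking") costs about one chat query."""
    return agents * queries_per_agent * ENERGY_PER_QUERY_WH

# Hypothetical session: 3 agents, each firing ~500 internal model calls.
session = session_energy_wh(agents=3, queries_per_agent=500)
print(f"single query:  {ENERGY_PER_QUERY_WH} Wh")
print(f"agent session: {session} Wh (~{session / ENERGY_PER_QUERY_WH:.0f}x)")
```

Under these (hypothetical) numbers the session comes out three orders of magnitude above a single query, which is the scale gap the comparison above is pointing at.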

              • arnebab@rollenspiel.social wrote:

                @benjamingeer As far as I understand it, the task of @Ruth_Mottram was different from the two examples:

                - not trying to learn a skill
                - not something that’s complex to program, just a time sink (if I understand it correctly)

                And there is something in the text by @hausfath that I’ve also seen from others: a management role, detached from development.

                Like many scientists who do their data evaluation in Excel or SAS GUIs (social sciences) and often don't understand why it works.
                @UlrikeHahn

                slotos@toot.community wrote (#50):

                @ArneBab You skipped the most important point:

                - not intending for the result to be maintained

                For a one-off result these models seem impressive. Hell, outside of the "solve wages" bubble, the AI field consistently produces useful tools.

                But holy shit, can people who have never had to maintain a system after a 10x fuckface has fled the scene shut the fuck up about AI and coding? Code is the easy part, where engineers get to finish the productivity reward loop. Go automate your vacations instead!

                • (quoting Ruth_Mottram's original post, quoted in full above)

                  arnebab@rollenspiel.social wrote (#51):

                  @Ruth_Mottram I pondered for the past hour why this annoys me so much (because it does, even though I do see the individual arguments).

                  We spent more than a decade enabling scientists to cut loose from the Matlab and Office subscriptions that made scientific work dependent on regular payments, so that they could do their work with matplotlib instead, and now many jump right back into a subscription service that uses matplotlib to make them dependent.

                  That adds insult to injury.
                  @hausfath

                  • (quoting arnebab@rollenspiel.social's post #51 above)

                    garonenur@rollenspiel.social wrote (#52):

                    @ArneBab @Ruth_Mottram @hausfath yes this!

                    The models will never be open or free like open-source software, and AI will even be a huge factor in accelerating climate change!
                    But the dependence and the subscription-service part should really be the deal-breaker here.
                    Also: trust in AI should be very low if it is provided by billionaire-owned companies.

                    • ulrikehahn@fediscience.org wrote:

                      @benjamingeer @Ruth_Mottram @hausfath there are massive “productivity gains” everywhere:

                      - higher education is demonstrably being undermined by the fact that course work can now be (and is being) completed by AI query
                      - science is demonstrably buckling under a deluge of submissions
                      - democracy is demonstrably being harmed by AI based astroturfing

                      all of these are well-documented. “Controlled experiments” are not the only form of evidence ….

                      garonenur@rollenspiel.social wrote (#53):

                      @UlrikeHahn @benjamingeer @Ruth_Mottram @hausfath
                      This is such a depressing, but true, point that I had not considered.
                      Now I wonder whether the heads of AI did, and even like this kind of "productivity".

                      • (quoting arnebab@rollenspiel.social's post #46 above)

                        padjo@mastodon.ie wrote (#54):

                        @ArneBab @Ruth_Mottram @hausfath Yes, I reviewed the code. I worked with it to define the architecture and choose technologies, and they are technologies I'm familiar with. The code is as good as or better than I would write; it was far more thorough with edge cases and handled error states better than I would have. I'm using it to build a new project; maybe it will reach a point where it is no longer helpful, but I haven't seen any evidence of that. Software is just dramatically cheaper to produce now.

                        • (quoting padjo@mastodon.ie's post #54 above)

                          arnebab@rollenspiel.social wrote (#55):

                          @Padjo that’s interesting.

                          Thanks for the info.
                          @Ruth_Mottram @hausfath

                          • (quoting Ruth_Mottram's original post, quoted in full above)

                            hopeless@mas.to wrote (#56):

                            @Ruth_Mottram @hausfath

                            Yes, I think if you try the current SOTA stuff (like Google's Antigravity) on your own choice of code, your own tasks, editing checkouts on your own machine, it's hard not to be impressed.

                            One thing to keep in mind is current LLMs are prone to sins of omission. I would strongly suggest never one-shotting anything and committing it.

                            If you simply ask it to audit what it just did from a security and completeness perspective, and fix what it found, you can get a big step up.
