hm https://github.com/bluesky-social/social-app/blob/main/CLAUDE.md

airtower@woem.men

@benjamineskola@hachyderm.io @erincandescent@akko.erincandescent.net @res260@infosec.exchange @cwebber@social.coop Yeah, this. If I'm looking at a new tool/library to possibly use (and not do a hard fork on), a key question is: Are there people who understand and care maintaining this thing? Because if there aren't, eventually "hard fork" or "don't use it" will probably be my only choices.

And using LLMs to generate code points towards "no" (or at least "not much") for both understanding and caring. If someone skilled is actually putting in the effort to edit LLM output until it is no worse than what they would've written themselves (point for care at least), chances are it would've been faster (let alone other effects) to just do that.

erincandescent@akko.erincandescent.net

@benjamineskola @res260 @cwebber @airtower

Except there is a huge problem with people actually just not looking at the code being generated. The wave of slop PRs inundating many open-source projects recently, for example.
People keep saying ‘of course there is a human in the loop’ but it seems increasingly clear to me that nobody is actually bothering to be the human in the loop themselves.

I know these are problems, but you’re moving the topic of conversation. There have always been bad developers with bad practices shovling crappy code over the fence. LLMs have made this easier and it sucks but it’s not new.

And yes, this is a human problem, it’s all a human problem. But that’s like saying ‘guns don’t kill people, people do’. True, but, the tool clearly exacerbates the problem.

Sure, but lazy/careless people use tool to produce bad results is not a unique problem. It’s very easy with a power drill to make messy holes, but we arent’ forcing everyone to use hand drills.

Saying using these tools results in necessarily bad output is just not backed up by available evidence.

I don’t pretend they’re perfect and I don’t pretend there aren’t problems. What I sense is that they’re not going away and are going to become and remain routine parts of toolboxes long into the future.

benjamineskola@hachyderm.io

@erincandescent @res260 @cwebber @airtower > LLMs have made this easier and it sucks but it’s not new.

So why would we want to make it worse?

> Saying using these tools results in necessarily bad output is just not backed up by available evidence.

Every output I've seen from these things has been, at best, no better than a human would have done. And that's being generous.

> What I sense is that they’re not going away and are going to become and remain routine parts of toolboxes long into the future.

This is a self-fulfilling prophecy. Of course they won't go away if people insist on defending them.

erincandescent@akko.erincandescent.net

@airtower @res260 @benjamineskola @cwebber I still see LLM related artifacts as a negative quality signal. There's lots of crap LLM aided code out there and there's lots of people slopping stuff together. The worst developers are disproportionality interested.

But I think there's a lot of stuff being written with LLM assistance these days where you'd not be able to tell

airtower@woem.men

@erincandescent@akko.erincandescent.net @res260@infosec.exchange @benjamineskola@hachyderm.io @cwebber@social.coop That might be, but as I wrote in that case I doubt there's any benefit (like faster progress) to the developer (even looking at code only, ignoring all the harmful side effects of LLMs).

benjamineskola@hachyderm.io

@erincandescent @res260 @cwebber @airtower Given that every output of LLMs that I've seen that is identifiable as such has been mediocre at best, why would I assume without any evidence that there's a significant quantity of LLM-generated code that's actually good?

"There's no evidence of it but it's definitely there" is unpersuasive.

And I've also found that people's evaluations of LLM-generated code quality is wildly out of step with my own evaluations, so I would not automatically assume that because someone says it's good that it's actually good.

And then, even if the code was of acceptable quality, the negative effects on the process (increased difficult of reviewing decreased institutional knowledge, among other things) count against it too.

(And all of this is setting aside the ethical issues, which in practice I don't think we should do anyway. Like, even if LLMs produced good output they'd be ethically indefensible, and even if they were ethically acceptable the results are so poor that why would you bother with them?)

gnomon@mastodon.social

@joeyh @cwebber well that is mighty disappointing.

benjamineskola@hachyderm.io

@gnomon @joeyh @cwebber They were advertising some LLM stuff in the most recent release notes but it didn't sound like they were actually using it on Vim itself.

On the other hand, anyone who cares enough to advertise LLMs is probably using them too.

cwebber@social.coop

Also https://bsky.app/profile/pfrazee.com/post/3meogr22l3k2d

> A year ago, I thought LLMs were kind of neat but not that useful. I saw the code autocomplete and thought, meh.
>
>Last summer just flipped. I never ever thought I would see automated code generation like we see now.
>
> I know there’s baggage but you need to know the coders are being real about this

cwebber@social.coop

Also https://bsky.app/profile/samuel.fm/post/3mbz27d6qnc2v

eramdam@social.erambert.me

@cwebber yeah, least surprising Bluesky thing to do

fay@lingo.lol

@erincandescent
@res260 @cwebber we have tools for that tho? templates and libraries and bootstrapping and automation tools. they don't have to be, as @olivia so said a couple month ago "made from shit and blood"

erincandescent@akko.erincandescent.net

@fay @res260 @cwebber there's lots of code for which no reusable template could be created that is nonetheless rote

erincandescent@akko.erincandescent.net

@fay @cwebber @res260 is this a failure of ours? Perhaps. But it's the sum product of millions of small failures, not something easily corrected

theesm@social.tchncs.de

@cwebber oof, always wonder what review processes are in place at places where a large portion of changes have been vibecoded. I tend to discuss why a certain implementation has been picked, what pitfalls have been considered, if a solution is adequate to a problem etc. when reviewing major changes with engineers whose code I review, "claude did that" wouldn't really stand as an answer in my book. Makes me wonder if that kind of discussion just isn't common anymore? this whole agent stuff just seems like a big footgun to me that will inevitably lead to hard to comprehend and more difficult to navigate/to maintain codebases

fay@lingo.lol

@erincandescent @res260 @cwebber respectfully, that's not my experience If the task is specific enough to require human intervention, then it shouldn't be left to a stochastic code generator either

fay@lingo.lol

@erincandescent @res260 @cwebber with the hundreds of billions of dollars burned to make stochastic code generators that only sort of work (and at horrific ethical costs), i dare say that we could have developed adequate tooling instead, and we still can

erincandescent@akko.erincandescent.net

@fay @res260 @cwebber it's a coordination problem, not an effort problem. If everyone agreed to use the one serialisation format with the one schema language, with the one runtime environment, ...

fay@lingo.lol

@erincandescent @res260 @cwebber i mean a lot of people are agreeing to use the one llm right now, it seems

aslakr@mastodon.social

@cwebber Is there something funky going on? bsky seems to be down, at least in Europe.

And the post seems blocked by a labeler in blacksky https://blacksky.community/profile/did:plc:ragtjsm2j2vknwkz3zp4oxrd/post/3meogr22l3k2d

CIRCLE WITH A DOT

hm https://github.com/bluesky-social/social-app/blob/main/CLAUDE.md