I wish I could recommend this piece more, because it makes a bunch of great points, but the "normal technology" case feels misleading to me.
-
@glyph I like your breakdown in those articles.
I think that some of the more valuable stuff has been not when juniors prompt and don’t get value, but when seniors prompt, go do something else for a bit while the machine churns for a couple of minutes, and then come back to something that is pretty close to a good solution.
Think about a task that might take you 15 minutes of fairly menial work (adding a CLI bool flag that then needs to get passed down three layers in some spot, for example)
@glyph lowering of activation energy is how I see that. And while I agree that the futzing is way undercounted (and that, IMO, a lot of this falls over in longer sessions and is just not worth it)… a strong dev who knows exactly what the solution is supposed to look like can get paper cut-y stuff cleaned up. A lot.
The “whine on slack about a thing being busted” turns into a fix, and most of that you can just go get a cup of water or review something in the meantime. Cool party trick at least
-
@glyph totally to your point tho… the party trick might just be that. It feels fun to have progress happen when laundry is being folded but in the end I might end up churning anyways
-
@sabik uh I think that’s the METR one? IIRC not the best methodology but it’s still a kinda interesting result and well worth pursuing further https://arxiv.org/abs/2507.09089
@glyph Thanks, that's the one!
-
I don't want to be a catastrophist but every day I am politely asking "this seems like it might be incredibly toxic brain poison. I don't think I want to use something that could be a brain poison. could you show me some data that indicates it's safe?" And this request is ignored. No study has come out showing it *IS* a brain poison, but there are definitely a few that show it might be, and nothing in the way of a *successful* safety test.
@glyph while I am not aware of any study showing the poisonous character of LLMs, two items are already proven:
1. LLMs have a more detrimental effect on software development than they have benefits. Google's DORA report has now shown, multiple years in a row, that LLM use in software development decreases performance and outcomes in most teams.
2. Abuse for malicious intent is rampant, yielding scary propaganda, misinformation, distraction campaigns and intensifies the threat from social engineering attacks -
@svines you obviously know your role and your relationship to your org better than I do :). but this COULD be pitched in a very non-career-suicidal way, i.e.: “hey boss I love the great-great-grandboss’s AI mandate but wouldn’t it be so cool if we had some actual DATA to show how productive it is making our team? I found this formula online…”
@glyph yeah true. I am in charge of setting OKRs for my team so productivity etc is part of that. Another guerrilla tactic I thought about was asking our legal team what their thoughts on AI-generated code are now that the US Supreme Court has refused to hear an appeal to "AI code can't be copyrighted" - that potentially means our company no longer has protection given how much vibe-coded stuff is around now
-
@nils_berger have you got a link for that report?
-
@raphael Believe me, I understand the appeal of the hit of dopamine to get moving when one is stuck. I really want a tool that can do that for me, but I would like to know what other effects it has, and whether it's going to be a net detriment.
-
@svines oh yeah you definitely won't be able to copyright anything vibe-coded, the outputs are flatly not copyrightable right now in the US. not clear that will actually make a difference given the work-as-a-whole probably is still pretty defensible for a while, but as a way to start putting more bricks in the wall, it's definitely worth raising concerns
-
2. If it is "nuts" to dismiss this experience, then it would be "nuts" to dismiss mine: I have seen many, many high profile people in tech, who I have respect for, take *absolutely unhinged* risks with LLM technology that they have never, in decades-long careers, taken with any other tool or technology. It reads like a kind of cognitive decline. It's scary. And many of these people are *leaders* who use their influence to steamroll objections to these tools because they're "obviously" so good
@glyph THIS. This is what confuses me the most, I know software devs that all their life have been very risk averse, embracing LLM coding tools. It's something I cannot understand.
-
@glyph so, where does AI stand on the inventory of cult-like behavior?
Because what you are describing sounds a lot like a cult.
And if you automate the love bombing and the extraction of secrets and instilling or distilling of mission...
Ah, fuck.
-
@glyph Something that has gotten under my skin for the past year or so is seeing code changes like large refactors, porting a legacy tool to Rust, even minor bugfixes - things that would be a struggle to push through the inertia of code review - get fast-tracked when "the AI did it." Like the exact PRs I've written and tried to advocate for before and eventually gave up on. The changes and their risks are the same, so I can only conclude that the bar is lower for accepting "AI" contributions.
-
@MrBerard @glyph (poverty of speech, flat affect, disorganized speech/thought, delusions, reduced attention, brain fog, disorientation, confusion, etc. all being pretty common psychosis features - and all coming in various degrees, many of which LLM folks seem to exhibit to various degrees pretty commonly.)
Agreed. But it's the subtle influence on users' views I'm referring to. Which was a social media problem before it was an AI issue.
Sure, we can categorise this as "delusions", but I don't know that bundling everything as 'psychosis' helps the debate, in that it flattens the nuances between subtle and overt cases.
Ultimately, we're trying to apply a medical model designed before mass media, DSM updates notwithstanding. Not surprising it reaches the limits of its utility.
-
@mcc He thinks the technology is capable of many horrors but it can also be useful for pedestrian things.
What I've observed very recently is that even intelligent people, experienced developers - who know perfectly well that LLMs are just generators of text from statistical models of what someone is likely to write - will still pull up AI-written search results and proceed on the automatic assumption that whatever they say is correct.
That is not a general observation. That was this morning, with some senior programmers trying to solve a problem that's prolonging a code freeze.
-
For me, this is the body horror money quote from that Scientific American article:
"participants who saw the AI autocomplete prompts reported attitudes that were more in line with the AI’s position—including people who didn’t use the AI’s suggested text at all"
So maybe you can't use it "responsibly", or "safely". You can't even ignore it and choose not to use it once you've seen it.
If you can see it, the basilisk has already won.
@glyph i can absolutely use it responsibly because i'm not new to NLP, but unfortunately it is liquified shite.
-
@glyph oh btw, have coded stuff with Twisted a long time ago, was in fact my introduction to async callback oriented programming. so using this opportunity to say thank you for teaching me the reactor pattern!
-
@glyph when Teams autocorrect rewrites something it decides i misspelled, i am filled with hatred and disgust and usually delete the entire sentence and try again, regardless of whether it had suggested the word i meant to write. i don't want it anymore
-
@glyph this is how i avoid getting early onset dementia from being exposed to involuntary slop
-
@raphael @glyph The thing that the LLM is getting you to not think about is that it shouldn't take passing things down three layers (much less more, which is more common). This is the boilerplate that everyone hates and the goal should be to remove the need for it at all, not produce more faster.
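One way to sketch the alternative being gestured at here (all names hypothetical): bundle options into a single object so a new flag changes one definition instead of every signature in between.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Options:
    verbose: bool = False
    dry_run: bool = False  # adding this later required no new parameters anywhere

def run_export(data: list[str], opts: Options) -> int:
    # bottom layer reads whichever options it cares about
    if opts.verbose:
        print(f"exporting {len(data)} rows")
    return 0 if opts.dry_run else len(data)

def build_report(data: list[str], opts: Options) -> int:
    # middle layers no longer change when flags are added
    return run_export(data, opts)
```

This is a sketch, not the only fix - the broader point stands that the boilerplate itself is the problem, and generating it faster just entrenches it.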
"The least worst way to use an LLM is to do something you already know how to do", now with the addendum that we don't know what we don't know.