The notion of a broken clock being sometimes right is based on a gross misunderstanding of what information is.
-
@riley Let's say I constructed an elevator with 12 floors. The elevator stops at the next floor every hour on the hour, starting from the ground floor at noon and returning to the ground floor at midnight, at which point the process repeats. There is a window on the door which shows each floor's broken clock. The ground-floor clock is stuck at 12, the next at 1, and so on.
Consider the nature of a fool who gets locked in the elevator and does not know the time. Does the broken clock inform him?
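One way to make the elevator point concrete is to compute the mutual information between the true hour and the reading visible through the window. A minimal sketch in Python follows; the setup and the uniform-hour assumption are illustrative, not part of the original post:

```python
from math import log2
from collections import Counter

HOURS = 12  # the elevator visits one floor per hour, cycling every 12 hours

# Floor h's clock is stuck at h, and the elevator is on floor h at hour h,
# so the window always happens to show the true hour.
elevator_display = [(hour, hour) for hour in range(HOURS)]

# A single stopped clock shows the same reading at every hour.
stopped_display = [(hour, 0) for hour in range(HOURS)]

def mutual_information(pairs):
    """I(T;C) in bits, treating each (true hour, shown reading) pair as equally likely."""
    n = len(pairs)
    truths = Counter(t for t, _ in pairs)
    shown = Counter(c for _, c in pairs)
    joint = Counter(pairs)
    return sum((k / n) * log2((k / n) / ((truths[t] / n) * (shown[c] / n)))
               for (t, c), k in joint.items())

print(mutual_information(elevator_display))  # log2(12), about 3.58 bits
print(mutual_information(stopped_display))   # 0.0 bits
```

Read through the elevator window, the twelve broken clocks carry the full log2(12) bits about the hour; any one of them on its own carries zero. The information lives in the correlation the mechanism creates, which is exactly what the fool locked inside does not know about.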
@Smohc_Stahc If we made a hammer out of dynamite, would it be a hammer or dynamite?
-
The notion of a broken clock being sometimes right is based on a gross misunderstanding of what information is.
A clock that always shows the same time is never right, even in the moments of the day when the time happens to be what it shows, because you don't gain any information about what time it is by looking at the clock.
This reasoning also applies to chatbots. If you can't tell whether what you have been given is useful information unless you already know the information, then you haven't been given useful information.
@riley @matt But information always has a probability value attached to it. For the broken clock, it is pretty much 0% likely that the time will be correct (1 in 12 × 60 = 1 in 720). But for the LLM, the probability could be 70% to 90%, depending on what kind of information you are asking it for and how good the specific LLM is. Information becomes more useful as the probability of it being correct approaches 100%. A good, reliable source would have a much higher probability of being correct and therefore be more useful, but the LLM is closer to that than to a broken clock, at least for most things.
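To put rough numbers on this, one can model the true time as uniform over 720 minutes and a source that answers correctly with probability p and is uniformly wrong otherwise, then compute how many bits the answer actually carries. The model and function below are an illustration, not something from the thread:

```python
from math import log2

def info_bits(p, n=720):
    """Bits of mutual information between the truth and the answer of a
    source that is right with probability p and uniformly wrong otherwise,
    over n equally likely possibilities (requires 0 < p < 1)."""
    h_answer_given_truth = -p * log2(p) - (1 - p) * log2((1 - p) / (n - 1))
    return log2(n) - h_answer_given_truth

print(info_bits(1 / 720))   # stopped clock, exactly chance level: ~0 bits
print(info_bits(0.8))       # source right 80% of the time: ~6.9 bits
print(info_bits(0.999999))  # near-perfect clock: ~log2(720) ~ 9.5 bits
```

On this model the stopped clock sits exactly at chance level and so carries zero bits, while a 70-90%-right source carries most, though not all, of the bits a fully reliable one would; usefulness does grow with the probability of being correct, as the comment says.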
-
@MissConstrue Are you a chatbot sycophanting me up?
These days, one can never be too cautious.
@riley That's a very good question and you are so clever to think of it, I'd be happy to drill down on this topic for you.
Heh, sorry. Not a chatbot. Old philosopher, so...like a chatbot, only caffeine-powered, argumentative, and capable of consciousness. (Or at least, I would argue I'm conscious.) I honestly did believe it was a very illustrative analogy. Most people will parrot the clock paradigm (i.e. right twice a day), when you are correct that the underlying logic of the premise is faulty, and therefore any attempt to treat it as true will fail.
-
@riley @cptbutton I never really knew my root...
-
@riley That's a very good question and you are so clever to think of it, I'd be happy to drill down on this topic for you.
Heh, sorry. Not a chatbot. Old philosopher, so...like a chatbot, only caffeine-powered, argumentative, and capable of consciousness. (Or at least, I would argue I'm conscious.) I honestly did believe it was a very illustrative analogy. Most people will parrot the clock paradigm (i.e. right twice a day), when you are correct that the underlying logic of the premise is faulty, and therefore any attempt to treat it as true will fail.
@MissConstrue There's an interesting pattern to a large number of these faults, but I guess it'll be a topic for another day.
-
The notion of a broken clock being sometimes right is based on a gross misunderstanding of what information is.
A clock that always shows the same time is never right, even in the moments of the day when the time happens to be what it shows, because you don't gain any information about what time it is by looking at the clock.
This reasoning also applies to chatbots. If you can't tell whether what you have been given is useful information unless you already know the information, then you haven't been given useful information.
@riley Riley, are you aware that linguistics in the 60s established that language use conveys meaning by reference to other language, with no guaranteed relation to some external reality? So all words bear the same relationship to reality that a stopped clock has to actual time.
I mention this because LLMs are not designed to provide information about the world; they're designed to generate discourse — language use (the output) that is validly constructed by reference to other language use (the training dataset). It's not fair to judge an LLM on the basis that it's a lousy search engine.
But if you spin up a RAG system like NotebookLM and give it a reality to refer to (a set of documents) and then ask it a question, e.g. "Is XYZ in the document set?", it turns out LLMs can do a pretty good job of accurately answering yes or no.
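For a feel of how the grounding step works, here is a deliberately tiny sketch of the retrieval half of such a pipeline. It uses plain keyword overlap where a real system like NotebookLM would use embeddings and an LLM to read the retrieved text; the function names, threshold, and example documents are all made up for illustration:

```python
import string

def tokenize(text):
    # crude normalisation: lowercase and strip punctuation
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())

def retrieve(question, documents, top_k=3):
    # rank documents by how many words they share with the question
    q = tokenize(question)
    return sorted(documents, key=lambda d: -len(tokenize(d) & q))[:top_k]

def answer_yes_no(question, documents, threshold=2):
    # answer from what was actually retrieved, not from prior "knowledge"
    best = retrieve(question, documents, top_k=1)[0]
    overlap = len(tokenize(best) & tokenize(question))
    return "yes" if overlap >= threshold else "no"

docs = [
    "The elevator stops at the next floor every hour on the hour.",
    "Each floor has a clock that is permanently broken.",
]
print(answer_yes_no("is there a broken clock on each floor", docs))  # yes
```

The point the sketch tries to make is architectural: the answer is tied to a supplied external reality (the document set), so it can be checked against that reality rather than taken on faith.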
-
@riley @matt But information always has a probability value attached to it. For the broken clock, it is pretty much 0% likely that the time will be correct (1 in 12 × 60 = 1 in 720). But for the LLM, the probability could be 70% to 90%, depending on what kind of information you are asking it for and how good the specific LLM is. Information becomes more useful as the probability of it being correct approaches 100%. A good, reliable source would have a much higher probability of being correct and therefore be more useful, but the LLM is closer to that than to a broken clock, at least for most things.
@emassey0135 So it is with other commercial products. That's why there are rules specifying that berries for human consumption can't contain more than something like four aphids per hundred grammes.
But who would buy jam with 30% aphid content? Even 10% aphid content, really?
-
@MissConstrue There's an interesting pattern to a large number of these faults, but I guess it'll be a topic for another day.
I was thinking of some equipment I saw at a "Telekom-Museum" in Germany - it contained a clock but wasn't always powered on (or was just a display piece).
The Germans had quite sensibly put a diagonal strip of red tape (in the style of the "Universal No" symbol) across the clock face, so you knew it was *not* a timepiece to be trusted.
-
I was thinking of some equipment I saw at a "Telekom-Museum" in Germany - it contained a clock but wasn't always powered on (or was just a display piece).
The Germans had quite sensibly put a diagonal strip of red tape (in the style of the "Universal No" symbol) across the clock face, so you knew it was *not* a timepiece to be trusted.
@vfrmedia In aviation, the process is standardised by way of the INOP stickers.
-
@riley I am sorry, this is not a correct analogy.
A bot not giving you correct information 100% of the time doesn't make it useless.
A search engine doesn't give you the correct answer all the time either.
Chatbots are incredibly helpful. Don't take the answer as 100% correct; review and research its accuracy after you get it, but they save you an immense amount of time over searching yourself.
Think of them as hiring a junior employee or assistant. They are helpful, but you must review their work.
-
@proedie @riley after obsessing a little over getting to the bottom of this, the answer seems to be that the historical origin (from 1711) is akin to "If you stop chasing trends you will sometimes be fashionable", which is more in line with riley's definition in the OP. The other "official" definitions I've found seem to follow this as well.
The definition that "coincidental correctness is worthless" seems to be a personal (though common) interpretation.
-
@Smohc_Stahc If we made a hammer out of dynamite, would it be a hammer or dynamite?
@riley This process turns dynamite into dynamite. The part is the whole.
However, the elevator is not the whole of the machine. One can determine that the elevator tells time, but which time is a mystery without the broken clocks. The elevator does not fix the clocks either; they are still broken.
-
@riley @MissConstrue I am not a bot. Please don't look at my name.
@bdf2121cc3334b35b6ecda66e471 @riley
01001001 00100000 01110011 01100101 01100101 00100000 01111001 01101111 01110101
-
@riley Riley, are you aware that linguistics in the 60s established that language use conveys meaning by reference to other language, with no guaranteed relation to some external reality? So all words bear the same relationship to reality that a stopped clock has to actual time.
I mention this because LLMs are not designed to provide information about the world; they're designed to generate discourse — language use (the output) that is validly constructed by reference to other language use (the training dataset). It's not fair to judge an LLM on the basis that it's a lousy search engine.
But if you spin up a RAG system like NotebookLM and give it a reality to refer to (a set of documents) and then ask it a question, e.g. "Is XYZ in the document set?", it turns out LLMs can do a pretty good job of accurately answering yes or no.
@onekind @riley The answer would still be fuzzy -- there would be a degree of certainty associated with both yes and no. Other methods like pattern search could be tuned to be completely certain on the yes or the no -- some even on both -- but I think it is impossible to tune stochastic methods in the same way. To conclude, external data is needed to assess the correctness of an LLM's answer.
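The asymmetry claimed here for pattern search is easy to demonstrate. Exact substring matching can be made completely certain on the "yes" side, since a match proves presence, but not on the "no" side, since a miss may only mean the fact is phrased differently. The strings below are made up for illustration:

```python
doc = "The ground floor clock is permanently stuck at twelve."

# A hit is a guaranteed "yes": the pattern is provably in the document.
print("stuck at twelve" in doc)  # True, with certainty

# A miss is not a guaranteed "no": the same fact is present in the
# document, just phrased differently, so this query never finds it.
print("broken at 12" in doc)     # False, despite the fact being there
```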
-
@hypolite @riley
A computer is not a human, but tools can replace humans at certain jobs, if not do them better.
If you don't like dishwashers, washing machines, sewing machines, tractors, and diggers, then by all means hire someone instead, but most of us find it more effective to use machines.
I would rather focus my time on building more complex things than waste it on less complex jobs that a machine (or AI) can easily do in less time.
-
@onekind @riley The answer would still be fuzzy -- there would be a degree of certainty associated with both yes and no. Other methods like pattern search could be tuned to be completely certain on the yes or the no -- some even on both -- but I think it is impossible to tune stochastic methods in the same way. To conclude, external data is needed to assess the correctness of an LLM's answer.
@pedromj @riley First, you're assuming that a RAG functions the same way as an LLM. It uses a mix of stochastic and deterministic analysis.
Second, a yes or no answer from a human is also 'fuzzy' in the sense that describing a query in language is never entirely precise, for exactly the reasons I discussed in my previous toot, so the answer given is always 'this is my best guess based on my contingent understanding of your imperfectly phrased question.'
Re your conclusion, I already described the document set as an artificially constructed external reality, which satisfies your objection.
-
The notion of a broken clock being sometimes right is based on a gross misunderstanding of what information is.
A clock that always shows the same time is never right, even in the moments of the day when the time happens to be what it shows, because you don't gain any information about what time it is by looking at the clock.
This reasoning also applies to chatbots. If you can't tell whether what you have been given is useful information unless you already know the information, then you haven't been given useful information.
@riley
Yes, finally someone else gets it!
-
@hypolite @riley
A computer is not a human, but tools can replace humans at certain jobs, if not do them better.
If you don't like dishwashers, washing machines, sewing machines, tractors, and diggers, then by all means hire someone instead, but most of us find it more effective to use machines.
I would rather focus my time on building more complex things than waste it on less complex jobs that a machine (or AI) can easily do in less time.
@samir Nobody ever told me to treat my dishwasher as an employee, though, so why do you feel compelled to do this with LLM-based AI systems?
And if the benefits of these systems were that clear and on par with previously established machines, we wouldn't have this kind of conversation. The problem still isn't that people are using them wrong.
-
The notion of a broken clock being sometimes right is based on a gross misunderstanding of what information is.
A clock that always shows the same time is never right, even in the moments of the day when the time happens to be what it shows, because you don't gain any information about what time it is by looking at the clock.
This reasoning also applies to chatbots. If you can't tell whether what you have been given is useful information unless you already know the information, then you haven't been given useful information.
@riley
David Revoy recently mentioned how Pepper's (orange) cat Carrot was wrongly described as black by Grokipedia. This made me speculate that it would be just as wrong if Carrot happened to be a black cat. Your post confirms that, thx.
https://framapiaf.org/@davidrevoy/115882389651946345
-
The notion of a broken clock being sometimes right is based on a gross misunderstanding of what information is.
A clock that always shows the same time is never right, even in the moments of the day when the time happens to be what it shows, because you don't gain any information about what time it is by looking at the clock.
This reasoning also applies to chatbots. If you can't tell whether what you have been given is useful information unless you already know the information, then you haven't been given useful information.
@riley But what if I don't use the chatbot for information but as a character in a game?