Had a lot of fun with my stats students today.
-
@ai6yr @ohmu @futurebird wait so... is that the ultimate question? "What number will an LLM always include when generating random numbers?"
@meuwese @ohmu @futurebird Apparently humans have willed that into existence, yes. LOL. (err... Douglas Adams, precisely)
-
Only one of these lists could *plausibly* be from rolling dice.
@futurebird @ramsey @Bumblefish this is not remotely my area of expertise but I am interested in the answer. My guess would be that the list that looks more evenly distributed is the fake one, and therefore List A is the "actually random" one because it has more seemingly outlying subsets, like a whole bunch of 1s in rapid succession.
There are tons of ways to unevenly distribute but relatively few ways to evenly distribute, so the one that seems less even is more likely to be true
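A rough way to check the "few ways to be even" intuition is to count how many roll sequences give every face exactly the same number of hits. This is only a sketch: the list lengths below are hypothetical, since the thread does not say how long the two lists were, and `prob_exactly_even` is a name made up for illustration.

```python
from math import factorial


def prob_exactly_even(n_rolls: int, sides: int = 6) -> float:
    """Probability that every face appears exactly n_rolls / sides times."""
    if n_rolls % sides != 0:
        return 0.0  # a perfectly even split is impossible
    per_face = n_rolls // sides
    # Multinomial coefficient: number of roll sequences with equal counts.
    even_sequences = factorial(n_rolls) // factorial(per_face) ** sides
    # Divide by the total number of possible roll sequences.
    return even_sequences / sides ** n_rolls


# Hypothetical list lengths, chosen only for illustration.
for n in (6, 30, 60, 120):
    print(n, prob_exactly_even(n))
```

A perfectly even split is the single most likely set of counts, yet for 60 rolls it still turns up in well under 1 in 1,000 lists, so a real list of dice rolls almost always looks at least a little lopsided.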
-
@futurebird @ramsey @Bumblefish also I suspect maybe a Monty Hall kind of thing where you generated a bunch of random lists, and then selected the one that looked least random to you to trick your students.
I'd love to know what the actual answer is and what you were hoping to teach your students!
-
The LLM is like a little box of computer horrors that we peer into from time to time.
I'm sorry but the whole interface is just so silly.
You ask for random numbers with sentences and it pretends to give them to you? What are we doooooing?
"What are we doooooing?"
Well, we've taken the sound algorithm of a babbling baby, supercharged it with a huge library of words annotated by how likely each is to follow another, and now management is jumping around like parents bragging about what a genius their 11-month-old is. All because WE try to find meaning in the perceived word sequence.
Same management that brags about 1400% lower prices :))
-
@futurebird It's very weird.
In principle, if you take an LLM, you should be able to get it to generate random numbers in a way that reflects the numbers that appear in the corpus it was trained on. If you have the raw model you can probably do that.
But if you ask ChatGPT (or at least if I do) it starts talking about how numbers taken from around us typically follow Benford's law so their first digits have a logarithmic distribution. When it then spits out some random numbers it's no longer sampling random numbers from the entire corpus but a sample that's probably heavily biased towards numbers that appear in articles about Benford's law. I.e. what people have previously said about these numbers, rather than the actual numbers.
Which in turn is what LLMs do. They give an averaged output, not a reasoned one.
In addition, the inherent laws of measurement and control dictate that any output actually reached will never match the intended one. Thus LLM output will never increase knowledge, but migrate toward zero.
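For reference, the "logarithmic distribution" of first digits in Benford's law mentioned above is P(d) = log10(1 + 1/d) for d = 1..9. A minimal sketch that tabulates it (the function name is made up for illustration):

```python
from math import log10


def benford_first_digit_probs() -> dict[int, float]:
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: log10(1 + 1 / d) for d in range(1, 10)}


for digit, p in benford_first_digit_probs().items():
    print(f"{digit}: {p:.3f}")
```

A leading 1 shows up about 30% of the time and a leading 9 under 5%, so numbers drawn to match articles about Benford's law would look nothing like uniform random digits.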
-
There is something very creepy about the way LLMs will cheerfully give lists of "random" numbers. But they aren't random in frequency, and as my students pointed out, "it's probably from some webpage about how to generate random numbers."
But even then, why is the frequency so unnaturally regular? Is that an artifact from mixing lists of real random numbers together?
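One way to put a number on "unnaturally regular" is a chi-square goodness-of-fit statistic against a fair die, checking the low tail as well as the high one. This is a sketch, not the method used in the lesson: the counts are hypothetical, and the two cutoffs are the standard 5% and 95% chi-square quantiles for 5 degrees of freedom, so they only apply to six-sided dice.

```python
def chi_square_uniform(counts: list[int]) -> float:
    """Chi-square statistic for observed counts against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Chi-square quantiles for 5 degrees of freedom (six faces).
LOWER_5_PERCENT = 1.145   # below this: suspiciously even
UPPER_5_PERCENT = 11.070  # above this: suspiciously uneven


def verdict(counts: list[int]) -> str:
    stat = chi_square_uniform(counts)
    if stat < LOWER_5_PERCENT:
        return "too even"
    if stat > UPPER_5_PERCENT:
        return "too uneven"
    return "plausible"


# Hypothetical face counts for 60 rolls, for illustration only.
for counts in ([10, 10, 10, 10, 10, 10], [7, 12, 9, 14, 8, 10], [30, 6, 6, 6, 6, 6]):
    print(counts, round(chi_square_uniform(counts), 2), verdict(counts))
```

Most people only test the upper tail, but a list whose frequencies match the expected counts too closely is just as unlikely to come from real dice.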
@futurebird
and how about those "random" passwords generated by AI
https://zeroes.ca/@kimcrawley/116099905667994600
* over and over, again. #PasswordReuse #VibeSlop
-
This is what inspired the whole lesson. I had to show them this.
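For contrast with asking a chatbot for a "random" password, here is a minimal sketch that draws one from the operating system's cryptographic random source through Python's standard `secrets` module. The alphabet and the default length are assumptions picked for illustration, not a recommendation from the thread.

```python
import secrets
import string

# Assumed alphabet: letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 20) -> str:
    """Pick each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```

Every character is an independent draw, so two calls should essentially never return the same string, which is exactly what the LLM-generated passwords fail at.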
-
I put the answer in the original thread with a CW. This was about frequency.
-
I've got some bad news. I've posted the solution with a CW on the original thread.
@futurebird @Bumblefish Yep, I read it… My bad. I used instinct, guts, not mathematics like the other answers. I should have

-
@futurebird I know how to find the SD and I will use the php-stats library every day of the week and twice on Sunday. I would much rather be able to depend on well supported community code. (At least until it is all replaced by ai slop)
I don't mind using libraries, but it's fun to write my own versions of things just so I know how they work.
When we make projects where we share code I encourage them to use libraries more often. I'm just a grumpy old lady about it sometimes.
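In the spirit of writing your own version just to see how it works, here is a hand-rolled standard deviation checked against a library. It is a sketch in Python with the standard `statistics` module standing in for php-stats; `std_dev` and the sample data are made up for illustration.

```python
import statistics
from math import sqrt


def std_dev(values, ddof=1):
    """Standard deviation; ddof=1 gives the sample SD, ddof=0 the population SD."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("not enough data points")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return sqrt(squared_deviations / (n - ddof))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
print(std_dev(data, ddof=0), statistics.pstdev(data))  # population SD
print(std_dev(data), statistics.stdev(data))           # sample SD
```

The homemade version shows where the n - 1 comes from; the library version is the one to reach for in shared project code.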
-