Had a lot of fun with my stats students today.

apophis@yourwalls.today

@futurebird i'm guessing the second one is made up because there aren't enough triples?

@Bumblefish

apophis@yourwalls.today

@futurebird @Bumblefish no, scratch that, list A has a *lot* of triples, like a disturbing number, and there are so many ascending patterns...

bicebird@toot.wales

@futurebird little box, little box of horrors

dpiponi@mathstodon.xyz

@jedbrown @futurebird You described exactly what I would do. Obviously it would depend on an external PRNG and yes, no prompt. One natural way to use an LLM is to transform draws from a PRNG into draws from a distribution intended to represent some corpus. Picking numbers out of these draws would be expected to have a similar distribution to picking numbers from the original corpus. IIRC I may already have tested to see if the results conform to Benford's law - I did a lot of stuff like that when llama.cpp first became available. You have to select the right parameters to have llama.cpp use the distribution "correctly".

futurebird@sauropods.win

@ricko

This is the epistemological issue I have with the interface. It's ... well, not to be harsh but it's deceptive.

If you ask a "computer" for random numbers, that has a kind of meaning and an expected process. If you ask a computer "how did you generate those random numbers?" that also carries a set of expectations... and an LLM isn't meeting ANY of them.

alienghic@timeloop.cafe

@futurebird

The mean and standard deviations for both lists are about the same.

3.46 mean 1.7 stddev for listA
3.42 mean 1.69 stddev for listB

However, in listA every value appears either 16 or 17 times, so it looks like a uniform distribution, while in listB 3 shows up 24 times, and 4 and 5 are less frequent at 12 and 14 times respectively.

My conclusion is listA was generated from a uniform random distribution and listB was not.

I can't tell if listB was made by some other more advanced random distribution, but honestly it looks like someone took a uniform distribution and turned some of the 4s and 5s into 3s.
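A minimal sketch of how those summary numbers can be re-derived. The raw lists are not reproduced in this thread, so this rebuilds them from the per-value counts reported here; that loses the original order, which does not affect the mean or standard deviation. The names `counts_a`, `counts_b` and `expand` are made up for illustration, and the sample standard deviation is an assumption about which formula was used.

```python
from statistics import mean, stdev

# Value -> count, as reported for each list in this thread.
counts_a = {1: 17, 2: 17, 3: 17, 4: 17, 5: 16, 6: 16}
counts_b = {1: 16, 2: 17, 3: 24, 4: 12, 5: 14, 6: 17}


def expand(counts):
    """Rebuild a sorted list of rolls from value -> count (original order is lost)."""
    return [value for value, n in sorted(counts.items()) for _ in range(n)]


list_a = expand(counts_a)
list_b = expand(counts_b)

for name, rolls in (("listA", list_a), ("listB", list_b)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{name}: n={len(rolls)} mean={mean(rolls):.2f} stddev={stdev(rolls):.2f}")
```

The two lists come out nearly identical on these summaries, which is why the per-value counts are the more informative thing to look at.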

alienghic@timeloop.cafe

@dlakelan @futurebird

The dictionaries inside the Counter() objects map each integer to the number of times it appears.

In [18]: Counter(listA)
Out[18]: Counter(
{2: 17, 3: 17, 5: 16, 1: 17, 4: 17, 6: 16}
)

In [19]: Counter(listB)
Out[19]: Counter(
{4: 12, 2: 17, 5: 14, 6: 17, 3: 24, 1: 16}
)

danpmoore@mathstodon.xyz

@dlakelan @futurebird @Bumblefish Based on this description, A looks too uniform. B could be random.

zalasur@mastodon.surazal.net

@futurebird @Bumblefish Yes, you can estimate how likely it is. But given any list of items, it is impossible to prove or disprove that the list is random.

sabrina@fedi01.unicornsparkle.club

@madjohnroberts @futurebird @Bumblefish

If List A has nearly equal occurrences of each number then that’s the one most likely to have been produced by the equivalent of rolling a die 100 times.

dlakelan@mastodon.sdf.org

@alienghic
I'm on my phone at a volleyball game but what's the likelihood for each (probability of seeing that vector of counts given a multinomial distribution with 1/6 as probability for each value)?

should be pretty easy in R or Julia or Python though offhand I would need to look at docs for any of them. Julia would be something like
using Distributions
# 100 rolls, probability 1/6 for each of the six faces
pdf(Multinomial(100, fill(1/6, 6)), [17,17,17,17,16,16])
@futurebird
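The same multinomial probability can be sketched in Python with only the standard library. This is a rough illustration, not anyone's posted code: `multinomial_pmf` is a hypothetical helper, and the count vectors are the ones reported in this thread, ordered by face 1 through 6.

```python
from math import exp, lgamma, log


def multinomial_pmf(counts, probs):
    """Probability of seeing exactly this vector of counts in sum(counts) draws."""
    n = sum(counts)
    # Work in log space: log(n!) - sum(log(k_i!)) + sum(k_i * log(p_i)).
    log_p = lgamma(n + 1)
    for k, p in zip(counts, probs):
        log_p += k * log(p) - lgamma(k + 1)
    return exp(log_p)


fair_die = [1 / 6] * 6
counts_a = [17, 17, 17, 17, 16, 16]  # listA, faces 1..6
counts_b = [16, 17, 24, 12, 14, 17]  # listB, faces 1..6

p_a = multinomial_pmf(counts_a, fair_die)
p_b = multinomial_pmf(counts_b, fair_die)
print(f"P(counts of listA | fair die) = {p_a:.3g}")
print(f"P(counts of listB | fair die) = {p_b:.3g}")
```

One caveat when reading the result: for a fair die the most evenly balanced count vector is the single most probable one, so this number on its own will favor listA. What it does not capture is how rarely real rolls land that close to perfectly even.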

koushiniku@hachyderm.io

@futurebird @Bumblefish
16 17

dlakelan@mastodon.sdf.org

@danpmoore
agreed, the frequencies seem too uniform for the first intuitively.
@futurebird @Bumblefish

charette@mstdn.ca

@futurebird Can you settle the question?

(My vote is that the many 3x repeated sequences in listA are not random, but I'm not dedicated enough to pull out a die and record 100 rolls to see if that is likely to happen a bunch of times.)

madjohnroberts@mastodon.social

@sabrina I think the frequencies being within floor/ceil of 100/6, with the first four being ceil(100/6) and the last two floor(100/6), shows intentionality. I agree the frequencies should be close but not exact! It's harder to say for certain though, 100 samples isn't so much and I think with a larger N the difference would be more apparent with listB showing less volatility
@futurebird @Bumblefish

futurebird@sauropods.win

ListA was created by making a list of 16 or 17 of each number. The Stdev **of the frequencies** is much lower than what you will find on random lists of similar size.

ListB was made by rolling dice.
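A small simulation can illustrate the point about the stdev of the frequencies. This is only a sketch under stated assumptions: the seed, the number of trials, and the helper name `freq_stdev` are arbitrary choices for illustration, Python's `random` module stands in for a fair die, and the population standard deviation is used for the six counts.

```python
import random
from collections import Counter
from statistics import pstdev


def freq_stdev(rolls):
    """Population stdev of how often each face 1..6 appears in rolls."""
    counts = Counter(rolls)
    return pstdev([counts.get(face, 0) for face in range(1, 7)])


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 10_000
simulated = [
    freq_stdev([rng.randint(1, 6) for _ in range(100)]) for _ in range(trials)
]

# Frequencies reported in this thread for the two lists.
stdev_a = pstdev([17, 17, 17, 17, 16, 16])
stdev_b = pstdev([16, 17, 24, 12, 14, 17])

# Share of simulated 100-roll lists whose frequencies are at least as flat.
share_flat_as_a = sum(s <= stdev_a + 1e-9 for s in simulated) / trials
share_flat_as_b = sum(s <= stdev_b + 1e-9 for s in simulated) / trials

print(f"listA freq stdev {stdev_a:.2f}: {share_flat_as_a:.2%} of simulations as flat")
print(f"listB freq stdev {stdev_b:.2f}: {share_flat_as_b:.2%} of simulations as flat")
```

Under these assumptions, frequencies as even as listA's should turn up in only a tiny share of simulated lists, while listB's spread sits comfortably in the typical range.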

futurebird@sauropods.win

@apophis @Bumblefish

I don't think the order should matter. The "problem" isn't related to the order of the list.

rubinlinux@mastodon.sdf.org

@futurebird Think of a chat with an LLM as similar to a chat with a fellow (but maybe not so great) improv performer doing a skit. It is trying to play along with anything you give it. Always.

koushiniku@hachyderm.io

@futurebird I found out quickly that the entropy tools from NIST and Fourmilab don’t work well with a data set that’s log2(6) bits per element.

moira@mastodon.murkworks.net

@futurebird @Bumblefish Heh, this reminds me of something from school where... Evan? Somebody. made a plot of outputs from the system's (pseudo-)random number generator and it turns out there were some _very visible_ patterns. Like, obvious visible stripes in the number selection density plot.

#maths
