Had a lot of fun with my stats students today.

112 Posts 62 Posters 20 Views
  • futurebird@sauropods.win

    The LLM is like a little box of computer horrors that we peer into from time to time.

    I'm sorry but the whole interface is just so silly.

    You ask for random numbers with sentences and it pretends to give them to you? What are we doooooing?

    mastokarl@mastodon.social
    wrote last edited by
    #80

    @futurebird Well, LLMs are tools. Know their limitations. Know their power.

    In your case:

    "create 20 random numbers between 1 and 100 by developing a little python app and running it"

    Some day, AIs will respond to any prompt in a perfect way and we humans will be in deep shit.

    Edit: LOL mistral.ai answers this prompt by generating the random numbers and THEN SORTING THEM. 🤦‍♂️
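For reference, the "little python app" that prompt asks for can be very small. The sketch below assumes the standard `random` module and inclusive bounds; the function name `make_numbers` and the optional `seed` parameter are illustrative choices, not something from the thread.

```python
import random


def make_numbers(count=20, low=1, high=100, seed=None):
    """Return `count` pseudo-random integers in [low, high], inclusive."""
    rng = random.Random(seed)  # seed=None lets Python seed from the OS
    return [rng.randint(low, high) for _ in range(count)]


if __name__ == "__main__":
    # Printed in the order drawn; nothing here sorts the result.
    print(make_numbers())
```

Passing a fixed `seed` makes the run repeatable, which is handy in a classroom; leaving it as `None` gives a different list each time.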

      poleguy@mastodon.social
      wrote last edited by
      #81

      @futurebird The trouble is that people can accept that "factual" output from an LLM may be statistically generated until they hit words that are generated that sound like "reasoning." Then even the most aware humans can get lulled into thinking that the words can be trusted.

      • futurebird@sauropods.win

        "Why don't you just load a library to find the mean and SD?"

        Because I'M OLD. I like to write my own function. I do it for integration sometimes... kids these days.

        gkrnours@mastodon.gamedev.place
        wrote last edited by
        #82

        @futurebird I assume from this post that someone already mentioned the statistics module from the Python standard library?
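Both approaches fit in a few lines, so they are easy to compare. This is a minimal sketch: the hand-written `mean` and `sample_sd` are illustrative names, the data is made up, and the sample (n - 1) form of the standard deviation is assumed so that it matches `statistics.stdev`.

```python
import math
import statistics


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1, like statistics.stdev)."""
    m = mean(values)
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample, not from the thread

# Hand-written versions next to the standard library ones.
print(mean(data), statistics.mean(data))
print(sample_sd(data), statistics.stdev(data))
```

If the population form is wanted instead, divide by `len(values)` and compare against `statistics.pstdev`.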

          seachaint@masto.hackers.town
          wrote last edited by
          #83

          @futurebird there was a study that found that if you give an LLM some prompting to push it into a particular sampling-space (say, "bleeding heart leftie") and then ask it for some random numbers, you can then feed those numbers into another fresh instance and it'll drift towards the same sampling space.

          In other words, even the numerical distributions they sample from can be connected to the broader "noosphere" they're trained on, and that relation is a fucked sort of bijection

            seachaint@masto.hackers.town
            wrote last edited by
            #84

            @futurebird if you prompt it into "stats prof" or "crypto nerd" sampling space does it improve the quality of the fake RNG output?

              • futurebird@sauropods.win

              @Bumblefish

              Which one is random?
              (data sets are 100 numbers 1 to 6)

              listA=[2,3,5,1,2,2,4,2,4,5,2,3,3,4,5,6,4,2,6,2,2,1,3,4,5,5,6,3,3,6,1,4,2,1,4,5,2,2,3,3,3,5,6,3,2,4,5,5,1,1,1,6,1,4,3,5,5,3,1,1,1,6,1,4,6,6,3,6,6,2,4,4,4,5,1,5,6,2,6,1,1,2,4,2,2,3,4,4,5,6,1,3,3,3,5,4,6,5,1,6]

              listB=[4,2,5,6,3,5,3,1,3,4,2,3,4,3,4,5,5,1,3,3,2,1,1,6,1,3,2,2,2,6,1,5,6,3,6,3,2,3,2,4,6,1,1,6,3,2,4,1,6,1,3,1,5,6,2,3,3,5,1,6,4,5,2,5,1,1,5,3,6,2,3,3,6,5,2,3,3,1,6,3,2,3,2,1,6,6,4,4,6,2,4,5,4,5,3,4,6,5,3,2]

              david_chisnall@infosec.exchange
              wrote last edited by
              #85

              @futurebird @Bumblefish

              It’s a trick question. Neither list is random because 7 is the most random number and does not appear in either list. A six-sided die is not able to produce a 7 and cannot therefore produce a random number.

              - ChatGPT, probably.

                tschfflr@fediscience.org
                wrote last edited by
                #86

                @futurebird @Bumblefish I vote for listB: I counted the times that two consecutive numbers are equal (1,1 or 4,4). In listA this occurs ~23 times, so almost 1/4 of the time, which seems too many (it should be around 1/6). In listB it is ~9 times unless I missed some, which seems fewer than expected, but anyway. If I spent more time I'd go for higher-order n-grams.
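Counting by eye is error-prone, so here is a small sketch of the same check. The two lists are copied from the post above; the function name `count_adjacent_repeats` is illustrative. For a fair die each of the 99 adjacent pairs matches with probability 1/6, so about 16.5 repeats are expected.

```python
def count_adjacent_repeats(values):
    """Count positions i where values[i] == values[i + 1]."""
    return sum(1 for a, b in zip(values, values[1:]) if a == b)


list_a = [
    2, 3, 5, 1, 2, 2, 4, 2, 4, 5, 2, 3, 3, 4, 5, 6, 4, 2, 6, 2,
    2, 1, 3, 4, 5, 5, 6, 3, 3, 6, 1, 4, 2, 1, 4, 5, 2, 2, 3, 3,
    3, 5, 6, 3, 2, 4, 5, 5, 1, 1, 1, 6, 1, 4, 3, 5, 5, 3, 1, 1,
    1, 6, 1, 4, 6, 6, 3, 6, 6, 2, 4, 4, 4, 5, 1, 5, 6, 2, 6, 1,
    1, 2, 4, 2, 2, 3, 4, 4, 5, 6, 1, 3, 3, 3, 5, 4, 6, 5, 1, 6,
]
list_b = [
    4, 2, 5, 6, 3, 5, 3, 1, 3, 4, 2, 3, 4, 3, 4, 5, 5, 1, 3, 3,
    2, 1, 1, 6, 1, 3, 2, 2, 2, 6, 1, 5, 6, 3, 6, 3, 2, 3, 2, 4,
    6, 1, 1, 6, 3, 2, 4, 1, 6, 1, 3, 1, 5, 6, 2, 3, 3, 5, 1, 6,
    4, 5, 2, 5, 1, 1, 5, 3, 6, 2, 3, 3, 6, 5, 2, 3, 3, 1, 6, 3,
    2, 3, 2, 1, 6, 6, 4, 4, 6, 2, 4, 5, 4, 5, 3, 4, 6, 5, 3, 2,
]

for name, values in (("listA", list_a), ("listB", list_b)):
    pairs = len(values) - 1
    repeats = count_adjacent_repeats(values)
    expected = pairs / 6  # fair die: a pair matches with probability 1/6
    print(f"{name}: {repeats} repeats in {pairs} pairs "
          f"(expected about {expected:.1f})")
```

The same `zip` pattern extends to longer windows (`zip(values, values[1:], values[2:])`) for the higher-order n-grams mentioned at the end.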

                  • okohll@hachyderm.io

                  @futurebird haven't tried it but maybe it's also all mixed up with non-random numbers in training content e.g. the next number after '20' is likely one of 0, 1 or 2, the start of a 21st century year so far. Or Benford's law https://en.wikipedia.org/wiki/Benford%27s_law
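For anyone who has not met Benford's law: it says that in many real-world data sets the leading digit d appears with probability log10(1 + 1/d). A quick sketch of that formula, with an illustrative function name:

```python
import math


def benford_probability(digit):
    """Benford's law: P(leading digit = d) = log10(1 + 1/d) for d in 1..9."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


for d in range(1, 10):
    print(d, round(benford_probability(d), 3))
```

A leading 1 comes out at roughly 30% and a leading 9 at under 5%, so text full of real-world figures is far from uniform in its digits.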

                  cstross@wandering.shop
                  wrote last edited by
                  #87

                  @okohll @futurebird I was about to suggest Benford's Law too!

                    • ai6yr@m.ai6yr.org

                    @ohmu @futurebird LOL 42 and 73 are my picks for "random" numbers out of the LLMs, for now.

                    meuwese@mastodon.social
                    wrote last edited by
                    #88

                    @ai6yr @ohmu @futurebird wait so... is that the ultimate question? "What number will an LLM always include when generating random numbers?"

                      • burnitdown@beige.party

                      @Life_is @futurebird that's still the contents of RAM, whatever an NDO is.

                      life_is@no-pony.farm
                      wrote last edited by
                      #89
                      @burnitdown@beige.party @futurebird@sauropods.win raNDOm. A play on words.

                        okohll@hachyderm.io
                        wrote last edited by
                        #90

                        @cstross @futurebird God does play dice, but there’s a big lead weight in one side

                          thisalex@hachyderm.io
                          wrote last edited by
                          #91

                          @futurebird
                          > what are we doing?

                          I think the best description is that we are taking part in a play. The LLM makes its best effort to write a continuation of the dialogue that looks plausible to the reader. Choose your own adventure.

                            mildouze@mamot.fr
                            wrote last edited by
                            #92

                            @futurebird @Bumblefish
                            B
                            (Random answer) 🙂

                              lamecarlate@pouet.it
                              wrote last edited by
                              #93

                              @futurebird @Bumblefish I'm no stats student, so maybe I don't have the basics (for lack of a better term, English is not my main language), but I think listA is the random one. The fact that there are nearly no triplets in listB seems too good to be true.

                                • abyssalrook@mstdn.social

                                @futurebird Before I look at where the answer shows up, my guess would be that List A is random.

                                The odds of both dice being the same number when you roll 2 dice is 1/6 (36 possibilities, 6 desired results). For 3, that becomes 1/36. (6*6*6 possibilities, 6 desired).

                                What we have here is 98 consecutive possible places for a 3-of-a-kind to start. The odds that you would only draw the 1/36 chance ONCE (The 3 2's near the beginning of B) is something like....8%?
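The arithmetic here is easy to check. The sketch below makes the simplifying assumption that the 98 starting positions are independent trials with probability 1/36 each (they are not quite, since the windows overlap) and applies the binomial formula. The helper name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_windows = 98     # starting positions for a run of three in 100 rolls
p_triple = 1 / 36  # chance that three given rolls all show the same face

p_none = binomial_pmf(0, n_windows, p_triple)
p_exactly_one = binomial_pmf(1, n_windows, p_triple)

print(f"at least one triple: {1 - p_none:.3f}")
print(f"exactly one triple:  {p_exactly_one:.3f}")
```

Under that independence assumption the result is roughly 94% for at least one triple and roughly 18% for exactly one, so the 8% guess looks low. Both figures are approximations because of the overlap.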

                                ingalovinde@embracing.space
                                wrote last edited by
                                #94

                                @AbyssalRook @futurebird I see two mistakes in your reasoning.
                                One is technical: the events "numbers at positions N, N+1 and N+2 are the same" for different values of N are _not_ independent of each other. (For example, if we know that this statement is true for N=10, then the likelihood of it being true for N=11 is 1/6, not 1/36.)
                                The other points to a deeper problem with a lot of modern research that relies heavily on p-values: consider how many statements of this kind, containing the same amount of information, you could make. Unless you commit to a specific statement beforehand, before seeing the data, "this statement would only be true in 8% of cases for truly random data" does not really mean anything if it's just one out of 20 equally "interesting" statements one could make about the data (e.g. "how many triplets of incrementing numbers (modulo six) are there", "how many decrementing triplets are there", etc.), each only 8% likely. Of course it is expected that for most random sequences, a few of these individually not very likely statements will be true.
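One way to sidestep the independence question is to simulate it. The sketch below rolls 100 fair dice many times and tallies how many overlapping triple windows appear. The function names, trial count, and seed are illustrative, and the last line treats the 20 candidate patterns as independent, which is an assumption.

```python
import random
from collections import Counter


def count_triple_windows(rolls):
    """Count positions n where rolls[n] == rolls[n+1] == rolls[n+2]."""
    return sum(
        1
        for a, b, c in zip(rolls, rolls[1:], rolls[2:])
        if a == b == c
    )


def simulate(trials=10_000, length=100, seed=0):
    """Estimate the distribution of triple-window counts for fair-die rolls."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(length)]
        counts[count_triple_windows(rolls)] += 1
    return {k: v / trials for k, v in sorted(counts.items())}


if __name__ == "__main__":
    for k, share in simulate().items():
        print(f"{k} triple windows: {share:.3f}")

    # Second point: with 20 candidate patterns at 8% each (treated as
    # independent here), seeing at least one of them is the usual outcome.
    print(f"at least one of 20: {1 - 0.92 ** 20:.2f}")
```

Note that a run of four identical rolls counts as two windows, which is exactly the dependence described in the first point.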

                                  futurebird@sauropods.win
                                  wrote last edited by
                                  #95

                                  @lamecarlate @Bumblefish

                                  I've got some bad news. I've posted the solution with a CW on the original thread.

                                    futurebird@sauropods.win
                                    wrote last edited by
                                    #96

                                    @IngaLovinde @AbyssalRook

                                    It's been really helpful for me to see how many people focused on the order of the numbers in the list, which I didn't think very important since the list is so short that that type of analysis might not be that useful.

                                    I used the random list to scramble the fake numbers twice. I should have scrambled them more.
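An order-independent check of the kind described here looks only at how often each face appears. A minimal sketch, assuming a Pearson chi-square statistic against equal expected counts is a fair stand-in for the analysis used in class (the thread does not say which test was used); the function name is illustrative and the lists are copied from the post above.

```python
from collections import Counter


def chi_square_uniform(values, faces=6):
    """Pearson chi-square statistic against equal expected counts per face."""
    counts = Counter(values)
    expected = len(values) / faces
    return sum(
        (counts.get(face, 0) - expected) ** 2 / expected
        for face in range(1, faces + 1)
    )


list_a = [
    2, 3, 5, 1, 2, 2, 4, 2, 4, 5, 2, 3, 3, 4, 5, 6, 4, 2, 6, 2,
    2, 1, 3, 4, 5, 5, 6, 3, 3, 6, 1, 4, 2, 1, 4, 5, 2, 2, 3, 3,
    3, 5, 6, 3, 2, 4, 5, 5, 1, 1, 1, 6, 1, 4, 3, 5, 5, 3, 1, 1,
    1, 6, 1, 4, 6, 6, 3, 6, 6, 2, 4, 4, 4, 5, 1, 5, 6, 2, 6, 1,
    1, 2, 4, 2, 2, 3, 4, 4, 5, 6, 1, 3, 3, 3, 5, 4, 6, 5, 1, 6,
]
list_b = [
    4, 2, 5, 6, 3, 5, 3, 1, 3, 4, 2, 3, 4, 3, 4, 5, 5, 1, 3, 3,
    2, 1, 1, 6, 1, 3, 2, 2, 2, 6, 1, 5, 6, 3, 6, 3, 2, 3, 2, 4,
    6, 1, 1, 6, 3, 2, 4, 1, 6, 1, 3, 1, 5, 6, 2, 3, 3, 5, 1, 6,
    4, 5, 2, 5, 1, 1, 5, 3, 6, 2, 3, 3, 6, 5, 2, 3, 3, 1, 6, 3,
    2, 3, 2, 1, 6, 6, 4, 4, 6, 2, 4, 5, 4, 5, 3, 4, 6, 5, 3, 2,
]

for name, values in (("listA", list_a), ("listB", list_b)):
    counts = dict(sorted(Counter(values).items()))
    print(name, counts, f"chi-square = {chi_square_uniform(values):.2f}")
```

With six faces there are 5 degrees of freedom, so a fair die gives a statistic of about 5 on average. A value very close to 0 means the counts are almost too even, which is as suspicious as a very large value.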

                                      abyssalrook@mstdn.social
                                      wrote last edited by
                                      #97

                                      @IngaLovinde I'm not following the first problem in the logic. The situation you're describing might be important if we're looking at more and more instances of it happening, but looking at it happening at least once (~94%) doesn't change at all, and it happening ONLY once might jiggle the ~8% estimate I had, but not significantly move it.

                                        flockofcats@famichiki.jp
                                        wrote last edited by
                                        #98

                                        @Bumblefish @futurebird
                                        That was an interesting thread. Our brains are wired to think certain things are “random” when they’re not, so when people try to create something that looks random, they often avoid repeated numbers, even though there’d be repeats, if truly random, with some expected frequency. Also, odd numbers are often overrepresented cuz they feel more random, e.g., 5973 vs 6084. This “looks random, but isn’t” pattern often comes up when people fabricate scientific data 🤓

                                          abyssalrook@mstdn.social
                                          wrote last edited by
                                          #99

                                          @IngaLovinde As for the latter, that is entirely true from a research perspective, but I picked the 3-of-a-kind pattern because I assumed the non-random list was entirely human constructed, and that particular pattern is one that sticks out to us the most. Someone making a list by hand is more likely to see "6-6-6" as less random than "6-1-2" or "3-4-5".

                                          I did not clock 'Which is random?' as one being a dice roll and the other being a shuffled deck of prescribed cards.
