Had a lot of fun with my stats students today.

112 Posts 62 Posters 20 Views
  • life_is@no-pony.farm wrote:
    @burnitdown@beige.party You need to put some NDO into the ram. @futurebird@sauropods.win
    burnitdown@beige.party wrote (#15):

    @Life_is @futurebird that's still the contents of RAM, whatever an NDO is.

    • futurebird@sauropods.win wrote:

      Had a lot of fun with my stats students today. I gave them two data sets. One from a random number generator, the other was one I made up that was not random, but designed to look random. They were able to figure out which one was fake.

      Then we had ChatGPT make the same kind of data set (random numbers 1-6 set of 100) and it had the same problems as my fake set but in a different way.

      We talked about the study about AI generated passwords.

      geepawhill@mastodon.social wrote (#16):

      @futurebird As you so often do, you sent me off on a tangent. My favorite PRNG is in Knuth, and it's called Algorithm A there. It is entirely additive, so very fast, and has a period of 2^54.

      I spent *years* trying to figure out why nobody ever used it or even mentioned it.

      Finally discovered that it has another name, and that it is quite frequently used today. 🙂

      I have, of course, completely forgotten its other name, which somebody here on fedi actually told me.
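
For anyone following the tangent, here is a minimal Python sketch of an additive generator of this kind. It assumes the post means the additive number generator from Knuth's TAOCP Vol. 2, Section 3.2.2, with the recurrence X_n = (X_{n-24} + X_{n-55}) mod m. Generators of this shape are usually called lagged Fibonacci generators, which may be the "other name". The class name, the 32-bit modulus, the `roll` helper, and seeding from Python's `random` module are illustrative assumptions, not anything stated in the thread.

```python
import random
from collections import deque


class AdditiveGenerator:
    """Additive (lagged Fibonacci) generator: X_n = (X_{n-24} + X_{n-55}) mod m."""

    LONG_LAG = 55
    SHORT_LAG = 24

    def __init__(self, seed, modulus=2**32):
        self.modulus = modulus
        # Assumption: fill the first 55 values from Python's own PRNG.
        filler = random.Random(seed)
        initial = [filler.randrange(modulus) for _ in range(self.LONG_LAG)]
        initial[0] |= 1  # the initial values must not all be even
        # state[0] holds X_{n-55}; state[-1] holds X_{n-1}.
        self.state = deque(initial, maxlen=self.LONG_LAG)

    def next(self):
        value = (self.state[-self.SHORT_LAG] + self.state[0]) % self.modulus
        self.state.append(value)  # maxlen silently drops the oldest entry
        return value

    def roll(self):
        """Map one output to a die face 1..6 using the high bits (tiny bias ignored)."""
        return self.next() * 6 // self.modulus + 1


gen = AdditiveGenerator(seed=2024)
print([gen.roll() for _ in range(20)])
```

Each step is one addition and one reduction, which is where the speed comes from. The high bits are used for the die face because the low bits of this kind of generator are the weakest.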

      • burnitdown@beige.party wrote:

        @futurebird

        FUN FACT: random ain't random. especially in computers.

        if you ask for "random" output from a computer, there is no guarantee that what comes out isn't actually from the contents of RAM.

        dpiponi@mathstodon.xyz wrote (#17):

        @burnitdown @futurebird These days if you really want random numbers you can have them. E.g., RDRAND on Intel chips is seeded by analogue circuitry, not by some state updated in RAM. And even if you don't use RDRAND directly, its output is still used as a source of entropy for other generators.
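
In practice most programs reach this kind of entropy through the operating system rather than by calling RDRAND themselves. A minimal Python sketch, using the standard-library `secrets` module, which draws from the OS's cryptographic generator. Whether RDRAND specifically feeds that generator depends on the hardware and the OS, so treat that link as an assumption; `roll_die` is a name made up for this example.

```python
import secrets


def roll_die():
    """One fair die roll (1..6) drawn from the operating system's CSPRNG."""
    return secrets.randbelow(6) + 1


rolls = [roll_die() for _ in range(100)]
print(rolls)
```

Unlike a seeded generator, this output cannot be reproduced by rerunning the script, which is the point.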

        • futurebird@sauropods.win wrote:

          There is something very creepy about the way LLMs will cheerfully give lists of "random" numbers. But they aren't random in frequency, and as my students pointed out, "it's probably from some webpage about how to generate random numbers."

          But even then, why is the frequency so unnaturally regular? Is that an artifact from mixing lists of real random numbers together?

          mcc@mastodon.social wrote (#18):

          @futurebird i mean the LLM itself is just a statistical distribution… the path through the distribution is i assume randomized, but the distribution itself is gonna be the same every time.

            futurebird@sauropods.win wrote (#19):

            The LLM is like a little box of computer horrors that we peer into from time to time.

            I'm sorry but the whole interface is just so silly.

            You ask for random numbers with sentences and it pretends to give them to you? What are we doooooing?

              grumpasaurus@infosec.exchange wrote (#20):

              @futurebird it really puts into perspective what my interaction with real people is like

                dpiponi@mathstodon.xyz wrote (#21):

                @futurebird It's very weird.

                In principle, if you take an LLM, you should be able to get it to generate random numbers in a way that reflects the numbers that appear in the corpus it was trained on. If you have the raw model you can probably do that.

                But if you ask ChatGPT (or at least if I do) it starts talking about how numbers taken from around us typically follow Benford's law so their first digits have a logarithmic distribution. When it then spits out some random numbers it's no longer sampling random numbers from the entire corpus but a sample that's probably heavily biased towards numbers that appear in articles about Benford's law. I.e. what people have previously said about these numbers, rather than the actual numbers.

                  perigee@rage.love wrote (#22):

                  @futurebird as others here have said or implied, I think LLMs are trained not to be random. Like as a structural part of the statistical models they're based on, so the input corpus will inform the "random" output.

                  Speaking as a long-time, not-mathematically-rigorous-enough amateur cryptographer: most humans don't understand (not talking about you or your students, to be clear) that truly random data can contain sequences and patterns, or parts of them. So when an uninformed human evaluates "randomness", they don't accept sequences with patterns as random, even if those patterns are accidental coincidences.

                  Related, there's also the old cryptography parable that if a low-ranking person in the security organization draws numbers at random for, say, a one-time pad, the results won't really be random if that person looks into the hat or drum they are picking from, because they will subconsciously bias toward patterns like letter and number frequency from their experience and expectations, which might help an attacker decrypt the pad. Maybe.

                  Since the LLM is supposed to emulate human output it makes sense it might mess with "randomness".
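
The point about genuinely random data containing patterns can be checked by simulation. A rough Python sketch, assuming a fair six-sided die and using the length of the longest run of identical faces as the "pattern"; the function names, trial count, and seed are illustrative choices.

```python
import random


def longest_run(values):
    """Length of the longest stretch of identical consecutive values."""
    best = current = 0
    previous = object()  # sentinel that equals nothing in the data
    for v in values:
        current = current + 1 if v == previous else 1
        best = max(best, current)
        previous = v
    return best


def simulated_longest_runs(n_rolls=100, trials=2000, seed=0):
    """Tally the longest run seen in each of `trials` simulated sets of die rolls."""
    rng = random.Random(seed)
    tally = {}
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
        run = longest_run(rolls)
        tally[run] = tally.get(run, 0) + 1
    return dict(sorted(tally.items()))


print(simulated_longest_runs())
```

The printed tally shows how often each longest-run length turns up in 100 fair rolls, which gives a baseline for judging whether a hand-made list has too few repeats.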

                    futurebird@sauropods.win wrote (#23):

                    @Bumblefish

                    Which one is random?
                    (data sets are 100 numbers 1 to 6)

                    listA=[2,3,5,1,2,2,4,2,4,5,2,3,3,4,5,6,4,2,6,2,2,1,3,4,5,5,6,3,3,6,1,4,2,1,4,5,2,2,3,3,3,5,6,3,2,4,5,5,1,1,1,6,1,4,3,5,5,3,1,1,1,6,1,4,6,6,3,6,6,2,4,4,4,5,1,5,6,2,6,1,1,2,4,2,2,3,4,4,5,6,1,3,3,3,5,4,6,5,1,6]

                    listB=[4,2,5,6,3,5,3,1,3,4,2,3,4,3,4,5,5,1,3,3,2,1,1,6,1,3,2,2,2,6,1,5,6,3,6,3,2,3,2,4,6,1,1,6,3,2,4,1,6,1,3,1,5,6,2,3,3,5,1,6,4,5,2,5,1,1,5,3,6,2,3,3,6,5,2,3,3,1,6,3,2,3,2,1,6,6,4,4,6,2,4,5,4,5,3,4,6,5,3,2]

                      noplasticshower@infosec.exchange wrote (#24):

                      @futurebird @Bumblefish that question makes no sense

                        gatesvp@mstdn.ca wrote (#25):

                        @futurebird I am reminded of a Doctor Who episode, where they realize they are in a simulation because they are incapable of generating truly random numbers. One scene has a whole bunch of scientists sitting at a table and they all keep yelling the same number at the same time.

                          digitalcalibrator@hol.ogra.ph wrote (#26):

                          @dpiponi@mathstodon.xyz @burnitdown@beige.party @futurebird@sauropods.win and Cloudflare famously uses a camera pointed at a wall of lava lamps because the motion is random

                            zalasur@mastodon.surazal.net wrote (#27):

                            @futurebird @Bumblefish There's literally no way to say whether a list of numbers is random or not (1, 1, 1, 1, etc can plausibly be a random sequence for all we know), though you can establish likelihoods by looking at the distribution.

                              ramsey@phpc.social wrote (#28):

                              @futurebird @Bumblefish The only way you could determine that something’s not random is if a pattern emerges in the data set. Even then, statistically, it is possible for a CSPRNG with good entropy to produce a random data set that looks like it’s not random: unlikely, but possible.

                                jedbrown@hachyderm.io wrote (#29):

                                @dpiponi Even with a raw model, I don't see how you would sample from the distribution of numbers in the corpus. Perhaps provide no context and sample one or more tokens (using an independent pseudo-random number generator) from the distribution, and if the returned token parses as a number, return it to the user, otherwise try again. Providing any context/prompt would bias what is returned. This seems too contrived/circular.
                                @futurebird
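
The "sample with no context, retry until the token parses as a number" procedure described here can be sketched as rejection sampling. This is a toy under loud assumptions: `TOY_DISTRIBUTION` is a made-up stand-in for a model's next-token probabilities (no real model or API is involved), and `sample_number` is a hypothetical helper name.

```python
import random

# Hypothetical next-token distribution for an empty context (made-up weights).
TOY_DISTRIBUTION = {"the": 0.40, "and": 0.25, "7": 0.15, "3": 0.10, "42": 0.05, "cat": 0.05}


def sample_number(distribution, rng, max_tries=1000):
    """Draw tokens with an independent PRNG until one parses as a number."""
    tokens = list(distribution)
    weights = list(distribution.values())
    for _ in range(max_tries):
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token.isdigit():
            return int(token)  # accept: the token is numeric
        # reject: not a number, so sample again
    raise RuntimeError("no numeric token sampled")


rng = random.Random(12345)
print([sample_number(TOY_DISTRIBUTION, rng) for _ in range(10)])
```

The accepted numbers follow the fixed distribution restricted to numeric tokens, so in this toy "7" comes back about three times as often as "42" no matter how the independent PRNG is seeded.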

                                  futurebird@sauropods.win wrote (#30):

                                  @zalasur @Bumblefish

                                  You *can* make an argument for one of these lists being random like a dice roll and the other being much less likely to be generated in that way.
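
One way such an argument could be made, sketched in Python. It assumes the comparison is against a fair six-sided die and uses two common checks: a chi-square statistic on face frequencies and the rate at which a roll repeats the previous one. The function names are illustrative, and the thread does not say which tests the class actually used.

```python
from collections import Counter

list_a = [2,3,5,1,2,2,4,2,4,5,2,3,3,4,5,6,4,2,6,2,2,1,3,4,5,5,6,3,3,6,1,4,2,1,4,5,2,2,3,3,3,5,6,3,2,4,5,5,1,1,1,6,1,4,3,5,5,3,1,1,1,6,1,4,6,6,3,6,6,2,4,4,4,5,1,5,6,2,6,1,1,2,4,2,2,3,4,4,5,6,1,3,3,3,5,4,6,5,1,6]
list_b = [4,2,5,6,3,5,3,1,3,4,2,3,4,3,4,5,5,1,3,3,2,1,1,6,1,3,2,2,2,6,1,5,6,3,6,3,2,3,2,4,6,1,1,6,3,2,4,1,6,1,3,1,5,6,2,3,3,5,1,6,4,5,2,5,1,1,5,3,6,2,3,3,6,5,2,3,3,1,6,3,2,3,2,1,6,6,4,4,6,2,4,5,4,5,3,4,6,5,3,2]


def face_counts(rolls, faces=6):
    """How many times each face 1..faces appears."""
    counts = Counter(rolls)
    return [counts.get(face, 0) for face in range(1, faces + 1)]


def chi_square_uniform(rolls, faces=6):
    """Chi-square statistic against equal expected counts for every face."""
    expected = len(rolls) / faces
    return sum((obs - expected) ** 2 / expected for obs in face_counts(rolls, faces))


def repeat_rate(rolls):
    """Fraction of rolls equal to the roll before them (about 1/6 for a fair die)."""
    pairs = list(zip(rolls, rolls[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)


for name, rolls in (("listA", list_a), ("listB", list_b)):
    print(name, face_counts(rolls), round(chi_square_uniform(rolls), 2), round(repeat_rate(rolls), 3))
```

With six faces there are 5 degrees of freedom, so a statistic above roughly 11.07 would be surprising for a fair die at the 5% level. A statistic very close to zero is suspicious in the other direction, since hand-made data often balances the faces too evenly.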

                                    futurebird@sauropods.win wrote (#31):

                                    @ramsey @Bumblefish

                                    Only one of these lists could *plausibly* be from rolling dice.

                                      ramsey@phpc.social wrote (#32):

                                      @futurebird @Bumblefish I have a UUID-generating library that, under certain conditions, could generate identical UUIDs because the CSPRNG it used ended up reusing the same entropy seed, unless the server was restarted. That was a *fun* bug to investigate and fix. 😉
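
The failure mode is easy to reproduce in miniature. A small Python sketch, not the library in question: it uses `random.Random`, which is not a CSPRNG, purely because it can be seeded on purpose, and `uuid4_from` is a hypothetical helper.

```python
import random
import uuid


def uuid4_from(rng):
    """Build a version-4 UUID from 128 bits of the supplied generator."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


first_process = random.Random(1234)
second_process = random.Random(1234)  # the same seed, reused by mistake

# Identical seeds give identical streams, hence identical UUIDs.
print(uuid4_from(first_process) == uuid4_from(second_process))  # prints True
```

The generator itself is working as designed in this case; the collision comes entirely from the seed.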

                                        ramsey@phpc.social wrote (#33):

                                        @futurebird @Bumblefish Based on the statistical distribution of the dice rolls?

                                          raederle@masto.nu wrote (#34):

                                          @futurebird @Bumblefish I like list A for random and list B for “planned random”.
