
My lab mate, Jackson Dean, has been doing some really fun research into image generation.

Uncategorized · science, generativeart
15 Posts, 4 Posters, 46 Views

This topic has been deleted. Only users with topic management privileges can see it.
  • kevinrns@mstdn.social wrote:

    @ngaylinn

    Remember they are scraping alt text on Mastodon images, so poison your descriptions in weird ways humans see but AI can't, because it's stupid.

    ngaylinn@tech.lgbt replied (#5):

    @kevinrns Just don't poison alt text for humans! It serves an important purpose.
  • ngaylinn@tech.lgbt wrote:

      Jackson's research is great for exploring this, because we get to see abstract synthetic images that very strongly stimulate the AI to see... whatever it "wants" to see.

      Often, the results are recognizable. The image with oddly shaped pink blobs does sorta resemble flamingos. But there are also many examples where the AI fixates on some small detail of color or texture and becomes convinced it's seeing something totally implausible.

      This relates to "adversarial examples", another great way to see this.

      With real images, it seems like the AI "sees" like we do. But as soon as we venture beyond its training data, the illusion is broken, and it feels a bit like a parlor trick. Clearly AI doesn't see like we do.

      This is a great practice for AI generally: seek out the edge cases where the model fails. This breaks the spell of "general intelligence" and gives us a clearer idea of what's actually happening inside the black box.

      (3/3)
      #science #ai #generativeart

      lilacperegrine@clockwork.monster replied (#6):

      @ngaylinn Are there images I can look at? There were some in the article, but they appeared to be referencing previous work.
  • ngaylinn@tech.lgbt replied to #6 (#7):

    @lilacperegrine Alas, this is still very early work in progress! I'll share once Jackson does! If you look him up on Google Scholar, though, you can see some of his other image generation projects, like this one: https://direct.mit.edu/isal/proceedings/isal2024/36/86/123507
  • ngaylinn@tech.lgbt wrote:

      My lab mate, Jackson Dean, has been doing some really fun research into image generation.

      Unlike the common AI-generated images that mash together stolen artwork to make something sorta photorealistic, he's producing abstract art that's entirely novel. The general idea (inspired by innovation engines) is to generate an image from scratch, then ask a vision / language model what it sees. He generates lots of images with different descriptions, and refines those images to more closely resemble their descriptions.

      Not only is he making some really cool generative art, but he's learning something about what "novelty" is and how to produce it in a computer.

      Beyond that, though, I'm fascinated because it gives a window into the strange way computers "see" images.

      (1/3)
      #science #ai #generativeart

      fishface@ioc.exchange replied (#8):

      @ngaylinn the approach of "generate an image, see what the CV model thinks it is, and iterate" is, at that high level, just like how diffusion models work.
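[Editor's note: the generate/describe/refine loop from the (1/3) post above can be sketched as a toy innovation-engine. Everything here — scalar "genomes" in place of images, the stand-in `describe` function in place of a vision / language model — is illustrative, not the actual research code:]

```python
import random

def innovation_engine(describe, mutate, fresh, steps=200, seed=0):
    """Toy innovation-engine loop: keep one champion per description,
    and refine champions so the "vision model" (here a stand-in)
    grows more confident in its own description of them."""
    random.seed(seed)
    archive = {}  # description -> (confidence, genome)
    for _ in range(steps):
        if archive and random.random() < 0.7:
            # Refine an existing champion...
            _, parent = random.choice(list(archive.values()))
            genome = mutate(parent)
        else:
            # ...or start a fresh "random" image from scratch.
            genome = fresh()
        label, conf = describe(genome)
        # A genome enters the archive only by beating the current
        # champion for whatever label the model assigned it.
        if label not in archive or conf > archive[label][0]:
            archive[label] = (conf, genome)
    return archive

# Stand-in "vision model": labels a number by its sign, with
# confidence growing with magnitude (capped at 1).
describe = lambda g: ("positive" if g >= 0 else "negative", min(abs(g), 1.0))
archive = innovation_engine(
    describe,
    mutate=lambda g: g + random.gauss(0, 0.2),
    fresh=lambda: random.uniform(-1, 1),
)
```

The archive ends up holding one high-confidence champion per description, which is the "lots of images with different descriptions" part of the post.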
  • kevinrns@mstdn.social replied to #5 (#9):

    @ngaylinn

    But always poison.
  • ngaylinn@tech.lgbt replied to #8 (#10):

    @FishFace That's true! What's different here, though, is that the generation procedure isn't attempting to sample from the distribution of "all natural images" learned from its training data. Instead, a CPPN is used to generate a "random" image with spatially coherent structure from scratch.

    This is nice, because it means the images are novel, not remixes of stolen data. Also, it allows us to explore the limitations of computer vision, since we're straying far from the distribution of images the model was trained on.
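[Editor's note: for readers unfamiliar with CPPNs (compositional pattern-producing networks), the key idea is that every pixel's value is a function of its coordinates, so even a randomly weighted network produces smooth, organic patterns rather than static. A minimal numpy sketch — the particular architecture (one sine hidden layer) is just one illustrative choice:]

```python
import numpy as np

def cppn_image(size=64, hidden=8, seed=0):
    """Render a grayscale image from a tiny random CPPN. Each pixel
    is a function of its (x, y) coordinates, so nearby pixels get
    similar values and the output has smooth, coherent structure."""
    rng = np.random.default_rng(seed)
    # Coordinate grid in [-1, 1], plus distance-from-center as a
    # third input (a common CPPN trick that encourages radial forms).
    ys, xs = np.meshgrid(np.linspace(-1, 1, size),
                         np.linspace(-1, 1, size), indexing="ij")
    r = np.sqrt(xs ** 2 + ys ** 2)
    inputs = np.stack([xs, ys, r], axis=-1)       # shape (size, size, 3)
    # One random hidden layer with sine activations, one output unit.
    w1 = rng.normal(size=(3, hidden))
    w2 = rng.normal(size=(hidden, 1))
    out = np.tanh(np.sin(inputs @ w1) @ w2)[..., 0]
    return (out + 1.0) / 2.0                      # values in [0, 1]

img = cppn_image()
```

Mutating the weights of such a network perturbs the whole image smoothly, which is what makes it a friendly genome for evolutionary search.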
  • ngaylinn@tech.lgbt continued (#11):

    @FishFace Also, the model isn't guided towards any particular prompt. The prompts are discovered through random search, then used to refine those starting points.
  • fishface@ioc.exchange replied to #10 (#12):

    @ngaylinn doesn't the CV model's training dataset have the same issues, though? Whether the model has learnt "denoising" or "image to text", it still has to contain a hell of a lot of information about images, right?
  • ngaylinn@tech.lgbt replied to #12 (#13):

    @FishFace Yes, and it is a subtle difference. I wish I could share the images, since I think that would make it more apparent. 🙂

    In a diffusion model, you iteratively tweak an image of some static until the result is statistically similar to the images used in training.

    In this experiment, you generate "random" images, but with the unique bias of CPPN networks, so they look more like "organic shapes" than static. You treat them like Rorschach tests, asking the CV model what it sees. Then, for each different answer, you iterate the image so the CV model is even more confident. Except you aren't tweaking pixels to approach the target distribution; you're just giving hot / cold feedback to an evolutionary search.

    The resulting images are far outside the distribution of the original dataset and look like abstract art, but still stimulate the CV model to be very confident about what it's seeing.
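[Editor's note: the "hot / cold feedback to an evolutionary search" can be sketched as a simple hill-climber. The point of the sketch is that the search only ever sees a scalar confidence score — never gradients, never a target image. The genome and the `confidence` function here are toy stand-ins, not the actual CV model:]

```python
import numpy as np

def evolve_to_confidence(score, genome_size=16, pop=20, gens=50, seed=0):
    """Hill-climb a genome so that score(genome) -- a stand-in for
    the CV model's confidence in one label -- keeps increasing. The
    search only gets "hotter / colder" feedback (the scalar score)."""
    rng = np.random.default_rng(seed)
    best = rng.normal(size=genome_size)
    best_score = score(best)
    for _ in range(gens):
        # Mutate: small Gaussian tweaks to copies of the current best.
        children = best + 0.1 * rng.normal(size=(pop, genome_size))
        scores = np.array([score(c) for c in children])
        # Only ever accept improvements, so confidence never drops.
        if scores.max() > best_score:
            best, best_score = children[scores.argmax()], scores.max()
    return best, best_score

# Toy "confidence": peaks at 1.0 when the genome matches a hidden target.
target = np.linspace(-1.0, 1.0, 16)
confidence = lambda g: float(np.exp(-np.mean((g - target) ** 2)))
g, s = evolve_to_confidence(confidence)
```

Swap the toy scorer for "softmax probability of one class label" and the scalar genome for CPPN weights, and this is the shape of the Rorschach-style loop described above.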
  • ngaylinn@tech.lgbt continued (#14):

    @FishFace Another way of looking at this: a diffusion model is trying to make an image that resembles known images for a prompt. That's its loss function: minimize deviation from the target image distribution.

    In this experiment, we're just asking "what would you call this thing?" without concern for how much it resembles other images with the same description. The fitness function is to get a confident response. You're evolving a Rorschach test where the model always sees a bird, even though it looks nothing like a picture of a bird.
  • fishface@ioc.exchange replied to #13 (#15):

    @ngaylinn ah, thanks for the explanation