
There will never be an AI tool that is truly private unless it hasn't trained on nonconsensual data.

Uncategorized
Tags: privacy, consent, humanrights, noai
19 Posts · 12 Posters · 6 Views
This topic has been deleted. Only users with topic management privileges can see it.
  • em0nm4stodon@infosec.exchange wrote:

    There will never be an AI tool that is truly private unless it hasn't trained on nonconsensual data.

    Even if a platform were able to create the perfect protections for its users' prompts and results: if the platform is built from or is utilizing an AI model that was trained on, or is updated and optimized with, data that was scraped from millions of people without their consent, then of course this platform isn't "privacy-respectful."

    How could it be?

    The company is saying:
    "We respect the privacy of our users while they are using our platform, but outside of it, it's fair game."

    Users thinking they are using a privacy-respectful platform are in fact saying:

    "Privacy for me and not for thee,"

    And are directly contributing to the platform needing to scrape even more nonconsensual data to improve.

    Always ask: where does the training data come from?

    Without the assurance that a platform only uses AI models that have been trained solely on data acquired ethically, it is not a privacy-respectful platform.

    #Privacy #AI #Consent #HumanRights #NoAI

  • gbargoud@masto.nyc wrote (#2):

    RE: https://masto.nyc/@gbargoud/115822346288522227

    @Em0nM4stodon

    I hope they take this issue up again in the next session, where they might vote on it.
  • awalter@mastodon.bawue.social wrote (#3), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon What if you run a Large Language Model locally on your device?
  • em0nm4stodon@infosec.exchange wrote (#4), in reply to awalter@mastodon.bawue.social:

    @awalter Where does the data to train the LLM initially come from?
  • watchfulcitizen@goingdark.social wrote (#5), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon Well said! An interesting thought, however: what counts as ethical scraping? All public data? No scraping at all? Respecting robots.txt?

    I fully agree with you. Another issue is the lack of transparency from those who train. It's largely unknown what data has been used or where it came from.

    I'm not saying we shouldn't invest in AI. But the current form isn't ethical.
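    For readers wondering what "respecting robots.txt" looks like in practice: a site owner can publish a plain-text robots.txt file at the site root asking specific crawlers to stay away. The user-agent tokens below are ones the respective operators have documented publicly; example.org is a placeholder, and honoring the file is entirely voluntary on the crawler's side, which is exactly what the next reply pushes back on. A minimal sketch:

        # robots.txt served at https://example.org/robots.txt (placeholder domain)
        # Ask known AI-training crawlers not to fetch anything on this site.
        User-agent: GPTBot
        Disallow: /

        User-agent: Google-Extended
        Disallow: /

        User-agent: CCBot
        Disallow: /

        # All other crawlers may proceed as usual.
        User-agent: *
        Disallow: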
  • tmw@ioc.exchange wrote (#6), in reply to watchfulcitizen@goingdark.social:

    @watchfulcitizen have you ever created a robots.txt? when you did it, were you thinking it granted anybody a license to steal and regurgitate your content in the form of untraceable homunculus bullshit, in many cases profiting from it?

    and i'll say it for you: i don't think we should invest in AI until we can figure out what the hell is going on
  • viq@social.hackerspace.pl wrote (#7), in reply to watchfulcitizen@goingdark.social:

    @watchfulcitizen
    "AI" is currently a useless marketing term, lumping together very different technologies with very different properties, and implying that just because one of them is useful for a thing, so are the LLMs that everyone lost their minds about. And in either case, nowhere is there any Intelligence to be found.
    So, right now I very much AM saying that we should NOT invest in "AI".
    @Em0nM4stodon
  • guillaumerossolini@infosec.exchange wrote (#8), in reply to watchfulcitizen@goingdark.social:

    @watchfulcitizen that’s easy: it’s scraping of sources that have pre-approved this use, and all the big ones have this kind of agreement (often for a fee).

    But of course, whether you can trust them to include only data that was contributed under the same agreement, that’s tougher.

    I’m thinking of the crowdsourced Japanese translation of this Mozilla thing (can’t remember the details); they bailed recently and withdrew their contributions over LLM "fair use."

    @Em0nM4stodon
  • phil@fed.bajsicki.com wrote (#9), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon@infosec.exchange @awalter@mastodon.bawue.social
    Does that affect the user's privacy?

    The LLMs I run locally aren't capable of connecting to the web, so everything I process using them remains on my device.

    I generally agree the companies producing these models aren't privacy-respecting (insofar as they wish to avoid being fined out the arse for GDPR breaches).

    I disagree that LLMs themselves are intrinsically incompatible with privacy (please correct me if that's not the intent of your post; that's what I got from it).

    It's a matter of implementation; when running on my own computer, it's entirely private as far as I'm concerned.

    When using a vendor, the same truths apply as when running any software on someone else's computer: it's just not private at all.

    I really don't see why you're hyper-focusing on the LLM part, when the larger privacy invasion is in advertising / nation-state surveillance. (Have you seen Benn Jordan's video on Flock?)

    LLMs are an ecological, creative, and intellectual disaster, but the privacy concerns are hardly worth mentioning in comparison to pre-existing threats.

    On a different note, have you checked out Olmo? That's very much a privacy-respecting LLM:
    https://huggingface.co/allenai/Olmo-3.1-32B-Think
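    A quick illustration of what "running locally" means in practice. This is a minimal sketch assuming the Hugging Face transformers library and weights that are already on disk; the model id is illustrative (any open-weights model works), and local_files_only=True keeps the library from reaching out to the network at load time, so prompts and outputs never leave the machine.

        # Local-only inference sketch: once the weights are on disk, nothing is sent anywhere.
        # Assumes `pip install transformers torch` and a previously downloaded model.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        MODEL_ID = "allenai/OLMo-2-1124-7B-Instruct"  # illustrative; substitute any local open-weights model

        tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, local_files_only=True)
        model = AutoModelForCausalLM.from_pretrained(MODEL_ID, local_files_only=True)

        prompt = "Explain in one sentence why on-device inference keeps prompts private."
        inputs = tokenizer(prompt, return_tensors="pt")
        outputs = model.generate(**inputs, max_new_tokens=80)
        print(tokenizer.decode(outputs[0], skip_special_tokens=True))

    Whether that addresses the thread's objection is a separate question: local execution protects the user's prompts, but it says nothing about how the weights themselves were trained, which is the original post's point.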
  • watchfulcitizen@goingdark.social wrote (#10), in reply to tmw@ioc.exchange:

    @tmw Not saying I agree with how they handle it. Just want to state that data that is on the public web will always be at risk of misuse.

    Sadly, I have a hard time seeing them stop, whether we like it or not.
  • watchfulcitizen@goingdark.social wrote (#11), in reply to viq@social.hackerspace.pl:

    @viq @Em0nM4stodon I agree that the term is very loosely used. Is my vacuum "AI"? No, it is not.
  • watchfulcitizen@goingdark.social wrote (#12), in reply to guillaumerossolini@infosec.exchange:

    @GuillaumeRossolini @Em0nM4stodon is there a global standard to approve this kind of use case? Asking as I have no knowledge on the subject and would love to learn more.
  • guillaumerossolini@infosec.exchange wrote (#13), in reply to watchfulcitizen@goingdark.social:

    @watchfulcitizen sure, there are several as you might expect, with various degrees of usefulness and no way to enforce any of them

    @Em0nM4stodon
  • viq@social.hackerspace.pl wrote (#14), in reply to watchfulcitizen@goingdark.social:

    @watchfulcitizen @Em0nM4stodon I think, with how the term is currently used, it might be 🤷
  • pip@infosec.exchange wrote (#15), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon Ask not whether fashtech is private. Ask why anyone is using fashtech.
  • crazyeddie@mastodon.social wrote (#16), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon "Users thinking they are using a privacy-respectful platform are in fact saying:

    "Privacy for me and not for thee,""

    Which is pretty short-sighted, since they're probably not using that particular platform 24x7, and that makes them fair game for all the other time.
  • em0nm4stodon@infosec.exchange shared this topic
  • hyperreal@tilde.zone wrote (#17), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon GenAI is fundamentally and inherently built to be exploitative, even if running locally and trained on "consensual" data. You can't build a language model / AI without the ability to exploit humans.
  • martinrust@infosec.exchange wrote (#18), in reply to em0nm4stodon@infosec.exchange:

    @Em0nM4stodon phew, I admire you for mastering four negations in one sentence (the first one) – I just cannot, so I tried to understand it by eliminating the negations, hope I'm not distorting your idea with this:
    "The only AI tool ever that is truly private will be trained on consensual data only."
    And, yes, I fully agree, any "filters" applied to an ML model as an afterthought are doomed to have leaks that someone, something will find.
  • em0nm4stodon@infosec.exchange wrote (#19), in reply to martinrust@infosec.exchange:

    @martinrust Hahaha I didn't even realize 😆 I could have written this in a simpler way.

    But yes! You understood it correctly! 👍 Only an AI tool trained solely on data obtained ethically (therefore, with consent) could be considered truly private (aka, respecting people's privacy, which also means respecting people's consent if it used their data) 🙌