There will never be an AI tool that is truly private unless it hasn't trained on nonconsensual data.

Category: Uncategorized · Tags: privacy, consent, humanrights, noai · 19 Posts · 12 Posters
em0nm4stodon@infosec.exchange (original post)

There will never be an AI tool that is truly private unless it hasn't trained on nonconsensual data.

Even if a platform were able to create perfect protections for its users' prompts and results: if the platform is built on or uses an AI model that was trained on, or is updated and optimized with, data scraped from millions of people without their consent, then of course that platform isn't "privacy-respectful."

How could it be?

The company is saying:
"We respect the privacy of our users while they are using our platform, but outside of it, it's fair game."

Users thinking they are using a privacy-respectful platform are in fact saying:
"Privacy for me and not for thee,"
and are directly contributing to the platform needing to scrape even more nonconsensual data to improve.

Always ask: where does the training data come from?

Without the assurance that a platform only uses AI models trained exclusively on ethically acquired data, it is not a privacy-respectful platform.

#Privacy #AI #Consent #HumanRights #NoAI

watchfulcitizen@goingdark.social
#5

@Em0nM4stodon Well said! An interesting thought, however: what counts as ethical scraping? All public data? No scraping at all? Respecting robots.txt?

I fully agree with you. Another issue is the lack of transparency from those who train: it's largely unknown what data has been used or where it came from.

I'm not saying we shouldn't invest in AI. But the current form isn't ethical.
tmw@ioc.exchange
#6

@watchfulcitizen Have you ever created a robots.txt? When you did, were you thinking it granted anybody a license to steal and regurgitate your content in the form of untraceable homunculus bullshit, in many cases profiting from it?

And I'll say it for you: I don't think we should invest in AI until we can figure out what the hell is going on.
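For context on the robots.txt question raised above: opting out is purely advisory. A site can ask specific crawlers to stay away, but nothing technically enforces it. A minimal sketch (the GPTBot, CCBot, and Google-Extended tokens are the ones their operators publicly document; any other token here would be an assumption):

```
# robots.txt — ask known AI training crawlers not to fetch anything
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# All other crawlers may proceed normally
User-agent: *
Allow: /
```

Note that this only works for crawlers that choose to honor it, which is exactly the consent gap the thread is debating: robots.txt was designed as a crawl-politeness convention, not a licensing or consent mechanism.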
viq@social.hackerspace.pl
#7

@watchfulcitizen "AI" is currently a useless marketing term, lumping together very different technologies with very different properties, and implying that just because one of them is useful for a thing, so are the LLMs that everyone lost their minds about. And in either case, nowhere is there any Intelligence to be found.
So, right now I very much AM saying that we should NOT invest in "AI".
@Em0nM4stodon
guillaumerossolini@infosec.exchange
#8

@watchfulcitizen That's easy: it's scraping of sources that have pre-approved this use, and all the big ones have this kind of agreement (often for a fee).

But of course, can you trust them to include only data that was contributed under the same agreement? That's tougher.

I'm thinking of the crowdsourced Japanese translation of a Mozilla project (I can't remember the details); they bailed recently and withdrew their contributions over LLM "fair use."

@Em0nM4stodon
em0nm4stodon@infosec.exchange wrote:

@awalter Where does the data to train the LLM initially come from?

phil@fed.bajsicki.com
#9

@Em0nM4stodon@infosec.exchange @awalter@mastodon.bawue.social
Does that affect the user's privacy?

The LLMs I run locally aren't capable of connecting to the web, so everything I process with them remains on my device.

I generally agree that the companies producing these models aren't privacy-respecting (insofar as they wish to avoid being fined out the arse for GDPR breaches).

I disagree that LLMs themselves are intrinsically incompatible with privacy (please correct me if that's not the intent of your post; it's what I got from it).

It's a matter of implementation: when running on my own computer, it's entirely private as far as I'm concerned.

When using a vendor, the same truths apply as when running any software on someone else's computer. It's just not private at all.

I really don't see why you're hyper-focusing on the LLM part when the larger privacy invasion is in advertising and nation-state surveillance. (Have you seen Benn Jordan's video on Flock?)

LLMs are an ecological, creative, and intellectual disaster, but the privacy concerns are hardly worth mentioning compared to pre-existing threats.

On a different note, have you checked out Olmo? That's very much a privacy-respecting LLM:
https://huggingface.co/allenai/Olmo-3.1-32B-Think
watchfulcitizen@goingdark.social
#10

@tmw I'm not saying I agree with how they handle it. I just want to state that data on the public web will always be at risk of misuse.

Sadly, I have a hard time seeing them stop, whether we like it or not.
watchfulcitizen@goingdark.social
#11

@viq @Em0nM4stodon I agree that the term is very loosely used. Is my vacuum "AI"? No, it is not.
watchfulcitizen@goingdark.social
#12

@GuillaumeRossolini @Em0nM4stodon Is there a global standard to approve this kind of use case? Asking as I have no knowledge of the subject and would love to learn more.
guillaumerossolini@infosec.exchange
#13

@watchfulcitizen Sure, there are several, as you might expect, with varying degrees of usefulness and no way to enforce any of them.

@Em0nM4stodon
viq@social.hackerspace.pl
#14

@watchfulcitizen @Em0nM4stodon I think, with how the term is currently used, it might be 🤷
pip@infosec.exchange
#15

@Em0nM4stodon Ask not whether fashtech is private. Ask why anyone is using fashtech.
crazyeddie@mastodon.social
#16

@Em0nM4stodon "Users thinking they are using a privacy-respectful platform are in fact saying: 'Privacy for me and not for thee.'"

Which is pretty short-sighted, since they're probably not using that particular platform 24x7, and that makes them fair game all the rest of the time.
hyperreal@tilde.zone
#17

@Em0nM4stodon GenAI is fundamentally and inherently built to be exploitative, even if it runs locally and was trained on "consensual" data. You can't build a language model / AI without the ability to exploit humans.
martinrust@infosec.exchange
#18

@Em0nM4stodon Phew, I admire you for mastering four negations in one sentence (the first one). I just cannot, so I tried to understand it by eliminating the negations; I hope I'm not distorting your idea with this:
"The only AI tool that is truly private will be one trained on consensual data only."
And yes, I fully agree: any "filters" applied to an ML model as an afterthought are doomed to have leaks that someone, or something, will find.
em0nm4stodon@infosec.exchange
#19

@martinrust Hahaha, I didn't even realize 😆 I could have written this in a simpler way.

But yes, you understood it correctly! 👍 Only an AI tool trained solely on ethically obtained data (therefore, with consent) could be considered truly private (i.e., respecting people's privacy, which also means respecting people's consent if it used their data) 🙌