There will never be an AI tool that is truly private unless it hasn't trained on nonconsensual data.

Uncategorized · 19 Posts · 12 Posters · 6 Views
Tags: privacy, consent, humanrights, noai
This topic has been deleted. Only users with topic management privileges can see it.
• em0nm4stodon@infosec.exchange

    @awalter Where does the data to train the LLM initially come from?

phil@fed.bajsicki.com wrote (#9):

    @Em0nM4stodon@infosec.exchange @awalter@mastodon.bawue.social
    Does that affect the user's privacy?

    The LLMs I run locally aren't capable of connecting to the web, so everything I process using them remains on my device.

I generally agree that the companies producing these models aren't privacy-respecting (insofar as they wish to avoid being fined out the arse for GDPR breaches).

    I disagree that LLMs themselves are intrinsically incompatible with privacy (please correct me if that's not the intent of your post - that's what I got from it).

    It's a matter of implementation; when running on my own computer, it's entirely private as far as I'm concerned.

    When using a vendor, the same truths apply as when running any software on someone else's computer. It's just not private at all.

I really don't see why you're hyper-focusing on the LLM part, when the larger privacy invasion is in advertising / nation-state surveillance. (Have you seen Benn Jordan's video on Flock?)

    LLMs are an ecological, creative and intellectual disaster, but the privacy concerns are hardly worth mentioning in comparison to pre-existing threats.

    On a different note, have you checked out Olmo? That's very much a privacy-respecting LLM:
    https://huggingface.co/allenai/Olmo-3.1-32B-Think
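phil doesn't describe how his local setup blocks network access, but one common approach — a sketch assuming the Hugging Face `huggingface_hub`/`transformers` stack, which is one of the usual ways to run a model like Olmo locally — is to force offline mode before anything is loaded, so every model load comes from the local cache and no prompt ever leaves the machine:

```python
import os

# These must be set before the libraries are imported. With them in place,
# huggingface_hub and transformers refuse all network access and load model
# files only from the local cache (raising an error if a file is missing),
# so prompts and outputs stay entirely on-device.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# From here, e.g. transformers.pipeline("text-generation", model="...")
# would read only from the local Hugging Face cache and never phone home.
```

Pairing this with a firewall rule that blocks the process outright gives defense in depth: a library bug then still can't leak anything.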

• tmw@ioc.exchange

      @watchfulcitizen have you ever created a robots.txt? when you did it, were you thinking it granted anybody a license to steal and regurgitate your content in the form of untraceable homunculus bullshit, in many cases profiting from it?

      and i'll say it for you: i don't think we should invest in AI until we can figure out what the hell is going on
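tmw's robots.txt point can be made concrete with the standard library's `urllib.robotparser`, which shows what a crawler *should* do with a file that opts out of AI scrapers. The user-agent names below (`GPTBot`, `CCBot`) are the published names of OpenAI's and Common Crawl's crawlers; the key caveat is that honoring the file is entirely voluntary:

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that opts out of known AI-training crawlers while still
# allowing ordinary indexing. Compliance is voluntary: nothing here can
# stop a crawler that simply ignores the file.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("GPTBot", "https://example.com/post/1"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/post/1"))  # True
```

This is exactly the asymmetry tmw is pointing at: the file expresses the author's intent, but nothing in the protocol grants or revokes a license, and nothing enforces it.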

watchfulcitizen@goingdark.social wrote (#10):

@tmw I'm not saying I agree with how they handle it. I just want to state that data on the public web will always be at risk of misuse.

Sadly, I have a hard time seeing them stop, whether we like it or not.

      1 Reply Last reply
      0
• viq@social.hackerspace.pl

        @watchfulcitizen
        "AI" is currently a useless marketing term, lumping together very different technologies, with very different properties, and implying that just because one of them is useful for a thing, so are the LLMs that everyone lost their minds about. And in either case, nowhere is there any Intelligence to be found.
So, right now I very much AM saying that we should NOT invest in "AI".
        @Em0nM4stodon

watchfulcitizen@goingdark.social wrote (#11):

@viq @Em0nM4stodon I agree that the term is used very loosely. Is my vacuum "AI"? No, it is not.

• guillaumerossolini@infosec.exchange

@watchfulcitizen that's easy: it's scraping of sources that have pre-approved this use, and all the big ones have this kind of agreement (often for a fee).

But of course, can you trust them to include only data that was contributed under the same agreement? That's tougher.

I'm thinking about the crowdsourced Japanese translation of this Mozilla thing (I can't remember the details); they bailed recently and withdrew their contributions when it comes to LLM fair use.

          @Em0nM4stodon

watchfulcitizen@goingdark.social wrote (#12):

          @GuillaumeRossolini @Em0nM4stodon is there a global standard to approve this kind of use case? Asking as I have no knowledge on the subject and would love to learn more.

guillaumerossolini@infosec.exchange wrote (#13):

@watchfulcitizen sure, there are several, as you might expect, with varying degrees of usefulness and no way to enforce any of them.

            @Em0nM4stodon
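One example of such an unenforceable signal is the informal `noai` directive carried in an `X-Robots-Tag` response header — a convention popularized by DeviantArt, not a ratified standard. A sketch of how a well-behaved scraper might check it (the helper name is mine, not from any specification):

```python
def ai_training_opted_out(headers: dict) -> bool:
    """Return True if the response headers carry the informal 'noai' or
    'noimageai' opt-out directives in X-Robots-Tag.

    Honoring these directives is entirely voluntary: a scraper that does
    not look for them is in no way prevented from taking the content.
    """
    tag = headers.get("X-Robots-Tag", "").lower()
    directives = {d.strip() for d in tag.split(",")}
    return bool({"noai", "noimageai"} & directives)

print(ai_training_opted_out({"X-Robots-Tag": "noindex, noai"}))  # True
print(ai_training_opted_out({"X-Robots-Tag": "noindex"}))        # False
```

As with robots.txt, the signal only documents the publisher's intent — which is precisely the "no way to enforce" problem.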

viq@social.hackerspace.pl wrote (#14):

@watchfulcitizen @Em0nM4stodon I think, with how the term is currently used, it might be 🤷

• em0nm4stodon@infosec.exchange

There will never be an AI tool that is truly private unless it hasn't trained on nonconsensual data.

Even if a platform were able to create the perfect protections for its users' prompts and results:

If the platform is built from, or utilizes, an AI model that was trained on, or is updated and optimized with, data that was scraped from millions of people without their consent, then of course this platform isn't "privacy-respectful."

How could it be?

The company is saying:
"We respect the privacy of our users while they are using our platform, but outside of it, it's fair game."

Users thinking they are using a privacy-respectful platform are in fact saying:

"Privacy for me and not for thee,"

and are directly contributing to the platform needing to scrape even more nonconsensual data to improve.

Always ask: where does the training data come from?

Without the assurance that a platform only uses AI models trained exclusively on ethically acquired data, it is not a privacy-respectful platform.

                #Privacy #AI #Consent #HumanRights #NoAI

pip@infosec.exchange wrote (#15):

                @Em0nM4stodon Ask not whether fashtech is private. Ask why anyone is using fashtech.

                1 Reply Last reply
                0
• em0nm4stodon@infosec.exchange

crazyeddie@mastodon.social wrote (#16):

                  @Em0nM4stodon "Users thinking they are using a privacy-respectful platform are in fact saying:

                  "Privacy for me and not for thee,""

Which is pretty short-sighted, since they're probably not using that particular platform 24x7, and that makes them fair game all the rest of the time.

• em0nm4stodon@infosec.exchange shared this topic
• em0nm4stodon@infosec.exchange

hyperreal@tilde.zone wrote (#17):

                    @Em0nM4stodon GenAI is fundamentally and inherently built to be exploitative. Even if running locally and trained on "consensual" data. You can't build a language model / AI without the ability to exploit humans.

• em0nm4stodon@infosec.exchange

martinrust@infosec.exchange wrote (#18):

@Em0nM4stodon phew, I admire you for mastering four negations in one sentence (the first one) – I just cannot, so I tried to understand it by eliminating the negations; I hope I'm not distorting your idea with this:
"The only AI tool that will ever be truly private is one trained on consensual data only."
And, yes, I fully agree: any "filters" applied to an ML model as an afterthought are doomed to have leaks that someone, or something, will find.

• martinrust@infosec.exchange

em0nm4stodon@infosec.exchange wrote (#19):

                        @martinrust Hahaha I didn't even realize 😆 I could have written this in a simpler way.

But yes! You understood it correctly! 👍 Only an AI tool trained solely on ethically obtained data (and therefore with consent) could be considered truly private (i.e., respecting people's privacy, which also means respecting people's consent when their data is used) 🙌
