Google Search rests on a social contract: their bots can crawl our sites, they can index our sites, and they can show excerpts of our sites because

127 Posts 82 Posters 1 Views
  • inthehands@hachyderm.io wrote (#1):

    RE: https://tldr.nettime.org/@tante/116605858023186072

    Google Search rests on a social contract: their bots can crawl our sites, they can index our sites, and they can show excerpts of our sites because

    and •only because•

    they send people to our sites. •Our• sites, our words, with our design, with our links, with our context and our aesthetics, shared the way we want to share them.

    Google is announcing, unambiguously and with great fanfare, that they are fully breaking that contract. We should reciprocate.

    1/2

    datarama@hachyderm.io wrote (#2):

    @inthehands As I said just a while ago: Every big tech press event these last few years has felt like "Announcing our exciting plans for oligarchs to strip-mine the entire world and immiserate all of humanity! Get on board, and also death to the unbelievers!"

      inthehands@hachyderm.io wrote (#3):

      Quick strategy discussion, for those who understand Google indexing and SEO:

      If I want to yank a web site out of Google’s now-fully-extractive search, should I (1) disallow googlebot in robots.txt or (2) add `<meta name="googlebot" content="noindex">` to all the page headers?

      The goal here is not just to remove my contributions to the commons from Google’s results, but to •make Google aware• that sites are pulling consent. What will best do that?

      2/2

        inthehands@hachyderm.io wrote (#4):

        Same question as the previous post, except for Wikipedia. What would you like to see them do to send a shot across the bow?

        Or…well, it’s Wikipedia. Maybe more like a shot to the hull.

        3/2

          adamshostack@infosec.exchange wrote (#5):

          @inthehands (3) sue on the basis that it’s not fair use, and these derivative works clearly have a dramatic impact on the value of the original site

            joe@f.duriansoftware.com wrote (#6):

            @inthehands is "serve LLM poison to googlebot user-agents" on the table

              inthehands@hachyderm.io wrote (#7):

              @adamshostack

              This is clearly how copyright law as written •should• work. Not sure if it’s how it •does• work, but if anybody’s trying, they have my sword.

                inthehands@hachyderm.io wrote (#8):

                @joe
                It is and some of us miiiiight already be doing it.

                  be_far@social.treehouse.systems wrote (#9):

                  @inthehands I’m planning to return 418 I’m a teapot to googlebot requests. Don’t try and brew your coffee with my teapot. context
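
A minimal sketch of the 418 idea above, written as WSGI middleware in Python. Everything here is illustrative: the names `teapot_for_googlebot`, `site`, and `status_for` are made up for the example, and matching on the `User-Agent` string is only an assumption about how one might recognize Googlebot (the header is trivially spoofable, and Google runs other crawlers too). How Google's indexing treats a 418 response is not something this thread establishes.

```python
from wsgiref.util import setup_testing_defaults


def teapot_for_googlebot(app):
    """Wrap a WSGI app; answer 418 when the User-Agent mentions Googlebot."""
    def wrapper(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        # Assumption: a case-insensitive substring match is "good enough" here.
        if "googlebot" in user_agent.lower():
            body = b"418: this site has withdrawn consent for Google crawling.\n"
            start_response("418 I'm a teapot", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else gets the real site.
        return app(environ, start_response)
    return wrapper


def site(environ, start_response):
    """Hypothetical stand-in for the real site."""
    body = b"Hello, human reader.\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


application = teapot_for_googlebot(site)


def status_for(user_agent):
    """Call the wrapped app once and return the status line it produced."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    captured = []

    def start_response(status, headers, exc_info=None):
        captured.append(status)

    list(application(environ, start_response))
    return captured[0]
```

Calling `status_for("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")` exercises the teapot branch, while an ordinary browser string falls through to `site`.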

                    shadsterling@mastodon.social wrote (#10):

                    @inthehands both, and, when the agent matches the heuristics to be recognized as Google (et al.), send a different response that contains only an explanation of the ban (and maybe some poison for their next model)

                      mjd@mathstodon.xyz wrote (#11):

                      @cceckman The contract I thought I was signing was this: I published my stuff on a worldwide information network, with no controls whatever, specifically so that anyone anywhere could access it. I did that with full understanding that it would enable people I might not like to read, copy, and share it and put it to uses that I couldn't foresee and might not approve of. And if I didn't want to entertain that possibility I should not have installed a program on my computer whose sole purpose was to deliver my stuff to any rando who asked for it.

                      I'm not saying I got a good deal, or that I'm happy with the outcome. But I'm not going to pretend I was tricked or that Google reneged on a bargain. We had no bargain. I served them the stuff anyway, whenever they asked for it.

                      And I'm not sure I believe Paul Cantrell when he says he thought the contract was different from what I said.

                        wronglang@bayes.club wrote (#12):

                        @mjd @cceckman he's talking about a social contract

                          inthehands@hachyderm.io wrote (#13):

                          Going with meta noindex for now. My thinking is that this actively tells Google to yank already-crawled content from their index, whereas they might take a robots.txt entry to mean “do not update, but keep showing last fetched.”
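
For anyone applying the meta-tag option across a site, here is a rough Python sketch that inserts the tag from post #3 into a page's `<head>`. The helper name `add_noindex` is hypothetical, and the regex approach assumes reasonably well-formed HTML; a real site would more likely set this in its template. One caveat worth keeping in mind: a crawler has to be able to fetch a page in order to see a `noindex` tag on it, so blocking the same page in robots.txt can hide the tag from the crawler.

```python
import re

# The tag quoted earlier in the thread.
NOINDEX_TAG = '<meta name="googlebot" content="noindex">'


def add_noindex(html: str) -> str:
    """Insert the noindex tag right after the opening <head> tag, at most once."""
    if NOINDEX_TAG in html:
        return html  # already present, leave the page untouched
    # Match <head> or <head ...attributes>, case-insensitively, but not <header>.
    return re.sub(
        r"(<head(?:\s[^>]*)?>)",
        lambda match: match.group(1) + NOINDEX_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
```

Pages without a `<head>` element are returned unchanged, so they would need handling separately.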

                            elexia@twoot.site wrote (#14):

                            @inthehands if they decide that people doing this hurts their business model they will simply stop respecting things like robots.txt. their gamble is that people rely on Google more than they do on other websites and if they have to kill the rest of the web to monopolize access to information they will.

                              maddiem4@raphus.social wrote (#15):

                              @inthehands while this can (and probably should) be done in tandem with other strategies, one of the most unambiguous ways you can express your disdain is in robots.txt. Google has historically respected it mechanically (in the present and future, I'm not sure this will hold), and it supports line comments with # so you can explain in plain English what you think about them.

                              https://developers.google.com/search/docs/crawling-indexing/robots/intro

                              The docs also mention the 'noindex' meta tag and how you probably want to use one or the other but not both. That's worth a little research, probably.
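
As a sketch of the robots.txt route, the Python snippet below embeds a small robots.txt with `#` comments and checks it with the standard library's `urllib.robotparser`. The wording of the comments, the `example.com` URL, and the helper name `allowed` are placeholders, and the standard-library parser only approximates how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: comments explain the ban in plain English.
ROBOTS_TXT = """\
# Consent withdrawn: this site no longer wishes to be crawled by Google.
User-agent: Googlebot
Disallow: /

# Everyone else is still welcome.
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, url="https://example.com/post/1"):
    """Return True if this robots.txt lets the given agent fetch the URL."""
    return parser.can_fetch(user_agent, url)
```

With this file, `allowed("Googlebot")` is false while other agents fall under the `*` group and remain allowed. Checking a draft this way before deploying can catch a rule that blocks more, or less, than intended.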

                                inthehands@hachyderm.io wrote (#16):

                                @elexia

                                Of course, but it is important to force that fight rather than capitulating in advance.

                                  joe@f.duriansoftware.com wrote (#17):

                                  @inthehands given how eager their summarizer is to incorporate "facts" from even unintentionally adversarial recent posts like satirical blogs, it seems like it wouldn't take much of a coordinated effort to reduce their result quality this way

                                    shadsterling@mastodon.social wrote (#18):

                                    @wronglang @mjd @cceckman this sort of discrepancy is why I’ve never liked the term “social contract” - it’s nothing like a “contract”

                                      shadsterling@mastodon.social wrote (#19):

                                      @joe @inthehands is there a coordinated effort that has a website? And/or server plugins that automate serving coordinated poison?

                                        shadowjonathan@tech.lgbt wrote (#20):

                                        @inthehands this is a fence-post defense against this, google Will Not Care

                                        just start poisoning the data once you detect that google is the one fetching it, just absolutely fucking destroy their LLM output

                                          lunaphied@provably.online wrote (#21):

                                          @inthehands also probably worth it to submit a pagemaster/webmaster request to them to directly tell them to deindex your site. Also DMCA takedowns to Google are usually effective. If you're in the jurisdiction of Australia you're potentially able to go after them iirc. (The Australian government went after them for embedding news articles in their output or something)
