To keep #OpenStreetMap.org up and running while we're being deluged by scrapers, we've blocked 320,000+ primarily residential IPv4 addresses in the last 24 hours (+ 100,000 IPv6) involved in scraping.

Tags: openstreetmap, bots, abuse
50 Posts, 28 Posters
  • osm_tech@en.osm.town wrote:

    To keep #OpenStreetMap.org up and running while we're being deluged by scrapers, we've blocked 320,000+ primarily residential IPv4 addresses in the last 24 hours (+ 100,000 IPv6) involved in scraping.

    If you need OSM data, please don't scrape the website - use the official downloads at https://planet.openstreetmap.org
    πŸ™πŸŒ #AI #Bots #Abuse

    hunterz@mastodon.sdf.org wrote (#7):

    @osm_tech does coming from residential IPs mean that someone has baked a scraper into some popular tool that people don't realize is doing that?

      ryanprior@mastodon.social wrote (#8):

      @HunterZ @osm_tech this is actually quite common. Mobile advertising SDKs for games, background apps, etc. include residential scraping proxy functionality that they can sell to the highest bidder. When scrapers want to avoid restrictions, they can pay a fraction of a penny to send their requests via your phone. Millions of people use apps with this built in and have no idea. Most websites don't want to ban the residential scrapers because it can hurt growth.

        utf_7@mastodon.social wrote (#9):

        @osm_tech How can you even scrape a map site? Isn't OSM just a big canvas like Google Maps, which is also nearly impossible to automate?

          osm_tech@en.osm.town wrote (#10):

          @utf_7 It is madness, start here: https://www.openstreetmap.org/node/1 and keep going once you reach https://www.openstreetmap.org/node/10000000000, then start on ways, and relations 😛 or just download the latest weekly export from planet.openstreetmap.org 😏

            utf_7@mastodon.social wrote (#11):

            @osm_tech uff, I am a noob, so forgive my stupid question: can't you somehow limit the requests, to something like 10 requests per minute? Normal users would not be affected and scrapers would take forever.
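
A rough back-of-the-envelope sketch of the "take forever" idea, in Python. It assumes one page fetch per node ID up to the 10,000,000,000 quoted in the node URL earlier in the thread (ways and relations are ignored), and it uses 400,000 as a hypothetical round number of addresses in the range the opening post reports. None of these figures are measurements.

```python
# Back-of-the-envelope estimate only; every number here is an assumption.
NODE_PAGES = 10_000_000_000   # assumed: one page per node ID up to the ID quoted in the thread
REQUESTS_PER_MINUTE = 10      # the per-client limit suggested in the post above
MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_crawl(pages: int, requests_per_minute: int, clients: int = 1) -> float:
    """Years needed if `clients` addresses each run exactly at the limit."""
    total_rate = requests_per_minute * clients  # combined requests per minute
    return pages / total_rate / MINUTES_PER_YEAR


single = years_to_crawl(NODE_PAGES, REQUESTS_PER_MINUTE)
spread = years_to_crawl(NODE_PAGES, REQUESTS_PER_MINUTE, clients=400_000)

print(f"one address:       {single:,.0f} years")
print(f"400,000 addresses: {spread * 365:,.1f} days")
```

Under these assumptions a single address would need roughly 1,900 years, while the same crawl spread over 400,000 addresses would finish in under two days, with every address staying inside the limit.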

              tehstu@hachyderm.io wrote (#12):

              @ryanprior @HunterZ @osm_tech I had no idea this was a thing. And presumably, as requests come from you, not the advertiser, Pihole (and other network blockers) treat it as legitimate traffic?

                osm_tech@en.osm.town wrote (#13):

                @utf_7 We've had 400,000 IPs in the last 24 hours. Each IP only does a few requests. Technically we're managing, but no fun fighting this daily rather than building new things.
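
The point about each IP making only a few requests can be illustrated with a small simulation. This is a minimal sketch, assuming a sliding-window limiter of 10 requests per 60 seconds per address (the limit floated earlier in the thread); the `PerIPRateLimiter` class, the `simulate` helper, and the `client-N` identifiers are hypothetical and say nothing about how OpenStreetMap actually filters traffic. The address count is scaled down from 400,000 to 4,000 so it runs quickly.

```python
from collections import defaultdict, deque


class PerIPRateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` seconds per IP."""

    def __init__(self, limit: int = 10, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def simulate(num_ips: int, requests_per_ip: int) -> tuple:
    """Return (allowed, blocked) when each IP sends its requests one second apart."""
    limiter = PerIPRateLimiter()
    allowed = blocked = 0
    for i in range(num_ips):
        ip = f"client-{i}"  # placeholder identifier, not a real address
        for t in range(requests_per_ip):
            if limiter.allow(ip, now=float(t)):
                allowed += 1
            else:
                blocked += 1
    return allowed, blocked


# One noisy client: 100 requests in 100 seconds from a single address.
single = simulate(num_ips=1, requests_per_ip=100)
# Many quiet clients: 3 requests each.
spread = simulate(num_ips=4_000, requests_per_ip=3)
print("single address (allowed, blocked):", single)
print("spread out     (allowed, blocked):", spread)
```

The single noisy client is throttled, while the spread-out traffic passes untouched because no individual address ever comes near the limit.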

                  ryanprior@mastodon.social wrote (#14):

                  @tehstu @HunterZ @osm_tech anything your pihole would let you request, it'd let the scraper request. If the scraper wanted to scrape some ads from another network it might get blocked, I guess.

                    hunterz@mastodon.sdf.org wrote (#15):

                    @tehstu @ryanprior @osm_tech pihole works by refusing to provide DNS resolution for domains on its blocklists, so it could block a scraper *if* its functionality depends on resolving a domain name that is blocked by pihole.

                      hunterz@mastodon.sdf.org wrote (#16):

                      @tehstu @ryanprior @osm_tech oh and of course the scraper would have to respect pihole versus using its own hard coded DNS IP to resolve things.
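
The two conditions described in these posts can be shown with a toy model. This is a simplified sketch, not Pi-hole's actual code: the domain `proxy-sdk.example`, both IP addresses (taken from documentation ranges), and the two resolver functions are made up for illustration, and a refused lookup is modelled as `None`.

```python
from typing import Optional

# Toy model of blocklist-based DNS filtering; all names and addresses are made up.
BLOCKLIST = {"proxy-sdk.example"}  # hypothetical domain a proxy SDK depends on
UPSTREAM = {                       # stand-in for answers from a public DNS server
    "proxy-sdk.example": "203.0.113.10",
    "www.openstreetmap.org": "198.51.100.20",  # placeholder, not the real address
}


def filtering_resolver(domain: str) -> Optional[str]:
    """Refuse to resolve blocklisted domains; pass everything else upstream."""
    if domain in BLOCKLIST:
        return None
    return UPSTREAM.get(domain)


def hard_coded_resolver(domain: str) -> Optional[str]:
    """A client that ignores the local resolver and asks upstream directly."""
    return UPSTREAM.get(domain)


for name in ("proxy-sdk.example", "www.openstreetmap.org"):
    print(name, "| filtered:", filtering_resolver(name),
          "| hard-coded:", hard_coded_resolver(name))
```

In this model the scrape target resolves either way because it is not on the blocklist. Only the SDK's own domain is affected, and only for clients that actually use the filtering resolver.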

                        utf_7@mastodon.social wrote (#17):

                        @osm_tech tHeN yOu jUsT neEd tO sCaLe

                          jonsaenzagirre@mastodon.eus wrote (#18):

                          @osm_tech Question: why do people scrape a server that makes its data freely available, and probably better structured in the final product? I don't see the point.

                            osm_tech@en.osm.town wrote (#19):

                            @JonSaenzAgirre It is a good question, and we don't know the answer either. Our planet data is so much easier to process and use.

                              osm_tech@en.osm.town wrote (#20):

                              @utf_7 In this economy with RAM prices what they are?!? 😉

                                dnub@mastodon.social wrote (#21):

                                @JonSaenzAgirre vibe coders gonna vibe

                                  jay0@alico.nexus wrote (#22):

                                  @HunterZ@mastodon.sdf.org @osm_tech@en.osm.town lots of mobile/desktop apps, browser extensions, and even IoT devices are paid by "residential proxy" companies to prey on their users by selling said users' connections to AI scrapers: https://www.spamhaus.org/resource-hub/compromised/lets-talk-about-the-danger-of-residential-proxy-networks/

                                    olbohlen@norden.social wrote (#23):

                                    @ryanprior @HunterZ @osm_tech I see that scraping on my private web server too, and it forced me to make a whole bunch of content private. Yet the botnet still hits it and gets 404s now. Every single request comes from a different IP...

                                      ryanprior@mastodon.social wrote (#24):

                                      @olbohlen @HunterZ @osm_tech sad to hear that! It's wild though, you can sign up for a scraper proxy service in minutes. They're legal, inexpensive, and easy to use. Admins who assume that scrapers are using their own machines, and that inauthentic traffic will come from a few IP addresses, are sadly living in the past.

                                        olbohlen@norden.social wrote (#25):

                                        @ryanprior @HunterZ @osm_tech sure I could, but I refuse to put my selfhosted stuff behind some new dependency...

                                          ryanprior@mastodon.social wrote (#26):

                                          @olbohlen @HunterZ @osm_tech the complexity of setting up defenses for this is regrettable
