To keep #OpenStreetMap.org up and running while we're being deluged by scrapers, we've blocked 320,000+ primarily residential IPv4 addresses in the last 24 hours (+ 100,000 IPv6) involved in scraping.

openstreetmap bots abuse
50 Posts 28 Posters 0 Views
  • vampirdaddy@chaos.social

    @JonSaenzAgirre @osm_tech
    The scrapers are DUMB.
    They are not curated, have only basic maintenance, are built to gobble up ANYTHING textual they encounter, without respect, mercy or reason.

    Just collect meaningless data.

    That’s the nature of the coveted LLMs: just statistics, no understanding, structure or meaning.

    And greedy crooks in haste to make quick money just grab everything they can.

    The AI bubble needs to pop really soon.

    jonsaenzagirre@mastodon.eus
    wrote last edited by
    #38

    @vampirdaddy @osm_tech this seems a reasonable explanation. Quantity of bytes irrespective of sense. Thank you

    • osm_tech@en.osm.town

      @utf_7 It is madness: start at https://www.openstreetmap.org/node/1 and keep going; once you reach https://www.openstreetmap.org/node/10000000000, start on ways and relations 😛 Or just download the latest weekly export from planet.openstreetmap.org 😏

      felixcremer@fediscience.org
      wrote last edited by
      #39

      @osm_tech @utf_7 Why is the first node in OSM somewhere in Italy? I would have expected to find it in some random part of London?

      • osm_tech@en.osm.town

        To keep #OpenStreetMap.org up and running while we're being deluged by scrapers, we've blocked 320,000+ primarily residential IPv4 addresses in the last 24 hours (+ 100,000 IPv6) involved in scraping.

        If you need OSM data, please don't scrape the website - use the official downloads at https://planet.openstreetmap.org
        🙏🌍 #AI #Bots #Abuse
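
A minimal sketch of what "use the downloads" can look like in practice: once an OSM XML dump or extract is on local disk, it can be streamed through a parser instead of fetching one web page per object. The function name `count_osm_elements` and the tiny `SAMPLE` document are made up for illustration; only the `node`, `way`, and `relation` element names come from the OSM XML format.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

OSM_ELEMENTS = {"node", "way", "relation"}


def count_osm_elements(stream):
    """Count nodes, ways and relations in an OSM XML stream."""
    counts = Counter()
    context = ET.iterparse(stream, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the enclosing <osm> element
    for event, elem in context:
        if event == "end" and elem.tag in OSM_ELEMENTS:
            counts[elem.tag] += 1
            root.clear()  # drop already-parsed children so the tree does not keep growing
    return counts


# Made-up sample data, only here to show the call.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6">
  <node id="1" lat="2.0" lon="2.0"/>
  <node id="2" lat="2.1" lon="2.1"/>
  <way id="10"><nd ref="1"/><nd ref="2"/></way>
  <relation id="100"><member type="way" ref="10" role=""/></relation>
</osm>"""

counts = count_osm_elements(io.BytesIO(SAMPLE))
print(dict(counts))
```

For a bz2-compressed XML dump, the stream could come from `bz2.open(path, "rb")`. For anything planet-sized, a dedicated OSM tool is probably the more realistic choice; this only shows the idea of processing a local file.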

        ondrejzizka@witter.cz
        wrote last edited by
        #40

        @osm_tech 🤦‍♂️

        • osm_tech@en.osm.town

          @michel42 We'd like to share the IP address list, but unfortunately don't think we can due to legal concerns.

          ondrejzizka@witter.cz
          wrote last edited by
          #41

          @osm_tech @michel42 Understood.

          Unrelated: Could you please provide me a list of cca 150k random large unsigned integers? I'm testing the xz library and need some test data.

            simon@en.osm.town
            wrote last edited by
            #42

            @felixcremer @utf_7 because you are looking at version 43 of the node which has been subject to redaction (licence change), vandalism, and simply buggy software over 20+ years https://www.openstreetmap.org/node/1/history#map=18/1.999999/2.000000

              felixcremer@fediscience.org
              wrote last edited by
              #43

              @simon @utf_7 Thanks, yeah that makes sense.

                simon@en.osm.town
                wrote last edited by
                #44

                @felixcremer @utf_7 I didn't mention this, but should have: prior to OSM API 0.5 (October 2007) objects were not versioned. The original "node 1" was deleted before that date and therefore doesn't exist in the current OSM data at all. The current "node 1" is a reuse of the old ID, IIRC.

                  harry_wood@en.osm.town
                  wrote last edited by
                  #45

                  @zymurgic The website interface designed for humans is the main issue I believe. See also https://en.osm.town/@osm_tech/115974391032358572
                  So that's... stupid

                  I'm not sure who hosts the main Overpass API instance, but I don't think it is the OpenStreetMap Foundation, so (while they probably do have similar challenges) it's not that we're talking about.

                    utf_7@mastodon.social
                    wrote last edited by
                    #46

                    @simon @felixcremer TIL something about OSM nodes. How far apart are two neighboring nodes? Or does this vary with the resolution of the area? Like, on the high seas they are more miles apart than in Detroit.

                      simon@en.osm.town
                      wrote last edited by
                      #47

                      @utf_7 @felixcremer the easiest way to calculate this is to use the Haversine distance, see https://en.wikipedia.org/wiki/Haversine_formula

                      Beyond that, nodes are placed wherever they are deemed necessary to reproduce the geometry of the objects. Naturally, a rendering on a map can smooth that out if the designer wants to (most don't, though).
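
For anyone who wants to try the Haversine suggestion, here is a minimal sketch. It assumes a spherical Earth with a mean radius of 6,371,008.8 m, so results are approximations; the function name `haversine_m` is just an illustrative choice.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing the square root slightly above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))
```

Calling `haversine_m(lat_a, lon_a, lat_b, lon_b)` with the coordinates of two consecutive nodes of a way gives their spacing in metres under that spherical assumption.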

                      • ryanprior@mastodon.social

                        @olbohlen @HunterZ @osm_tech the complexity of setting up defenses for this is regrettable

                        jessienab@wetdry.world
                        wrote last edited by
                        #48

                        @ryanprior @olbohlen @HunterZ The easiest option is to just block HTTP/1.1 requests for sites being hammered, since 99% of the scrape requests I've seen used that protocol.
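
A rough sketch of that idea as WSGI middleware, assuming the application server reports the client's protocol version in `SERVER_PROTOCOL`. Behind a reverse proxy the app usually only sees the proxy-to-backend protocol, so in practice a rule like this more likely belongs in the front-end server. The names `reject_http1` and `hello_app` and the 403 response are illustrative choices, not anything from the thread, and legitimate HTTP/1.1 clients would be blocked too.

```python
def reject_http1(app, blocked=("HTTP/1.1",)):
    """Wrap a WSGI app so requests using a blocked protocol version get a 403."""
    def middleware(environ, start_response):
        if environ.get("SERVER_PROTOCOL") in blocked:
            body = b"HTTP/2 or newer required\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Any other protocol version is passed through unchanged.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Placeholder application standing in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"ok\n"]


wrapped = reject_http1(hello_app)
```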

                          algernon@come-from.mad-scientist.club
                          wrote last edited by
                          #49

                          @arichtman WTF are Mull doing. Chrome, but no sec-ch-ua.

                          I'm not having much luck in finding their Android browser... I'm seeing Mullvad VPN, and the browser in alpha for win/mac/linux, but not android. Can you point me in the right direction?

                          Not going to dive into it now, but I'd like to save it for my records.
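
The "Chrome, but no sec-ch-ua" mismatch can be written down as a small header check. This is only a heuristic sketch: it assumes Chromium-based browsers normally send `Sec-CH-UA` on HTTPS requests, and it will flag older versions, plain-HTTP requests, or builds that strip the header. The function name is hypothetical.

```python
def claims_chrome_without_client_hints(headers):
    """True if the User-Agent looks like Chrome but no Sec-CH-UA header was sent."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")
    looks_like_chrome = "Chrome/" in user_agent
    return looks_like_chrome and "sec-ch-ua" not in normalized
```

A match is a signal worth logging rather than proof of a bot, since the header can legitimately be missing.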

                          • jay0@alico.nexus

                            @HunterZ@mastodon.sdf.org @osm_tech@en.osm.town lots of mobile/desktop apps, browser extensions, and even IoT devices are paid by "residential proxy" companies to prey on their users by selling said users' connections to AI scrapers https://www.spamhaus.org/resource-hub/compromised/lets-talk-about-the-danger-of-residential-proxy-networks/

                            vampirdaddy@chaos.social
                            wrote last edited by
                            #50

                            @jay0 @HunterZ @osm_tech

                            Until recently I mainly fought residential proxies facilitating DDoS and crawling attacks.

                            So I did not have the threat of access to internal systems on my radar.

                            I think that vector is under-reported:

                            Residential (i.e. software- or library-embedded) proxies on smartphones that are allowed into company networks.
