Brew (#forgejo), as usual, is being overloaded by the scrapers.

Tags: forgejo, bsdcafe, bsdcafeservices
13 Posts 5 Posters 0 Views
  • stefano@mastodon.bsd.cafe

    EDIT: done, let me know if you experience problems

    Brew (#forgejo), as usual, is being overloaded by the scrapers.

    I think I'll have to put Anubis in front of it. I don't love those "blocks", but sometimes you need them.

    #BSDCafe #BSDCafeServices
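
For readers unfamiliar with the idea: tools like Anubis make each client solve a small proof-of-work puzzle before the request reaches the backend. Below is a minimal sketch of that general technique, assuming a SHA-256 hash and a "leading zero hex digits" difficulty rule. The function names and the challenge string are hypothetical, and this is not Anubis's actual implementation.

```python
import hashlib
from itertools import count


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks whether the nonce meets the difficulty."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: try nonces in order until the hash has enough leading zeros."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


# Hypothetical challenge; a real deployment would issue a fresh one per visitor.
nonce = solve("example-challenge", 3)
print(nonce, verify("example-challenge", nonce, 3))
```

The point of the asymmetry is that solving costs the client thousands of hashes while checking costs the server one, which is cheap for a single visitor but adds up for a scraper fetching every page.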

    pertho@mastodon.bsd.cafe wrote (#3):

    @stefano Have you tried blocking the IP ranges in Julian's list?

    Julian Oliver (@JulianOliver@mastodon.social)

    I've done the log analysis and the two biggest contributors that brought the AI crawler hits up to 2 million in a day, a 4x increase on a week prior, are ByteSpider (Singapore networks) and especially AppleBot (used for Siri and other Apple products). The parasites.txt is now >4500 lines long: https://scienceispoetry.net/files/parasites.txt

    Or is it all coming from residential proxies? 🤔
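
As a rough illustration of checking addresses against a range list like the one linked above, here is a minimal sketch using Python's standard `ipaddress` module. It assumes one CIDR range per line with optional `#` comments; the real format of parasites.txt may differ. The sample ranges are documentation-only addresses, and `load_networks` and `is_blocked` are hypothetical helper names.

```python
import ipaddress


def load_networks(lines):
    """Parse CIDR entries, skipping blank lines and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False tolerates entries with host bits set, e.g. 192.0.2.1/24
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(ip, networks):
    """Return True if the address falls inside any listed range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# Documentation-only ranges (RFC 5737 / RFC 3849), not real crawler networks.
SAMPLE_LIST = """
# hypothetical blocklist
192.0.2.0/24
198.51.100.0/25   # inline comment
2001:db8::/32
""".splitlines()

networks = load_networks(SAMPLE_LIST)
print(is_blocked("192.0.2.77", networks))   # inside the first range
print(is_blocked("203.0.113.9", networks))  # not listed
```

A static list like this only helps while the traffic comes from known networks; it cannot follow requests that rotate through residential proxies.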

      stefano@mastodon.bsd.cafe wrote (#4):

      @pertho residential proxies. I blocked everything I could block, but it wasn't enough.

        oxy@social.bsdlab.au wrote (#5):
        @pertho @stefano oooh interesting list.

        I've been tinkering with ssh and httpd logs and awk, enriching the data with https://iplocate.io/ and maybe eventually GreyNoise and Spamhaus (to catch more residential proxies, etc.)
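
The first step of that kind of log analysis is usually a per-address tally. Here is a minimal Python sketch of it, assuming a common-log-style access log where the client address is the first whitespace-separated field. The sample lines and the `count_hits` name are hypothetical, and the enrichment lookups mentioned above are left out.

```python
from collections import Counter


def count_hits(log_lines):
    """Count requests per client address (first field of each line)."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip empty lines
            hits[fields[0]] += 1
    return hits


# Hypothetical access-log lines using documentation addresses.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /repo HTTP/1.1" 200 512',
    '198.51.100.4 - - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/1.1" 200 128',
    '203.0.113.7 - - [01/Jan/2025:00:00:03 +0000] "GET /repo/commits HTTP/1.1" 200 900',
]

hits = count_hits(SAMPLE_LOG)
top = hits.most_common(1)
print(top)
```

This does roughly what an awk one-liner keyed on `$1` would do; the heaviest addresses from `most_common` are the ones worth sending to a lookup service.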
          tubsta@social.bsdlab.au wrote (#6):
          @stefano Stick Bunny in with origin shield to drop the nuffs scraping your site
            stefano@mastodon.bsd.cafe wrote (#7):

            @tubsta this would work, but I'm trying to avoid using (external) CDNs, at the moment.

              tubsta@social.bsdlab.au wrote (#8):
              @stefano I agree with what you are trying to do, as I would rather avoid CDNs too, but some services need them. You just gotta work out the least shit ones, and the ones that are Europe-focused, to assist here.
                tubsta@social.bsdlab.au wrote (#9):
                @stefano FWIW I spend about $5 a month with Bunny’s CDN products for bsdlab
                  mwl@io.mwl.io wrote (#10):

                  @tubsta @stefano

                  If you must CDN, Bunny is 100% the way to go. Contains much less suck.

                  I run cdn.mwl.io specifically to distribute files. It's a CDN composed of one host.

                    stefano@mastodon.bsd.cafe wrote (#11):

                    @tubsta sure. Bunny is great. I have an account and use it for some services. For some time, some of the BSD Cafe content was served by them, and it was perfect.

                      tubsta@social.bsdlab.au wrote (#12):
                      @stefano They now have S3 object access for their storage nodes, which has been a long time coming.
                        stefano@mastodon.bsd.cafe wrote (#13):

                        @tubsta oh nice! I was curious to see it implemented. I'll have a look.
