Hey #fediadmin.
-
Hey #fediadmin.
Is anyone seeing a huge influx of bots scraping /tags/* from Tencent (AS132203), VNPT (AS45899) and some others? They are also scraping media assets, but the rps is way lower.
Okay, after adding the AS to the HTTP logs I can see 205 distinct AS numbers in the last 30 minutes requesting /tags/*.
What the actual fuck?! No, seriously, has anyone seen that? I would suspect fucking AI scrapers, but they're only hammering the tags endpoint.
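For anyone wanting to run the same check, here is a minimal sketch of counting distinct AS numbers that hit /tags/* within a time window. It assumes the log lines have already been parsed into dicts with hypothetical `asn`, `path` and `time` keys; the real field names depend on your log pipeline and ASN enrichment.

```python
from datetime import datetime, timedelta, timezone


def distinct_asns(records, now, window=timedelta(minutes=30), prefix="/tags/"):
    """Return the set of AS numbers that requested a path under `prefix`
    during the `window` ending at `now`.

    `records` is an iterable of dicts with the hypothetical keys
    'asn' (int), 'path' (str) and 'time' (timezone-aware datetime).
    """
    cutoff = now - window
    return {
        r["asn"]
        for r in records
        if r["path"].startswith(prefix) and cutoff <= r["time"] <= now
    }


# Made-up sample records, for illustration only.
now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    {"asn": 132203, "path": "/tags/linux", "time": now - timedelta(minutes=5)},
    {"asn": 45899, "path": "/tags/photo", "time": now - timedelta(minutes=10)},
    {"asn": 132203, "path": "/tags/cats", "time": now - timedelta(minutes=1)},
    {"asn": 64500, "path": "/about", "time": now - timedelta(minutes=2)},
    {"asn": 64501, "path": "/tags/old", "time": now - timedelta(hours=2)},
]
print(len(distinct_asns(sample, now)))  # prints 2
```

Requests outside the window or outside the /tags/ prefix are ignored, so the count reflects only the scraping pattern described above.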
-
Oh, great, the #crowdsec VictoriaLogs integration doesn’t support working with metadata keys.
Seems like it can only parse the _msg field. Why?
-
@alex Yes, it's been going on since the end of last year at least.
If I remember correctly, we disabled tags search for unauthenticated users, and then the next thing that was hit was trends. Unfortunately there's no separation between authenticated and unauthenticated clients in the controls for those...
-
@alex See this post for example: https://mastodon.infra.de/@galaxis/115805367424016000
-
@alex ...unfortunately it seems like most Mastodon admins don't talk about Mastodon administration much on the Fediverse. I'm aware of a couple of Matrix groups, but other than that I don't know where people are discussing operational details...
-
@galaxis oh, yes, this is exactly it!
By any chance, do you remember how to switch the federated timeline off for non-authenticated users? I don’t see it in the env variables and can’t find anything similar in the admin UI.
Thanks!
-
@alex There are four dropdowns with options in Administration -> Server Settings -> Discovery under the "Public timelines" header.
-
@galaxis found it! Thank you!
I wonder how long it’s going to take for bots to stop scraping images now.
-
@alex Images are a different problem - it seemed to me (I have not done any deep analysis) that these scrapers act as full user agents, and retrieve posts with all media attachments.
Unfortunately as they progress into older posts or long threads, this causes Mastodon to re-fetch old media. We were plain running out of space, until I dropped in an additional patch that severely rate-limits the media proxy for unauthenticated users, see this other thread: https://mastodon.infra.de/@galaxis/116077343266969640
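As a rough illustration of what rate-limiting only unauthenticated media requests can look like, here is a minimal token-bucket sketch. This is not the patch from the linked thread and not Mastodon code; the `TokenBucket` class, the `allow_media_request` helper and the numbers used are all hypothetical.

```python
import time


class TokenBucket:
    """Generic token bucket: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Consume one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def allow_media_request(authenticated, bucket):
    """Authenticated users bypass the limit; everyone else shares `bucket`."""
    return authenticated or bucket.allow()


# Hypothetical limit: bursts of 5, then one unauthenticated fetch every 2 seconds.
anonymous_bucket = TokenBucket(rate=0.5, capacity=5)
```

In a real deployment the limit would more likely be keyed per client or per network and enforced in the reverse proxy or the application's own throttling layer, which is an assumption about setup rather than something stated in this thread.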
-
@alex ...leaving Trends open unfortunately provides enough fodder for them. On the public instance I help run, users complained when we disabled Trends though, so adding protection to the media proxy was the easiest way to stop that scraping. On my personal instance, I disabled Trends, so there's nothing left to scrape except my own public posts.
-
@galaxis uh, okay, the trending page is gone as well.
Weird that the tags page is still showing basic tag info regardless of the settings.
So sad that the only way to stop the web from being ruined is to gatekeep everything.