The bright #LLM future, next part.

swift@merveilles.town

@kigelia @mgorny that is certainly a problem that is occurring, yes. Since code generation is a primary use case, public code stores are an example of sites that are getting hit badly, which in turn is making open source less viable (alongside junk "fixes" and other noise)

gunstick@mastodon.opencloud.lu

@mgorny I blocked it all based on user agent. They use a useragent which is quite rare. So I blocked that.

leah@chaos.social

@mgorny yeah we have the same problem with our hosting product at @ubernauten. It's sucks to spend so much time working to mitigate the bad behavior of others. Sadly we have no solution either.

bgnfu7re@lviv.social

@js @mgorny I would definitely hit that, just to see what happens definitely would also curl afterwards

machocam@mastodon.social

@ericalaeta @mgorny

Everyone will have to introduce APIs that can accept much more traffic.

mgorny@social.treehouse.systems

@kigelia, yep. We're hosting a few huge repos (and a lot of small ones), so the load caused by crawling everything randomly (including stuff such as commit histories filtered by individual files, git blames and other stuff that's entirely redundant) prevents real people from using the service.

js@ap.nil.im

@mgorny I guess then you can be glad your cat only crashed your browser and didn’t delete your home.

js@ap.nil.im

@saxnot @mgorny 100 GB billion laughs attack compressed to 80 KB.

naturemc@mastodon.online

@algernon @mgorny

mirabilos@toot.mirbsd.org

@mgorny @js Anubis is LLM slop…

mirabilos@toot.mirbsd.org

@mgorny @phf that line does not do what you think it does, in sh…

kyebr@hachyderm.io

@algernon @mgorny I’m guessing that most of the traffic for git.gentoo.org would be by tools on behalf of users. And not through browsers.

Anyway, great article. I’m thinking about implementing the same on my site.

algernon@come-from.mad-scientist.club

@Kyebr Yes, and that's fine: tools that don't try to pretend to be browsers get through the Three Ifs (and iocaine) just fine.

@mgorny

villares@ciberlandia.pt

@davidgerard would you have pointers to "how to guides" for less savvy people? I have a shared hosting account on a web hosting service, I feel like I need to protect myself from these bots and I'm totally lost.

davidgerard@circumstances.run

@villares no, but I went to https://iocaine.madhouse-project.org/ and faffed about a bit. I used iocaine 3 out of the box. i use nginx so i had to figure out the correct config. i added exceptions for some specific user-agents I wanted to let through.

villares@ciberlandia.pt

@davidgerard thank you!

js@ap.nil.im

@mirabilos @mgorny Wat? It’s stopping LLMs.

phf@dmv.community

@mirabilos It does not? Sure deleted a lot of files when I tried it in a container... Please to "edumacate" me? Or do you refer to the redirection? @mgorny

mirabilos@toot.mirbsd.org

@js @mgorny it’s also slop.

(And it’s not been stopping LLMs for months now.)

mirabilos@toot.mirbsd.org

@mgorny @phf yes, the eedirection. Can explain more layer if needed, from the laptop.t

CIRCLE WITH A DOT

The bright #LLM future, next part.