The Day I Logged 1 In Every 2000 Public IPv4: Visualizing The AI Scraper DDoS - VulpineCitrus

lemmydividebyzero@reddthat.com · 19 days ago

pcouy@lemmy.pierre-couy.fr · 19 days ago

I’ve been in a similar situation, and I’m also blocking large ranges of IP addresses in addition to running Anubis in front of my most scraped services (Git/Forgejo and Lemmy).

I came up with a hacky Python script that watches my fail2ban logs and counts bans for IP ranges going from /28 to /8. It applies some heuristics I came up with (based on the range size n and how the offending IPs are split between the two /(n+1) subranges) to detect ranges that should be blocked, then issues a log line that is picked up by fail2ban to manage bans of increasing length for repeat offenders (recidive).

It’s quite contrived, and I often fear it will be too aggressive and block something I rely on, but it has been working really well in my experience.

It will initially block a lot of small ranges, but over time the ranges will grow larger. Smaller ranges having a lower threshold helps it block only the narrowest ranges needed, which gives some time for larger ranges that contain them to drop out of fail2ban’s watchlist.
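To make the idea concrete, here is a minimal sketch of that kind of range heuristic. It is not the actual script: the `threshold` rule, the `min_share` balance check, and every function name are assumptions made up for illustration, and it only handles IPv4. Reading the fail2ban log and writing the log line that fail2ban picks up are left out.

```python
import ipaddress
from collections import Counter


def threshold(prefix_len: int) -> int:
    # Hypothetical rule: a /28 needs 3 banned IPs, and each step toward a
    # larger range (/27, /26, ... /8) needs one more.
    return 3 + (28 - prefix_len)


def ranges_to_block(banned_ips, min_share=0.25):
    """Return the narrowest IPv4 ranges (/28 to /8) that look worth blocking.

    A range qualifies when it holds enough banned IPs for its size and
    those IPs are spread over both of its /(n+1) halves. min_share is an
    assumed cutoff for how lopsided the split may be.
    """
    ips = {ipaddress.ip_address(ip) for ip in banned_ips}

    # Count banned IPs per enclosing network. /29 is included so that the
    # two halves of a /28 can be looked up as well.
    counts = Counter()
    for ip in ips:
        for n in range(8, 30):
            counts[ipaddress.ip_network(f"{ip}/{n}", strict=False)] += 1

    candidates = []
    for net, total in counts.items():
        n = net.prefixlen
        if n > 28 or total < threshold(n):
            continue
        halves = [counts.get(sub, 0) for sub in net.subnets(prefixlen_diff=1)]
        # If one half is (nearly) empty, the narrower half is the better
        # candidate, so skip this range.
        if min(halves) >= min_share * total:
            candidates.append(net)

    # Keep only the narrowest ranges: drop any candidate that contains
    # another candidate.
    narrowest = [
        net for net in candidates
        if not any(other != net and other.subnet_of(net) for other in candidates)
    ]
    return sorted(narrowest)


if __name__ == "__main__":
    banned = ["203.0.113.1", "203.0.113.2", "203.0.113.9",
              "203.0.113.10", "198.51.100.7"]
    for net in ranges_to_block(banned):
        print(net)
```

In this sketch, four banned addresses spread across both halves of 203.0.113.0/28 flag that /28. The enclosing /27 is skipped because all of its hits sit in one half. The single ban in 198.51.100.0/24 never reaches any threshold.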

I should clean up this mess and make it a git repo, maybe even try to have it merged into fail2ban.

amateurcrastinator@lemmy.world · 19 days ago

I am curious to know more!

thericofactor@sh.itjust.works · 19 days ago

So we’re at the point where AI is not only stealing intellectual property, but also driving up costs for people while doing it.

MysticKetchup@lemmy.world · 19 days ago

At this point we need to treat AI web scrapers as DDoS attacks and prosecute the companies and people involved the same way we would prosecute any other DDoS attacker.