**‍fuchsiaaaaaaaaaaaaaaaaa** @f0x@pixie.town · Jun 21, 2023, 18:47


Blocklist scraping by fash

So this has been an ongoing issue. Would love it if people found the earlier threads about it for more context, because I don't have the spoons right now.

There is a tool, originally written by "mint" and hosted on the kiwifarms git, that continuously scrapes publicized instance blocklists so people can search who has them blocked (resulting in emails like "uwu we did nothing wrong how dare you block our instance").

Through correlation, it turns out the main IP used by fba.ryona.agency is `54.37.233.246`. Blocking that at the firewall level prevents them from getting any new data.

Other instances of the tool exist too though, hosted on:
`23.24.204.110`, `45.86.70.49`, `88.65.6.124`, `187.190.192.31`

The "kromonos" user (drow.be / bka.li / teleyal.blog / mooneyed.de) has their own version, which feeds an API that gives your instance a highscore for blocking their shit. It scrapes from `185.244.192.119`, with user agents presenting as random instances.

These, and other scraper-ish IPs, are also listed in https://git.pixie.town/f0x/nixos/src/branch/main/nodes/aura/configuration.nix#L103
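For anyone not on NixOS, here is a rough Python sketch that turns the IPs named in this thread into firewall drop commands. It assumes plain `iptables` with an `INPUT` chain, which may not match your setup; `drop_rules` is a made-up helper name, and the script only prints the commands rather than running them.

```python
import ipaddress

# IPs named in this thread (all IPv4); the linked configuration.nix lists more.
SCRAPER_IPS = [
    "54.37.233.246",    # main IP seen for fba.ryona.agency
    "23.24.204.110",
    "45.86.70.49",
    "88.65.6.124",
    "187.190.192.31",
    "185.244.192.119",  # the "kromonos" scraper
]


def drop_rules(ips):
    """Return one iptables drop command per address (hypothetical helper)."""
    rules = []
    for ip in ips:
        # Parsing first catches typos: ip_address raises ValueError on bad input.
        addr = ipaddress.ip_address(ip)
        rules.append(f"iptables -I INPUT -s {addr} -j DROP")
    return rules


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    print("\n".join(drop_rules(SCRAPER_IPS)))
```

Review the printed commands before running them as root, and adapt them if you use nftables or a hosting provider's firewall instead.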

#FediBlock #MastoAdmin

**‍fuchsiaaaaaaaaaaaaaaaaa** @f0x@pixie.town · 2023-06-21T19:34:39Z

re: Mitigating blocklist scraping by fash

Quite an interesting workaround: the kiwifarms scraper is configured to not follow HTTP redirects, so by adding one you can make them give up, while legit users can still view the page without issues.

https://git.pixie.town/f0x/nixos/src/branch/main/nodes/aura/services/nginx.nix#L202-L215
That adapts my nginx setup to redirect `/about/more` to `/about/much-more`.
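To show the idea outside of nginx, here is a minimal Python sketch of the same redirect using only the standard library. It is not the linked config: the `REDIRECTS` table, `redirect_target`, `Handler`, `serve`, the 302 status and the port are all assumptions for illustration, and the placeholder page body stands in for the real about page.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping; pick your own unique target path.
REDIRECTS = {"/about/more": "/about/much-more"}


def redirect_target(path):
    """Return the redirect destination for a request path, or None."""
    # Ignore any query string when matching the path.
    return REDIRECTS.get(path.split("?", 1)[0])


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # A client that does not follow redirects stops here.
            self.send_response(302)
            self.send_header("Location", target)
            self.end_headers()
            return
        # Placeholder for the real page, served at every other path.
        body = b"blocklist page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting `/about/more` with a client that does not follow redirects returns only the 302 and its `Location` header, while a browser follows it and lands on the page.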

Of course a scraper could go to `/about/much-more` directly now, but if we all pick something unique, that's impossible to hardcode for. And if they *do* start following redirects, we could introduce honeypot instances that redirect all over the place, disrupting the scrape (which all happens in sequence across domains btw, so a domain that keeps it busy delays everything queued after it).
