re: Tools and abuse vectors
@lawremipsum@mspsocial.net The situation with archive.org specifically is a complex one (there's also a long history of robots.txt being used *by* abusers to get their own records pulled from the Wayback Machine and so evade accountability, and that's not just hypothetical), but what surprises me more here is that the Wayback Machine apparently doesn't follow robots.txt?
The last time I checked, they refused to serve up anything disallowed by a robots.txt, even when the robots.txt was added *after* the crawl, and their crawler always clearly identified itself, precisely for abuse/consent reasons. Did this change?
re: Tools and abuse vectors
@joepie91 @lawremipsum@mspsocial.net https://m.noxie.ch/@Ariane/109481280019500332 notes that robots.txt did not seem to work at all