Oh, for fuck's sake.
I routed the Internet Archive's bots (by user agent and by name) to iocaine, because they don't respect /robots.txt. Looks like they're still able to archive at least some of my things.
Time to dig into the logs again, 'cos this won't do.
@joepie91 I am sure, yes. They come from IA's IP range, and they do not respect robots.txt; they stopped doing so in 2017.
Case in point, from when I tried to take a capture just now:
;> _time:2h request.host:chronicles.mad-scientist.club classification.user_agent:"Internet Archive" | keep request.uri;
{ "request.uri": "/tales/a-season-on-iocaine/" }
In other words: the only URL IA requested is the one I gave it. It did not even attempt to fetch robots.txt.
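If you want to run the same check on your own logs, here's a minimal sketch in Python. It assumes the query results are exported as one JSON object per line with a `request.uri` field, like the output above. The function name `requested_robots_txt` is made up for illustration and is not part of iocaine or any log tool.

```python
import json


def requested_robots_txt(log_lines):
    """Return True if any JSON log record requested /robots.txt.

    Assumption: each line is one JSON object with a "request.uri" key,
    matching the shape of the query output shown above.
    """
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Drop any query string before comparing the path.
        path = record.get("request.uri", "").split("?", 1)[0]
        if path == "/robots.txt":
            return True
    return False


# The single record returned for the capture attempt.
records = ['{ "request.uri": "/tales/a-season-on-iocaine/" }']
print(requested_robots_txt(records))
```

With only the one record from the capture attempt, no line matches /robots.txt, which is the point: the crawler never asked for it.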