@internetarchive is there a way to verify whether a crawler with an ArchiveTeam user-agent is actually operating on your behalf?

I am currently using the method described in this GitHub discussion (github.com/internetarchive/her) to detect and ban scrapers that spoof Googlebot and Bingbot user-agent strings, but it doesn't catch some of the bots that crawled my site(s) today.

I would like to allow the Internet Archive to preserve copies of my pages, but without a way to validate crawler authenticity, allowlisting that user-agent leaves a hole that AI scrapers can abuse.
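For context, the standard Googlebot/Bingbot check referenced above works by a reverse DNS (PTR) lookup on the crawler's IP, verifying the resulting hostname falls under the operator's domain, and then forward-resolving that hostname to confirm it maps back to the same IP. A minimal sketch of that two-step check is below; note that the trusted-suffix list is illustrative (Google and Microsoft publish theirs, but ArchiveTeam does not publish an official verification domain, which is exactly the gap this post is asking about):

```python
import socket

# Illustrative only: Google and Bing document these suffixes for crawler
# verification. ArchiveTeam has no published equivalent (the point of the
# question above), so there is nothing authoritative to put here for it.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")

def hostname_is_trusted(hostname: str) -> bool:
    """True if the reverse-DNS hostname falls under a trusted domain."""
    return hostname.endswith(TRUSTED_SUFFIXES)

def verify_crawler_ip(ip: str) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-confirm.

    Both steps are required: a PTR record alone is controlled by whoever
    owns the IP block, so it can claim any hostname. Forward-confirming
    that the hostname resolves back to the same IP closes that hole.
    """
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # PTR lookup
    except OSError:
        return False
    if not hostname_is_trusted(hostname):
        return False
    try:
        _, _, addrs = socket.gethostbyname_ex(hostname)  # forward lookup
    except OSError:
        return False
    return ip in addrs
```

A request claiming a Googlebot user-agent would then be banned unless `verify_crawler_ip(request_ip)` returns True; an ArchiveTeam user-agent can't be gated this way until a verification domain exists.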

Making a custom fedi instance and then posting a request for help with a math problem, but federating a different problem to every instance so that people will argue in the replies.
