So one thing I'm noticing with my search engine crawler is that the vast majority of robots.txt rejects come from... platforms run by Twitter and Facebook.

Not personal sites. Not Mastodon instances. Nope, it's primarily Twitter and Facebook who blanket-refuse access to a new search engine crawler.
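
For anyone wondering what a blanket refusal looks like from the crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `NewCrawler` user agent, and the `example.com` URL are all made up for illustration; this is not a copy of any real platform's file, just the general allowlist pattern where named crawlers get in and everyone else hits `Disallow: /`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist-style robots.txt (an assumption for illustration,
# not taken from any real site): one named crawler is let in, and every
# other user agent is refused everything.
ALLOWLIST_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/page"
    for agent in ("Googlebot", "NewCrawler"):
        print(agent, is_allowed(ALLOWLIST_ROBOTS_TXT, agent, url))
```

With a file shaped like this, a crawler nobody has heard of falls through to the `User-agent: *` group and is refused every path, no matter how politely it behaves.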
