re: Searchtodon meta, scraping related
His stated goal is to run this as an 'experiment' to 'have this conversation', but in my opinion that could've happened (and was already happening) without publishing a tool, or at the very least making people explicitly **opt-in** to indexing of their toots like this
Searchtodon meta, scraping related
I mentioned these concerns in the announcement thread but wanted to reiterate them here separately. https://social.pixie.town/@f0x/109677581893916570
It does a lot of things right, and advertises itself as built with privacy and consent in mind.
However, while a user's search results are limited to content they could've otherwise seen pass by in their home timeline, all these toots are stored and indexed on the central Searchtodon server, indefinitely.
This means he technically has access to the combined timelines of all the users, and unlike public content scrapers **also followers-only and even DM posts** sent by **any user a Searchtodon user is following**.
There's only an opt-*out* mechanism based on setting your profile to be non-search-engine-indexible, or including a few specific hashtags.
Without opting out though **all your toots** will be stored if *any* of your followers use this tool.
While this for now remains just a technical possibility, with him stating he has no intent of misusing it, there is no way to guarantee this now or in the future, or when this data changes hands (sold off or hacked).
A services like this could have merit, but should absolutely be hosted by yourself or your own instance, since it already has control over all this data, meaning there's no extra party to trust.
@janl@chaos.social Mastodon is larger than Eugen though, and there's actually a sizeable part of the community that also doesn't agree with other (privacy-related) choices Eugen makes, like the Mastodon 4.x hugely exposed client API.
It's good to have a conversation about this with the community, but that goes much better when there isn't a published tool that is (perceived as) an active threat.
Opt-out is not enough, this needs an active opt-in, like a recognizable hashtag in bio.
@janl@chaos.social Right, I get that you currently have no interest to do so, and I'm inclined to believe you, but this is what it gives you access to, now or in the indefinite future.
Thing is though, Mastodon already has Elasticsearch support, and it's an explicit choice to scope the search functionality to just your own toots, hashtags and posts you've actually interacted with, not just the ones you could've seen passing by.
It's like this because of social reasons, not technical issues that need solving, and that's what all the scraper- and adjacent projects seem to get wrong.
@janl@chaos.social I think you have mostly the right idea, but because this isn't self-hostable, it does effectively give your single server a very large amount of data as a combination from all the user's home timelines.
There's no way to verify or guarantee that isn't being used for other purposes (now or in the future). Worse than public-scrapers even, this also gives you access to followers-only and dm content (and the account, through the wider permissions Elk requests)
@tastytea@very.tastytea.de it's not really a crawler, and they seem to have mostly the right idea, but due to it being a single centralized service, with them storing all the content indefinitely, it does effectively give *them* access to a much wider view of the fediverse, even if it's properly scoped for users to all the content they could've seen pass by in their timeline
@violet_cybrespace shit i gotta pay late fees again
also more hyping myself up like: there's a reason i'm refactoring this code, but the quality ain't all that bad. i can still understand it after a few months :")
re: Instance block recommendation
also lmao at the anarchy symbol as the favicon, you really don't get it, do you
Instance block recommendation
#FediBlock impeccable.social, single-user Pleroma/Soapbox instance, feed is full of boosting other shitty instances like noagendasocial.com and gleasonator.com
found them through a reply recommending Soapbox (eww) and Wildebeest (cloudflare eww) software
@Roger okay way to call me, personally, out like that
⚠️ READ BEFORE FOLLOWING ⚠️
if i don't know you from elsewhere (under same nick), shoot me an introductory DM first (following back is fine)
I do anarchist tech stuff and run free services at https://pixie.town
I program, solder rgb led thingies, and fly fpv quadcopters
en: they/them
nl: die/dies (langzaldieleven.nl)
“i don't trust like that”
not a furry, actually
Extreme coffee-out-of-a-wineglass Energy
something something trans list stop scraping bios
and now a word from our sponsors (screenreader warning it's zalgo)
T̀ͧ̓̑͐̓̍̂̏҉̴̷͚̦̤͙̜̖͙̝͟ợ̵͈̗̮̲̥͕̼̩̭̞̙͉̆ͮͧ̉̒́̑̍̋ͭ̌ͭ̒̉́̕͟ ̐̅̈́ͯ҉̸̴҉̹̟͕̖̠̟̤͕į̸̙̮͓̤̠̘̫̦̥̣̻͚̣̎ͭͯ̋̉͝n̔̄̏̈́̃̇͛̂̋̇̐́͘͝҉͙͔̠͇̖̤̹̭̱̪v̴̴̛̘̠̰̹͚̱͉̳̘̥̞̳̪͈ͥͭ̅ͥͦ̀͛̔̃̃̎͋̋̎͐͌ͪ̚͟͢ͅö́́̎ͬ̔͑̆̃̅̒̿ͪͯ̓͏̞̱̜͍̬̗̹̫̝̪͓͕̳̬̰͘͝kͥ̒ͣͦ̌͛̃͒̀̿ͣͪͤͬ̍ͮ̚̚̕͝҉̹̰̟̰̻̻͍̠̗̳̬̬̬̞̟̹̩͇́͜ẹ̴̡̨̱̹͍̯̱̗̗͍̬̐ͣ̑͑̐̓̈̑ͥ̅́̇̃͒̀̃̂́ ̨̛͖̬͇̣͔̼̥̬̝̥̣̭̝̪͎͈̌̅͆̉̀͘͜ͅẗ́̄͊̌̍̆́̿́̊ͣͮ̅ͥͩ̔̏͏̧̳͎̥͈ͅh̴̴͇̻ͧ̍̐̈͐̎͛́̀̽̃̒̔͢͢ȩ̸̶̶̟̗̮̺̭̥͕̭͎̺̙͎̖͔ͪ̑͛̓̅ͪ̄́ͧ͡ͅ ̡̧͇̤͚̻̬͉͔̥̫̟̙ͮͩ͌̿́̆͋͜h̵̨̭̰͎̭̱͊͒́͒͆̎ͮ̈́̆ͪͧ̚͞î̛̦̞͓͖̭͈̮͔̩͙̱̖̞̳̥̦̩ͭ̂̏͒ͨ̃̿̽̓͑ͫ̕͝͡vͧ͋ͪ̌̂̑́͌̂̒͑ͮ̋̂ͫ̈́҉̹͜͢ȩ̡̖̯̞̺̭̗͔͇̻̤̼͈̙̞͉͙̈ͤ͊ͨ̀̆͆͒̓̄̿ͭ̃̚͜͝͡-̶̪̪̠̝̜̯̜̹̭̯͎͍̲̱͉ͪ̏͒̊ͫ̀̈͘͡m̸̪̘͙̰͚̗̳͕̟̖̿̌͐̔̐̈̽̃ͯ̅͢ͅͅi̸̷̧̛͍̝̦̫̮̤̐͑͗̏ͬn̡̨͆ͩͤͫ̔̈́̈́͊͐̂͛̀̚͞҉̜͍̝̰̱͚̜̹̞̝̞͈d̢̫͕͚͕̥̰̝͆͗́ͨ͑̈́̓͜ ̡̩̜͎̳͎͂̓ͫͭ͐̀͡ȑ̷ͭ̑ͪͭ͋͢͏͕̳̟͜ͅͅe̴͌̅ͣ̾͒̔́̊̔ͭ̅̄̇͏͎͉͈̤̙p̀ͥ̈ͨͩ͛ͥͣ͗̄̈́̚҉̢͔͉͍̹̮͉̺r̵̸̡̩͎̱̟̺̟̞͈̯̯̪̹͂́ͣ̐͑̒̒̀ͧͩ̿ͮ̕͞ě̵̡̱͈̜̯̳͍̝̦̜̫͈̜̗̘̪̪̓͆͑͋ͮͯͪ̅̂͐̔̆̃ͫ͑̾͒͢ͅş̶͓͉͚̜̪̜͓̘̻̃̔ͨ́̀ͅẻ̵͇͈̮̝̠͖͍̫͉͓̪̠͔̬͕͛̊͐̎̓̽ͫ̌ͧ̅̿́͘n̛͚̺͈͍̰͉͙̤̘̺͖͉̤͖̈͑͑̍̅ͪ̎͂́ͦ̒ͣ̋̆̄̄̍̃̊͟t̵̛͙͚̥͇̫̻̞͖͕̰͈̩̰̱͉ͣ̃ͫ̋̍̈ͥ͗̎ͭ͋͜i̵̡̤͇̣̰̦̟̭̮̩̲͔̭̟̖̹̙ͥ̆̋ͫ̓͌̒̾̍̄̾̎̂͂̏̇ͩ̚͢n̶̮̹̤̻͈̙͔͎̦̟ͫ̀͌͛̋̌̽̀̓̂̕g̷̣͖̠̩͈̲̥͍̦̘̺̏̍͛͋̎͛͒ͪ̇ͮ͠͝ ͦ͂́̿͐̅̌̊̌̉̍̀҉҉͈͖̮̩͎̮̬͖c͖̬̠̫̠̫̗̉̾͋͒̏̄̈́ͬ̊̓͘͝h̴̷̨͉͖̱̗̪̣͕̮͓͕̺͖͈͙̥̬͓̟ͣ̏̀͐̀́̍ͪ̋͒͐ͪ͐́̕a͍͈͉͎̥̠͍͛ͭ͛̃ͫ͒͋́͟ö͙̻͔̙͖̰́̋̑́͜s̶̸̫̖̫͇̣̻̺̹͔ͧ͐̂̈́ͮ͋̌͠.̰̯̞͎̗̺̠͔̫͍̖ͮͦ̒̏̈̾ͭͧ̉͘͢͠