I wonder if something like Tahoe-LAFS (in convergent/deduplicated mode) or even Garage wouldn't significantly reduce the storage requirements for small fediverse instances.

The idea is to maintain a collective pool of media storage that everyone shares (Tahoe requires less mutual trust here), so that media gets deduplicated across instances while the pool still keeps enough redundant copies that an outage at one instance doesn't break everyone else.

@joepie91 this would be super interesting. I guess with Garage each site could point its DNS at its own node and let that node route requests to whoever has the copy. You can tag nodes with a location, and IIRC it's reasonably good about taking latency into account.

Trust is probably the biggest issue: once you go beyond 3 nodes you won't have a full copy locally, and you can't guarantee the data stays available if enough remote nodes go down or lose their data.

@rune I haven't kept up to date with the current model of Garage, but Tahoe-LAFS is basically RAID-over-the-network. Every client can verify that enough shares exist, and regenerate and reupload the missing shares if some file is not 'healthy' enough. That's the mechanism to prevent data loss (so that minimal trust is required for availability).
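To make the check-and-repair idea concrete, here is a minimal sketch of the decision a client makes after counting reachable shares. It is not Tahoe-LAFS code; the function names and the three status labels are made up for illustration, and `needed`/`total` are just the k-of-N share counts.

```python
def check_health(shares_found: int, needed: int, total: int) -> str:
    """Classify a file by how many distinct shares are still reachable.

    Hypothetical helper: `needed` is the number of shares required to
    rebuild the file, `total` is the number of shares originally uploaded.
    """
    if shares_found < needed:
        return "unrecoverable"        # too few shares left to rebuild the file
    if shares_found < total:
        return "needs repair"         # still readable, but redundancy is degraded
    return "healthy"


def shares_to_regenerate(shares_found: int, needed: int, total: int) -> int:
    """How many shares a repairing client would have to re-create and upload."""
    if shares_found < needed:
        raise ValueError("file cannot be rebuilt, so missing shares cannot be regenerated")
    return total - shares_found
```

With 3-of-10 encoding, a file with 6 reachable shares is still readable; any client that notices can rebuild the 4 missing shares and upload them to other nodes, which is why no single storage node has to be trusted to stay around.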


@joepie91 Garage would offer you 3 replicas, which should be plenty, but it does sound like Tahoe-LAFS is more geared towards untrusted setups.

But Garage could definitely work for groups of trusted instances.

@rune Yeah, that was more or less my conclusion when I last evaluated Garage; that it's only really suitable for high-trust groups.

Tahoe doesn't have a fixed number of replicas; you set the total number of shares and the number of shares needed for recovery, as an attribute of the upload. These settings do need to be the same for everyone for dedupe to work, though.
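A rough sketch of why the settings have to match: with convergent encryption the key is derived from the content itself, so identical uploads end up as identical ciphertext. The sketch assumes the share counts (and an optional shared secret) are mixed into that derivation. This is an illustration only, not Tahoe's actual key derivation, and `convergent_key` is a hypothetical name.

```python
import hashlib


def convergent_key(plaintext: bytes, needed: int, total: int,
                   shared_secret: bytes = b"") -> bytes:
    """Derive an encryption key from the content plus the encoding settings.

    Illustrative only: the same file uploaded with the same settings yields
    the same key (and so the same ciphertext, which can be deduplicated).
    """
    h = hashlib.sha256()
    # Length-prefix each field so different inputs can't collide by concatenation.
    for field in (shared_secret, f"{needed}/{total}".encode(), plaintext):
        h.update(len(field).to_bytes(8, "big"))
        h.update(field)
    return h.digest()


image = b"same cat picture federated to two instances"
key_a = convergent_key(image, needed=3, total=10)   # instance A
key_b = convergent_key(image, needed=3, total=10)   # instance B, same settings
key_c = convergent_key(image, needed=2, total=6)    # instance C, different settings
```

Here `key_a` and `key_b` are equal, so both instances would produce the same shares and only one set needs storing, while `key_c` differs and instance C's upload would be stored as a separate file.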

(Storage overhead is basically totalShares divided by neededShares)
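As a quick worked example of that ratio (the 3-of-10 and 100 MB figures below are just example numbers, and the function names are made up):

```python
def storage_overhead(total_shares: int, needed_shares: int) -> float:
    """Expansion factor: bytes stored across the pool per byte of original data."""
    return total_shares / needed_shares


def tolerated_losses(total_shares: int, needed_shares: int) -> int:
    """How many shares can disappear before the file becomes unrecoverable."""
    return total_shares - needed_shares


overhead = storage_overhead(10, 3)      # 3-of-10 encoding
pool_usage_mb = 100 * overhead          # space a 100 MB file takes up across the pool
spare = tolerated_losses(10, 3)
print(f"{overhead:.2f}x, {pool_usage_mb:.0f} MB total, survives losing {spare} shares")
```

So 3-of-10 costs about 3.33x the original size but survives losing 7 shares, whereas three plain replicas cost 3x and survive losing 2 copies.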
