Finally (slowly) making progress on my cooperative storage cluster project

The idea is to have a distributed storage system that lets you set up a cooperative storage cluster between multiple semi-trusted parties, with RAID-like striping across participants (so more efficient than duplication), but with all data being encrypted and independently verifiable, so it's resistant to stuff like hacked systems. It's strongly inspired by Tahoe-LAFS (but makes some different design choices for practical reasons).
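To make the "more efficient than duplication" point concrete, here is a minimal sketch of RAID-like striping with a single XOR parity share. It is only an illustration, not necessarily the scheme this project uses (Tahoe-LAFS, for comparison, uses a more general erasure code), and the function names `split_with_parity` and `rebuild_missing` are made up for this example.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, n_data: int) -> List[bytes]:
    """Split data into n_data equal chunks plus one XOR parity chunk.

    Hypothetical helper: each returned share would go to a different
    participant. The data is zero-padded to a multiple of n_data.
    """
    chunk_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(chunk_len * n_data, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(n_data)]
    parity = bytes(chunk_len)
    for chunk in chunks:
        parity = xor_bytes(parity, chunk)
    return chunks + [parity]


def rebuild_missing(shares: List[Optional[bytes]]) -> List[bytes]:
    """Recover at most one lost share (marked None) from the others."""
    missing = [i for i, s in enumerate(shares) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one lost share")
    if not missing:
        return list(shares)
    length = len(next(s for s in shares if s is not None))
    acc = bytes(length)
    for s in shares:
        if s is not None:
            acc = xor_bytes(acc, s)
    restored = list(shares)
    restored[missing[0]] = acc
    return restored


# Four data shares plus one parity share: 1.25x storage instead of the
# 2x that plain duplication would need, and any one share can be lost.
shares = split_with_parity(b"hello world!", 4)
damaged = list(shares)
damaged[2] = None  # one participant disappears
recovered = rebuild_missing(damaged)
original = b"".join(recovered[:4])
```

In a real system the shares would be encrypted and hashed before being handed out; this sketch leaves that out to keep the overhead arithmetic visible.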

I got stuck on a seemingly trivial problem: how to make the hashes of different storage objects verifiable while also allowing retrieval of a file when the only thing you have is its decryption key.

The problem was that you can't have something addressed by both its content hash *and* a hash of its decryption key (for a decryption process that's several lookup steps removed from the initial one).
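A small sketch of that tension, under the assumption that both kinds of address are plain SHA-256 hashes. The names `content_address`, `key_address` and the `b"locator:"` prefix are hypothetical, and this shows only the problem, not the design change that resolves it.

```python
import hashlib


def content_address(stored_bytes: bytes) -> str:
    """Address that any storage node can check against the bytes it holds."""
    return hashlib.sha256(stored_bytes).hexdigest()


def key_address(decryption_key: bytes) -> str:
    """Address that a key holder can compute without having the data."""
    return hashlib.sha256(b"locator:" + decryption_key).hexdigest()


def verify(address: str, stored_bytes: bytes) -> bool:
    """What a semi-trusted node can do: re-hash and compare."""
    return content_address(stored_bytes) == address


key = b"k" * 32                          # stand-in for a real decryption key
ciphertext = b"opaque encrypted bytes"   # stand-in for the encrypted object

by_content = content_address(ciphertext)
by_key = key_address(key)

# The content address is verifiable by anyone holding the bytes...
assert verify(by_content, ciphertext)
# ...but the key-derived address is a different value, so an object filed
# under it cannot be checked by re-hashing what is stored.
assert not verify(by_key, ciphertext)
```

Someone who only has the key can compute `by_key` but not `by_content`, while a node that only has the ciphertext can check `by_content` but not `by_key`; a single object naively cannot be filed under both.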

Or well, I *thought* you can't, but it turns out that with a small design change, you can in fact do that


Anyway, the idea here is to have a distributed storage system that can be run on spare, mismatched storage space by a group of friends, or a group of sysadmins, or whoever else, and be resistant against compromise and censorship. For personal data storage, but definitely also for things that are meant for public access.

Which seems to have rapidly gotten a lot more relevant since I started on this project...


(If all goes as expected, it should also be a neat way for different fedi instances to share the storage cost of static media among each other, instead of everyone needing to keep full copies, because it's a deduplicated storage system)

@joepie91 I always wanted to have (or to make) something like that. "I have 100 GB to spare, let me store 80 GB of my data distributed with 20 GB of redundancies to spare"

@joepie91 what is the difference between this and (my current favorite object storage system) Garage?

garagehq.deuxfleurs.fr/

@jonah Mainly that Garage does not have strong resilience against malicious participants; it's a high-trust system towards all participants (or at least it was when I last evaluated it).

That's fine for a lot of cases, but as conditions worsen, it becomes more important to have such resiliency, and it's not something you can trivially patch into the design after the fact, unfortunately.

@jonah ("Malicious participants" here does not just include people who seek to join the cluster with malicious intentions, but also people with honest intentions who are compromised through their system getting hacked, stolen by cops, infiltrated, and so on)

@joepie91 yeah, fair, I only use it on my own machines, but I know they designed it for a collective of people. Definitely still a high-trust situation either way though, so a solution for this sounds super cool (probably not for me specifically, but in general) 👀
