#AskFedi: the usual explanation for why #IPFS splits files into small-ish chunks is that it's more efficient because it can deduplicate across files.
But is there any actual data on what the *real-world* efficiency gains are in real-world usage of IPFS? Does this actually make a meaningful difference in practice?
The more I try to look into the question of "why does IPFS chunk files", the less convinced I am of its usefulness: https://github.com/ipfs/notes/issues/300 (but maybe I'm missing something?)
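(For clarity, the claim I'm asking about is chunk-level deduplication, i.e. something like this toy sketch — fixed-size chunks and SHA-256 purely for illustration, not actual IPFS code; I believe the real default chunker uses 256 KiB fixed-size chunks, but the exact details don't matter here:)

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # illustrative chunk size

def chunk_hashes(data: bytes) -> list[str]:
    """Split data into fixed-size chunks and hash each one."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

# Two files that share their first megabyte:
shared_prefix = b"A" * (4 * CHUNK_SIZE)
file_a = shared_prefix + b"ending one"
file_b = shared_prefix + b"ending two"

hashes_a = chunk_hashes(file_a)
hashes_b = chunk_hashes(file_b)

# The first four chunk hashes are identical, so a block store keyed by
# chunk hash only needs one copy of the shared prefix.
shared = set(hashes_a) & set(hashes_b)
print(f"{len(shared)} of {len(hashes_a)} chunks shared")  # -> 4 of 5 chunks shared
```

My question is how often real files actually overlap like this in practice, once you account for chunk alignment, compression, and so on.)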
@joepie91 uneducated guess: it seems mostly to come out of its torrent origins? where you prefer small-ish chunks so you can verify and invalidate them quickly, and have to redownload as little as possible
@eater Yeah but you can do that with range requests, and that saves you a lot of DHT entries
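(Rough sketch of what I mean: publish one hash for the whole file, and do partial fetches with ordinary HTTP range requests. The gateway host and hash below are made up for illustration, this is not a real endpoint:)

```python
import urllib.request

# Hypothetical: the file is published under a single content hash, and a
# partial fetch is just an HTTP range request, instead of resolving a
# separate DHT record per chunk. Host and hash are placeholders.
content_hash = "EXAMPLEHASH"  # placeholder, not a real hash
req = urllib.request.Request(
    f"https://gateway.invalid/file/{content_hash}",
    headers={"Range": "bytes=0-262143"},  # first 256 KiB only
)
with urllib.request.urlopen(req) as resp:
    part = resp.read()
    print(resp.status, len(part))  # expect 206 Partial Content, 262144 bytes
```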
@joepie91 oh god, all chunks have their own hash?
@joepie91 the whole point of torrent hashes is that it's a hash of the set of hashes of all chunks (bit of an abridged version)
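roughly like this (simplified sketch only; a real BitTorrent v1 infohash is the SHA-1 of the whole bencoded info dict, which contains the piece hashes among other fields, not just a hash of the piece hashes):

```python
import hashlib

PIECE_SIZE = 256 * 1024

def piece_hashes(data: bytes) -> list[bytes]:
    """SHA-1 of every fixed-size piece, like the 'pieces' field in a torrent."""
    return [
        hashlib.sha1(data[i:i + PIECE_SIZE]).digest()
        for i in range(0, len(data), PIECE_SIZE)
    ]

def root_hash(data: bytes) -> str:
    """One top-level hash over the concatenated piece hashes (simplified)."""
    return hashlib.sha1(b"".join(piece_hashes(data))).hexdigest()

payload = b"some file contents" * 100_000
print(root_hash(payload))
# Each piece can still be verified against its own hash while downloading,
# but only the single root hash needs to be published and looked up.
```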
@joepie91 Append-only logs / data sets?
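(spelling out why that case might benefit: with fixed-size chunks, appending only touches the trailing chunk, so every earlier chunk keeps its hash and gets shared between old and new versions. Toy illustration again, same caveats as before, not real IPFS behaviour:)

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # illustrative chunk size

def chunk_hashes(data: bytes) -> set[str]:
    """Set of hashes of fixed-size chunks."""
    return {
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    }

log_v1 = b"log line\n" * 200_000            # ~1.8 MB of log data
log_v2 = log_v1 + b"a few new lines\n" * 3  # same log after an append

old, new = chunk_hashes(log_v1), chunk_hashes(log_v2)
# Only the trailing (partial) chunk differs; everything before it is
# stored once and shared between both versions.
print(f"shared: {len(old & new)}, changed/new: {len(new - old)}")  # -> shared: 6, changed/new: 1
```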
(Note that I am only asking about deduplication here, not about parallelization of downloads and such, because there are other ways to achieve that)