Last time I heard about Mastodon being expensive to operate, the explanation was that it's the act of pushing that puts load on the server and takes it down, typically not the act of serving. It sounds like serving the static content is relatively easy CPU-wise. But it sounds like the pushing code is still based on a one-thread-per-request model that thrashes the CPU with context switches and doesn't work very well unless you have a whole lot of cores.
Something like GoToSocial that has proper non-blocking IO would be a lot cheaper to operate on a smaller computer, but bandwidth will always be an issue, I guess.
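To make the thread-per-request vs. non-blocking difference concrete, here is a rough sketch in Python with asyncio. It is only an illustration, not how Mastodon or GoToSocial actually implement delivery. The names `deliver`, `fan_out` and `push_to_followers` and the `example` inbox URLs are made up, and the network call is faked with a short sleep. The point is just that one thread can keep many pushes in flight while they wait on the network.

```python
import asyncio


async def deliver(inbox: str, payload: str, delay: float = 0.01) -> str:
    """Hypothetical stand-in for an HTTP POST to a remote inbox.

    The sleep simulates time spent waiting on the network.
    """
    await asyncio.sleep(delay)
    return f"delivered to {inbox}"


async def fan_out(inboxes: list[str], payload: str, max_in_flight: int = 100) -> list[str]:
    """Push payload to every inbox from a single thread, with bounded concurrency."""
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(inbox: str) -> str:
        # At most max_in_flight deliveries wait on the network at once.
        async with limit:
            return await deliver(inbox, payload)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(inbox) for inbox in inboxes))


def push_to_followers(count: int = 1000) -> list[str]:
    """Simulate pushing one post to `count` made-up remote inboxes."""
    inboxes = [f"https://server{i}.example/inbox" for i in range(count)]
    return asyncio.run(fan_out(inboxes, "hello fediverse"))


if __name__ == "__main__":
    results = push_to_followers()
    print(len(results), "deliveries completed")
```

With a thread per delivery, those 1000 waits would be 1000 threads for the OS to schedule. Here they are 1000 small coroutines on one thread, which is why this style tends to be fine on a small machine. It does nothing for the bandwidth, though.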
It sounds like you are talking about making a CDN, kind of like how PeerTube can have videos "seeded" by random volunteers on the net?
Such a thing could even be layered on top with JavaScript, without needing to modify the server.