ok lol my synapse-media-proxy project in progress is apparently the new code thing I brainstorm about in bed and under the shower

wonder if this structure will work nicely, a local component that runs on the Synapse machine with filesystem and database access, and a remote component that caches files and handles user requests

nice, I have basic file upload to the same memory cache as other served/proxied files working now :)

now working on access token validation for upload, which will need some work on the <local> server bit too, accessing the synapse db for things.
But this standin works, much secure such wow :P

- proper database access token validation
- saving uploaded files to <local> server disk
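
A minimal sketch of what the token handling around such a stand-in could look like, assuming the usual Matrix convention that clients send their token either in an `Authorization: Bearer` header or as an `access_token` query parameter. `extractAccessToken` and `makeStandInValidator` are hypothetical names, not the project's actual code, and the fixed token list only stands in for the real Synapse database lookup.

```javascript
// Hypothetical helper: pull a Matrix access token out of a request.
// `headers` and `query` are plain objects (e.g. req.headers / req.query in express).
function extractAccessToken(headers, query) {
  const auth = headers["authorization"];
  if (typeof auth === "string") {
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (match) return match[1];
  }
  if (typeof query.access_token === "string" && query.access_token !== "") {
    return query.access_token;
  }
  return null; // no token supplied at all
}

// Stand-in validator (assumption): accepts a fixed set of tokens
// instead of asking the Synapse database.
function makeStandInValidator(knownTokens) {
  const known = new Set(knownTokens);
  return (token) => token !== null && known.has(token);
}
```

Swapping the stand-in for the real thing would then only mean replacing the function returned by `makeStandInValidator` with one that queries the database on <local>.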

added progress explanation so far to readme as well

did a bunch of code cleanup and refactoring, tomorrow hoping to get started on thumbnailing, and url previews after that, which would finish the spec compliance for the media repo :)

nice nice nice file uploads work very well now, properly stored where Synapse would normally expect them to be, and with a listing in the database

now to limit the in-memory cache on <remote> and check up the federation spec, and then /upload and /download are fully implemented

I love how the server-server spec for the media repo is "yeah just call their client-server endpoint" which is also how i implemented it already

ah fuck i really stumbled into a bad wormhole doing punycode domain testing, seems neither Synapse, Conduit nor matrix-media-repo can fetch media from those properly (and I can :3)

when the punycode is in the url literally like mxc://xn--puny--59d2hgc.dev.cthu.lu/testmedia, Synapse still fails, but Conduit and matrix-media-repo do fetch it correctly.
None of them do url decoding of the mxc right however..
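
A rough sketch of the mxc handling being tested here: split the URI into server name and media id, percent-decode both, and build the download path that gets called on the remote server. `parseMxc` and `downloadPath` are hypothetical helpers, and decoding both parts with `decodeURIComponent` is my assumption of what "url decoding the mxc right" means, not something the spec text is quoted for.

```javascript
// Hypothetical helper: split an mxc:// URI into its two parts.
function parseMxc(mxc) {
  const match = /^mxc:\/\/([^/]+)\/([^/?#]+)$/.exec(mxc);
  if (!match) throw new Error(`not a valid mxc uri: ${mxc}`);
  return {
    // Assumption: both parts may be percent-encoded and get decoded here.
    serverName: decodeURIComponent(match[1]),
    mediaId: decodeURIComponent(match[2]),
  };
}

// Build the client-server download path for the parsed parts,
// re-encoding them so they are safe inside a URL path.
function downloadPath({ serverName, mediaId }) {
  return `/_matrix/media/r0/download/${encodeURIComponent(serverName)}/${encodeURIComponent(mediaId)}`;
}

console.log(downloadPath(parseMxc("mxc://xn--puny--59d2hgc.dev.cthu.lu/testmedia")));
// prints "/_matrix/media/r0/download/xn--puny--59d2hgc.dev.cthu.lu/testmedia"
```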

nevermind i can't be bothered anymore, I'll just make it remove the oldest entry...

the most basic of caches, it's a Map with an array tracking access order, removing the oldest accessed item from the map when it's about to get bigger than maxEntries
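
Roughly what that looks like as code, as a sketch rather than the project's real implementation: `SimpleCache`, `touch` and the field names are made up for illustration, only `maxEntries` comes from the description above. It assumes `maxEntries` is at least 1.

```javascript
// Sketch: a Map for the data plus an array that tracks access order.
class SimpleCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
    this.order = []; // keys, least recently accessed first
  }

  // Move a key to the "most recently accessed" end of the order array.
  touch(key) {
    const index = this.order.indexOf(key);
    if (index !== -1) this.order.splice(index, 1);
    this.order.push(key);
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    this.touch(key);
    return this.entries.get(key);
  }

  set(key, value) {
    // About to grow past maxEntries: drop the least recently accessed key first.
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      const oldest = this.order.shift();
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
    this.touch(key);
  }
}
```

The `indexOf` scan makes every access linear in the number of entries, which is fine for a small cache. Since a `Map` already iterates in insertion order, deleting and re-setting a key on access would be one way to drop the separate array later.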

hmm debug() logging says trans rights? and the Validator says you're valid too!

now just have to wait for some responses from a few homeservers asking if I could add their servernames to my example.js, to show off the different flows and the method response you get

swapping out client discovery for server discovery in synapse-media-proxy makes it so I actually follow spec correctly there :)

some basic memory usage reporting, but memory management is an enigma so can't really see immediate free-ing when removing stuff from cache etc

soo good to just come across a library that does what you need to *perfectly*. I was messing about with regexes to parse Content-Disposition stuff, and with this library I can do both the parsing and formatting sooo much nicer (and it's used by express.js so it's Good(tm))
npmjs.com/package/content-disp

I have a nice ttl invalidating cache for server lookups, and the content-disposition lib is fully integrated
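
For the curious, a ttl cache like that can be very small. This is a sketch under my own assumptions (entries expire lazily on read, one ttl for the whole cache), and `TtlCache` with its injectable `now` clock is a hypothetical shape, not the actual code:

```javascript
// Sketch: entries carry an expiry timestamp and are dropped when read too late.
class TtlCache {
  // `now` is injectable so the expiry logic can be tested without waiting.
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // stale: invalidate and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

A miss (or an expired entry) is then the signal to redo the server lookup and `set` the fresh result.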

I also used vscode's incredible git integration to split those two changes into 2 commits after I had written both, with the suuper good visual cherry-picking of lines to commit

think I'll set up a test synapse-media-proxy soon(tm) but I'd accompany it with a testing synapse instance too, think NixOS should make it real easy to get that part up and running quick, and then I can get some real-world speedtests by just throwing test media links around :P

Monday though i suppose... i should really learn at least a bit for that fuckin midterm first

best thing about synapse-media-proxy development was looking a lot at fokshat.jpg in full-res tbh (and some other test images)

I ❤️ well made npm libraries, `sharp` accepts both buffers and streams (directly from a remote media proxy), and JUST WORKS

And now you just get a proper error when trying to thumbnail an unsupported file (like a .txt lol), instead of crashing the server with an uncaught error :")

git.pixie.town/f0x/synapse-med

also lol I should fix that useragent, it's supposed to take the version from the package.json
"SynapseMediaProxy/undefined"

/_matrix/media/r0/download/im_a/teapot now returns a picture of the Utah Teapot, with http status 418

url previews will be fun since I can specialcase a few types of urls (like youtube) that give totally unusable results currently (just a "Before you continue" instead of the title)

got started on the test deployment, great to do so with NixOS.
Already discovered and fixed some bugs but now turns out Synapse still won't serve my injected media so that needs more investigation tomorrow :/

aaaaa I got an absolute superthought under the shower on how to speed up concurrent access of non-cached media but I have a fucking meeting first before I can implement it aaaaa

currently an upstream request stream gets piped to the first requestor, and to a buffer for the later cache, but instead I should store a reference to the stream immediately so it can be piped to new requestors right away as well, while it's still in progress!

ok nice nice time to get this implemented before next meeting at 12:10

ok subscribing to streams when they become available works, subscribing to an already existing stream doesn't because some of the data will already have been read out of it (and thus removed).
And it seems having multiple subscribers on the same stream isn't ideal either, as varying network speeds/stream consumption would give a similar issue, hmmm

I think I can do a cool stream splitting thing with late-joins but it'll be a bit more complex (and I have a (short) meeting in 20 mins..)

I did the proper thing and looked at existing implementations! and there's a module to split a stream to multiple consumers (nice), but nothing that keeps a buffer to backfill late-joiners. This will integrate *perfectly* with my current architecture because I'm already saving the whole stream into a buffer anyways (for later cache serves)

so:
- first request comes in, upstream starts streaming to the first client
- second client requests that file while it's still streaming, it gets a new stream with the buffer up till now + then the new data
- upstream request finishes
- new clients get the whole cached buffer
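
The flow above can be sketched with plain Node streams. This is an illustration under assumptions, not the implementation: `SharedDownload`, `subscribe` and `buffer` are hypothetical names, every client gets its own `PassThrough`, and there is no backpressure handling, so a slow client simply has its data queued in memory (which is tolerable here only because the whole file is being kept in a buffer anyway).

```javascript
const { PassThrough } = require("node:stream");

// Sketch: fan one upstream stream out to many clients, backfilling late joiners.
class SharedDownload {
  constructor(upstream) {
    this.chunks = []; // everything received so far, later the cache entry
    this.finished = false;
    this.error = null;
    this.subscribers = new Set();

    upstream.on("data", (chunk) => {
      this.chunks.push(chunk);
      for (const sub of this.subscribers) sub.write(chunk);
    });
    upstream.on("end", () => {
      this.finished = true;
      for (const sub of this.subscribers) sub.end();
      this.subscribers.clear();
    });
    upstream.on("error", (err) => {
      this.error = err;
      for (const sub of this.subscribers) sub.destroy(err);
      this.subscribers.clear();
    });
  }

  // Returns a fresh stream: first the buffer up till now, then the live data.
  subscribe() {
    const out = new PassThrough();
    if (this.error) {
      out.destroy(this.error);
      return out;
    }
    for (const chunk of this.chunks) out.write(chunk);
    if (this.finished) out.end();
    else this.subscribers.add(out);
    return out;
  }

  // The complete file once upstream has finished, for later cache serves.
  buffer() {
    return Buffer.concat(this.chunks);
  }
}
```

Because the backfill writes and the `subscribers.add` happen synchronously inside `subscribe()`, no upstream chunk can slip in between them, so a late joiner sees the same bytes in the same order as the first client.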

this sounds dangerously like I know what I'm doing, we'll see if my coding proves that wrong

next I do probably want to add some disk caching too so it's not all memory based

and prometheus metrics

and url previews ofc

synapse-media-proxy serving files well :3
backed by an actual Synapse here, running on my new NixOS homeserver
plant:
media.pixie.town/_matrix/media

@f0x don’t forget to write the idea down…

@f0x@social.pixie.town hell yeah that sounds great