
now working on access token validation for upload, which will need some work on the <local> server bit too, accessing the synapse db for things.
But this standin works, much secure such wow :P


did a bunch of code cleanup and refactoring, tomorrow hoping to get started on thumbnailing, and url previews after that, which would finish the spec compliance for the media repo :)


nice nice nice, file uploads work very well now, properly stored where Synapse would normally expect them to be, and with a listing in the database


now to limit the in-memory cache on <remote> and check up on the federation spec, and then /upload and /download are fully implemented


I love how the server-server spec for the media repo is "yeah just call their client-server endpoint" which is also how i implemented it already


ah fuck i really stumbled into a bad wormhole doing punycode domain testing, seems neither Synapse, Conduit nor matrix-media-repo can fetch media from those properly (and I can :3)


when the punycode is in the url literally like mxc://xn--puny--59d2hgc.dev.cthu.lu/testmedia, Synapse still fails, but Conduit and matrix-media-repo do fetch it correctly.
None of them do url decoding of the mxc right however..


nevermind i can't be bothered anymore, I'll just make it remove the oldest entry...


the most basic of caches, it's a Map with an array tracking access order, removing the oldest accessed item from the map when it's about to get bigger than maxEntries
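
A minimal sketch of a cache like the one described, assuming "oldest accessed" means least recently accessed. The class and method names (`AccessOrderCache`, `touch`) are made up for illustration and are not taken from synapse-media-proxy.

```javascript
// Hypothetical sketch, not the project's actual code.
class AccessOrderCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // key -> value
    this.order = [];          // keys, least recently accessed first
  }

  // Move `key` to the "most recently accessed" end of the order array.
  touch(key) {
    const index = this.order.indexOf(key);
    if (index !== -1) this.order.splice(index, 1);
    this.order.push(key);
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    this.touch(key);
    return this.entries.get(key);
  }

  set(key, value) {
    // About to grow past maxEntries: drop the least recently accessed key.
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      const oldest = this.order.shift();
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
    this.touch(key);
  }
}

const cache = new AccessOrderCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' is now the most recently accessed
cache.set('c', 3); // cache is full, so 'b' gets removed
```

The `indexOf` scan makes every access linear in the number of entries, which is fine for a small cache. A `Map` alone also remembers insertion order, so deleting and re-inserting a key on access would be one way to drop the array.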


hmm debug() logging says trans rights? and the Validator says you're valid too!


:O I think I implemented the whole spec.matrix.org/unstable/serve Matrix server discovery flow!!! It's rather complex, with .well-knowns and SRV records and combinations of those.
Also nice that I could fork off the client-spec counterpart already made by someone else :) npmjs.com/package/@modular-mat
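
To give an idea of how those combinations fit together, here is a simplified sketch of the lookup order as I understand it from the server-server spec: IP literal or explicit port first, then `.well-known`, then SRV, then port 8448. It is not the code from the package mentioned above. `resolveServer`, `splitHostPort`, the `via` labels and the two injected lookup functions are assumptions for illustration, and it skips caching, SRV priority/weight handling and the TLS/Host header rules.

```javascript
const net = require('net');

// Split "host", "host:port" or a bracketed IPv6 literal like "[::1]:8448".
function splitHostPort(name) {
  if (typeof name !== 'string') throw new Error('server name must be a string');
  const match = /^(\[[^\]]+\]|[^:]+)(?::(\d+))?$/.exec(name);
  if (!match) throw new Error(`invalid server name: ${name}`);
  const host = match[1];
  const port = match[2] === undefined ? null : Number(match[2]);
  const isIp = net.isIP(host.replace(/^\[|\]$/g, '')) !== 0;
  return { host, port, isIp };
}

// fetchWellKnown(hostname) should resolve to the parsed JSON of
// https://<hostname>/.well-known/matrix/server (or throw).
// resolveSrv(name) should resolve to [{ name, port }, ...] (or throw),
// the same shape dns.promises.resolveSrv returns.
async function resolveServer(serverName, { fetchWellKnown, resolveSrv }) {
  const direct = splitHostPort(serverName);
  // IP literal or explicit port: no discovery needed.
  if (direct.isIp || direct.port !== null) {
    return { host: direct.host, port: direct.port ?? 8448, via: 'direct' };
  }

  let target = direct;
  let via = 'fallback';
  try {
    const wellKnown = await fetchWellKnown(direct.host);
    const delegated = splitHostPort(wellKnown['m.server']);
    if (delegated.isIp || delegated.port !== null) {
      return { host: delegated.host, port: delegated.port ?? 8448, via: 'well-known' };
    }
    target = delegated; // delegated hostname without port: still try SRV
    via = 'well-known';
  } catch {
    // No usable .well-known: carry on with the original hostname.
  }

  try {
    const records = await resolveSrv(`_matrix._tcp.${target.host}`);
    if (records.length > 0) {
      const srvVia = via === 'well-known' ? 'well-known+srv' : 'srv';
      return { host: records[0].name, port: records[0].port, via: srvVia };
    }
  } catch {
    // No SRV record: fall through to the default port.
  }
  return { host: target.host, port: 8448, via };
}
```

Passing the two lookups in as functions keeps the flow testable without a network. In real use `resolveSrv` could be `require('dns').promises.resolveSrv`, and `fetchWellKnown` any HTTPS client that returns the parsed JSON body.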


now just have to wait for some responses from a few homeservers asking if I could add their servernames to my example.js, to show off the different flows and the method response you get


swapping out client discovery for server discovery in synapse-media-proxy makes it so I actually follow spec correctly there :)


some basic memory usage reporting, but memory management is an enigma so can't really see immediate freeing when removing stuff from cache etc


soo good to just come across a library that does what you need to *perfectly*. I was messing about with regexes to parse Content-Disposition stuff, and with this library I can do both the parsing and formatting sooo much nicer (and it's used by express.js so it's Good(tm))
npmjs.com/package/content-disp


I have a nice ttl invalidating cache for server lookups, and the content-disposition lib is fully integrated
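
For reference, a TTL-invalidating cache can be as small as the sketch below. This is an assumption about the general shape, not the project's code: `TtlCache`, the lazy invalidate-on-read behaviour and the injectable clock are all illustrative choices.

```javascript
// Hypothetical sketch: entries expire `ttlMs` after they were stored.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;           // injectable clock, handy for tests
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // stale: invalidate lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Remember a (made-up) server lookup result for one minute.
const lookups = new TtlCache(60 * 1000);
lookups.set('example.org', { host: 'matrix.example.org', port: 443 });
lookups.get('example.org'); // the stored object while fresh, undefined once expired
```

Entries that are never read again stay in the map until something deletes them, so a long-running process would probably also want a periodic sweep or a size limit.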


I also used vscode's incredible git integration to split those two changes into 2 commits after I had written both, with the suuper good visual cherry-picking of lines to commit


think I'll set up a test synapse-media-proxy soon(tm) but I'd accompany it with a testing synapse instance too, think NixOS should make it real easy to get that part up and running quick, and then I can get some real-world speedtests by just throwing test media links around :P

Monday though i suppose... i should really learn at least a bit for that fuckin midterm first


best thing about synapse-media-proxy development was looking a lot at fokshat.jpg in full-res tbh (and some other test images)


I ❤️ well made npm libraries, `sharp` accepts both buffers and streams (directly from a remote media proxy), and JUST WORKS


And now you just get a proper error when trying to thumbnail an unsupported file (like a .txt lol), instead of crashing the server with an uncaught error :")

git.pixie.town/f0x/synapse-med


also lol I should fix that useragent, it's supposed to take the version from the package.json
"SynapseMediaProxy/undefined"


/_matrix/media/r0/download/im_a/teapot now returns a picture of the Utah Teapot, with http status 418


url previews will be fun since I can specialcase a few types of urls (like youtube) that give totally unusable results currently (just a "Before you continue" instead of the title)


got started on the test deployment, great to do so with NixOS.
Already discovered and fixed some bugs but now turns out Synapse still won't serve my injected media so that needs more investigation tomorrow :/


aaaaa I got an absolute superthought under the shower on how to speed up concurrent access of non-cached media but I have a fucking meeting first before I can implement it aaaaa


currently an upstream request stream gets piped to the first requestor, and to a buffer for the later cache, but instead I should store a reference to the stream immediately so it can be piped to new requestors immediately as well, while it's still in progress!


ok nice nice time to get this implemented before next meeting at 12:10


ok subscribing to streams when they come available works, subscribing to an already existing stream doesn't because some of the data will already be read-out from it (and thus removed).
And seems having multiple subscribers to the same stream isn't ideal either as varying network speeds/stream consumption would give a similar issue, hmmm


I think I can do a cool stream splitting thing with late-joins but it'll be a bit more complex (and I have a (short) meeting in 20 mins..)


I guess this is the second yakshaving time where I really dive deep into the internals of a Node subsystem (last time it was the module system, resulting in npmjs.com/package/@require-tra)


I did the proper thing and looked at existing implementations! and there's a module to split a stream to multiple consumers (nice), but nothing that keeps a buffer to backfill late-joiners. This will integrate *perfectly* with my current architecture because I'm already saving the whole stream into a buffer anyways (for later cache serves)

so:
- first request comes in, upstream starts streaming to the first client
- second client requests that file while it's still streaming, it gets a new stream with the buffer up till now + then the new data
- upstream request finishes
- new clients get the whole cached buffer
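
The steps above can be sketched roughly like this. It is a guess at the shape of the idea, not the code that ended up in the repo: `BackfillBroadcast` and `collector` are made-up names, sinks are anything with `write(chunk)` and `end()`, and backpressure and upstream errors are ignored to keep it short.

```javascript
// Hypothetical sketch: fan one upstream download out to many clients,
// replaying the already-received part to anyone who joins late.
class BackfillBroadcast {
  constructor() {
    this.chunks = [];       // everything received so far (later the cache entry)
    this.sinks = new Set(); // clients still waiting for live data
    this.finished = false;
  }

  // Attach a client, e.g. an http.ServerResponse or a stream.PassThrough.
  subscribe(sink) {
    // Backfill: replay what already arrived, in order.
    for (const chunk of this.chunks) sink.write(chunk);
    if (this.finished) {
      sink.end();           // upstream is done: a plain cache serve
    } else {
      this.sinks.add(sink); // otherwise keep feeding it live chunks
    }
  }

  // Call for every chunk of the upstream response ('data' event).
  push(chunk) {
    this.chunks.push(chunk);
    for (const sink of this.sinks) sink.write(chunk);
  }

  // Call when the upstream response ends ('end' event).
  finish() {
    this.finished = true;
    for (const sink of this.sinks) sink.end();
    this.sinks.clear();
  }

  // The full body, for the long-term cache.
  toBuffer() {
    return Buffer.concat(this.chunks);
  }
}

// Tiny in-memory sink so the example runs without a network.
function collector() {
  const parts = [];
  return {
    write: (chunk) => { parts.push(chunk.toString()); return true; },
    end: () => { parts.push('<end>'); },
    text: () => parts.join(''),
  };
}

const broadcast = new BackfillBroadcast();
const first = collector();
broadcast.subscribe(first);          // first request, upstream starts streaming
broadcast.push(Buffer.from('fok'));
const late = collector();
broadcast.subscribe(late);           // joins mid-download, gets 'fok' backfilled
broadcast.push(Buffer.from('shat'));
broadcast.finish();                  // upstream request finishes
const cached = collector();
broadcast.subscribe(cached);         // new client gets the whole buffer
```

Because every client is fed by explicit `write` calls instead of reading from a shared stream, nobody can consume data out from under anyone else. The trade-off is that slow clients rely on whatever buffering their own sink does.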


this sounds dangerously like I know what I'm doing, we'll see if my coding proves that wrong


good news: I did not really know what I was doing!

but now it is done, another biiiiig refactor commit with the new streams architecture git.pixie.town/f0x/synapse-med


next I do probably want to add some disk caching too so it's not all memory based

and prometheus metrics

and url previews ofc


hope I have time to implement metrics soon and then I'll upload an image to some busy Matrix room and see it fetched by a billion other homeservers


lol you can definitely see when I started testing things (aura is the <remote> component, cosmos the <local> server at home)
stats.pixie.town/d/stats/node-


servers with literally just constant prometheus traffic have such pleasing straight network graphs


the stats dashboard is interesting, there's a lot of people clicking the media link (or browsers prefetching it?) from the This Week In Matrix article i presume

stats.pixie.town/d/rPBvoh6Gk/s


or i guess scrolling through the TWIM room backlog and their servers fetching it from there, maybe i should collect requesting hs names if that's in some header


ranty about Synapse 

and fucking hell knowing how easy that was, I'm so fucking disappointed in how absolutely terrible Synapse's previews are. Even though the API results are named after OpenGraph tags THEY DON'T ACTUALLY USE OPENGRAPH but instead do some actually wack parsing so you get 0 usable info out of tons of sites, whereas they serve you all you have to know on a fucking platter in their opengraph tags....


@f0x You're not sending caching headers, or is that intentional?

@erikk haven't added those yet, no. dunno how useful those even are when remote users (the majority) will fetch it through their own media repo, which caches it indefinitely anyways

@erikk synapse sends cache-control public,max-age=86400,s-maxage=86400 (1 day) I guess I could just copy that

@f0x it downloads basically instantly.... are you a wizard?

@dumpsterqueer it's just served directly from the cheap hetzner box, which has soo much better internet than my home :D

and it's even cached in RAM there :)

@f0x I don't know what any of this means but it looks cool

@anarchiv it's a complement to my Matrix server, which is hosted at home through not so great internet.

This alleviates a lot of the slowness by taking the spikes from image/video download on a second, much smaller server which has better internet
