
I love how the server-server spec for the media repo is "yeah just call their client-server endpoint", which is also how I implemented it already

ah fuck, I really stumbled into a bad wormhole doing punycode domain testing; seems neither Synapse, Conduit nor matrix-media-repo can fetch media from those properly (and I can :3)

When the punycode is literally in the URL, like mxc://xn--puny--59d2hgc.dev.cthu.lu/testmedia, Synapse still fails, but Conduit and matrix-media-repo do fetch it correctly.
None of them do URL decoding of the mxc right, however..
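For reference, the mxc handling itself is tiny; here's a minimal sketch of splitting and percent-decoding an mxc URI (the function name and exact rules are my own illustration, not spec wording):

```js
// Minimal sketch: split an mxc:// URI into server name and media ID,
// percent-decoding both parts. Punycode server names (xn--...) pass
// through unchanged, since they're already ASCII.
function parseMxc(mxc) {
  const match = /^mxc:\/\/([^/]+)\/([^/]+)$/.exec(mxc);
  if (!match) throw new Error(`not a valid mxc URI: ${mxc}`);
  return {
    serverName: decodeURIComponent(match[1]),
    mediaId: decodeURIComponent(match[2]),
  };
}

console.log(parseMxc('mxc://xn--puny--59d2hgc.dev.cthu.lu/testmedia'));
// => { serverName: 'xn--puny--59d2hgc.dev.cthu.lu', mediaId: 'testmedia' }
```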

Nevermind, I can't be bothered anymore; I'll just make it remove the oldest entry...

the most basic of caches: a Map with an array tracking access order, removing the oldest-accessed item from the Map when it's about to get bigger than maxEntries
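Roughly this shape, as a sketch of the idea rather than the actual synapse-media-proxy code:

```js
// Sketch of the "most basic of caches": a Map holding the entries plus an
// array tracking access order; the least-recently-accessed key is evicted
// when the cache would grow past maxEntries.
class BasicCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
    this.accessOrder = []; // oldest-accessed key first
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    // move the key to the end of the access order
    this.accessOrder.splice(this.accessOrder.indexOf(key), 1);
    this.accessOrder.push(key);
    return this.entries.get(key);
  }

  set(key, value) {
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      const oldest = this.accessOrder.shift(); // evict the oldest-accessed entry
      this.entries.delete(oldest);
    }
    if (this.entries.has(key)) {
      this.accessOrder.splice(this.accessOrder.indexOf(key), 1);
    }
    this.accessOrder.push(key);
    this.entries.set(key, value);
  }
}
```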

hmm debug() logging says trans rights? and the Validator says you're valid too!

:O I think I implemented the whole spec.matrix.org/unstable/serve Matrix server discovery flow!!! It's rather complex, with .well-knowns and SRV records and combinations of those.
Also nice that I could fork off the client-spec counterpart already made by someone else :) npmjs.com/package/@modular-mat
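Very roughly, the discovery flow looks like this; a simplified sketch of the spec's steps, not the actual library code (it skips IP literals, explicit ports in the server name, and the nested SRV lookup for delegated names, and assumes Node 18+ for the global fetch()):

```js
// Simplified sketch of Matrix server-name resolution: try the
// /.well-known/matrix/server delegation first, then an SRV lookup,
// then fall back to the server name itself on port 8448.
const dns = require('dns').promises;

async function resolveServer(serverName) {
  // 1. .well-known delegation
  try {
    const res = await fetch(`https://${serverName}/.well-known/matrix/server`);
    if (res.ok) {
      const wellKnown = await res.json();
      const delegated = wellKnown['m.server'];
      if (delegated) return delegated.includes(':') ? delegated : `${delegated}:8448`;
    }
  } catch (e) { /* no or broken .well-known, keep going */ }

  // 2. SRV record for _matrix._tcp.<server name>
  try {
    const [srv] = await dns.resolveSrv(`_matrix._tcp.${serverName}`);
    if (srv) return `${srv.name}:${srv.port}`;
  } catch (e) { /* no SRV record, keep going */ }

  // 3. default: the server name itself on port 8448
  return `${serverName}:8448`;
}
```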

Now I just have to wait for responses from a few homeservers I asked whether I could add their server names to my example.js, to show off the different flows and the response you get from each method

swapping out client discovery for server discovery in synapse-media-proxy makes it so I actually follow spec correctly there :)

Some basic memory usage reporting, but memory management is an enigma, so I can't really see immediate freeing when removing stuff from the cache etc.

soo good to just come across a library that does what you need *perfectly*. I was messing about with regexes to parse Content-Disposition stuff; with this library I can do both the parsing and formatting sooo much nicer (and it's used by express.js so it's Good(tm))
npmjs.com/package/content-disp
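For anyone curious, the usage is about this simple (my own minimal example, not code from the proxy):

```js
// Minimal usage example of the content-disposition package: formatting a
// header for a download response and parsing one coming from upstream.
const contentDisposition = require('content-disposition');

// formatting: handles quoting and non-ASCII filenames (filename*) for you
console.log(contentDisposition('fokshat.jpg'));
// => 'attachment; filename="fokshat.jpg"'

// parsing: no hand-rolled regexes needed
console.log(contentDisposition.parse('attachment; filename="fokshat.jpg"'));
// => { type: 'attachment', parameters: { filename: 'fokshat.jpg' } }
```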

I have a nice TTL-invalidating cache for server lookups, and the content-disposition lib is fully integrated
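The TTL cache is conceptually just "store the value with an expiry timestamp"; a sketch of the idea (again my own illustration, not the actual implementation):

```js
// Sketch of a TTL-invalidating cache for server discovery results: each
// entry remembers when it expires, and expired entries are treated as
// missing (and dropped) on lookup.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // expired: invalidate lazily
      return undefined;
    }
    return entry.value;
  }
}
```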

I also used VS Code's incredible git integration to split those two changes into 2 commits after I had written both, with the suuper good visual cherry-picking of lines to commit

Think I'll set up a test synapse-media-proxy soon(tm), but I'd accompany it with a testing Synapse instance too. NixOS should make it real easy to get that part up and running quickly, and then I can get some real-world speed tests by just throwing test media links around :P

Monday though, I suppose... I should really study at least a bit for that fuckin midterm first

best thing about synapse-media-proxy development was looking a lot at fokshat.jpg in full-res tbh (and some other test images)

I ❤️ well made npm libraries, `sharp` accepts both buffers and streams (directly from a remote media proxy), and JUST WORKS

And now you just get a proper error when trying to thumbnail an unsupported file (like a .txt lol), instead of crashing the server with an uncaught error :")
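As a rough illustration (my own minimal example; the real thumbnailing code surely does more):

```js
// Minimal sharp example: thumbnail a buffer, and turn unsupported input
// (like a .txt file) into a catchable error instead of an uncaught crash.
const sharp = require('sharp');

async function thumbnail(inputBuffer, width, height) {
  try {
    return await sharp(inputBuffer)
      .resize(width, height, { fit: 'inside' })
      .jpeg()
      .toBuffer();
  } catch (err) {
    // sharp rejects on unsupported formats; map that to a proper HTTP error
    throw new Error(`cannot thumbnail this media: ${err.message}`);
  }
}
```

And since a sharp instance without an input is also a duplex stream, the same resize pipeline can be piped into directly from an upstream response.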

git.pixie.town/f0x/synapse-med

also lol I should fix that user agent, it's supposed to take the version from the package.json
"SynapseMediaProxy/undefined"

/_matrix/media/r0/download/im_a/teapot now returns a picture of the Utah Teapot, with HTTP status 418
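Which is basically a one-route joke; sketched here with an Express-style handler (Express and the teapot.jpg filename are my assumptions, not necessarily how the proxy does it):

```js
// Easter-egg route sketch, assuming an Express-style app; teapot.jpg is a
// hypothetical local file with the Utah Teapot render.
const express = require('express');
const path = require('path');

const app = express();

app.get('/_matrix/media/r0/download/im_a/teapot', (req, res) => {
  res.status(418).sendFile(path.join(__dirname, 'teapot.jpg'));
});
```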

URL previews will be fun, since I can special-case a few types of URLs (like YouTube) that currently give totally unusable results (just a "Before you continue" instead of the title)

Got started on the test deployment, great to do that with NixOS.
Already discovered and fixed some bugs, but now it turns out Synapse still won't serve my injected media, so that needs more investigation tomorrow :/

aaaaa I got an absolute superthought in the shower on how to speed up concurrent access of non-cached media but I have a fucking meeting first before I can implement it aaaaa

Currently an upstream request stream gets piped to the first requestor, and to a buffer for the later cache, but instead I should store a reference to the stream immediately, so it can be piped to new requestors as well while it's still in progress!
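The "pipe to the first requestor and into a buffer at the same time" part is just a tee; a simplified sketch of the idea (names are mine, not the actual code):

```js
// Sketch: tee the upstream stream so it reaches the first requestor's
// response and a buffer for the later cache at the same time.
const { PassThrough } = require('stream');

function teeToResponseAndBuffer(upstream, res) {
  const chunks = [];
  const collector = new PassThrough();
  collector.on('data', (chunk) => chunks.push(chunk));

  upstream.pipe(res);        // first requestor gets the data live
  upstream.pipe(collector);  // ...while the same data is collected for the cache

  return new Promise((resolve, reject) => {
    collector.on('end', () => resolve(Buffer.concat(chunks)));
    upstream.on('error', reject);
  });
}
```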

ok nice nice time to get this implemented before next meeting at 12:10

Ok, subscribing to streams when they become available works; subscribing to an already existing stream doesn't, because some of the data will already have been read out of it (and thus removed).
And it seems having multiple subscribers to the same stream isn't ideal either, as varying network speeds/stream consumption would give a similar issue, hmmm

I think I can do a cool stream splitting thing with late-joins but it'll be a bit more complex (and I have a (short) meeting in 20 mins..)

I guess this is the second yak-shaving session where I really dive deep into the internals of a Node subsystem (last time it was the module system, resulting in npmjs.com/package/@require-tra)

I did the proper thing and looked at existing implementations! And there's a module to split a stream to multiple consumers (nice), but nothing that keeps a buffer to backfill late joiners. This will integrate *perfectly* with my current architecture, because I'm already saving the whole stream into a buffer anyways (for later cache serves)

so, the resulting flow (sketched in code after the list):
- first request comes in, upstream starts streaming to the first client
- second client requests that file while it's still streaming, it gets a new stream with the buffer up till now + then the new data
- upstream request finishes
- new clients get the whole cached buffer
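A sketch of that late-join splitting, assuming chunks are kept in an array anyway for the eventual cache entry (names are mine, not the actual module; backpressure is ignored for brevity):

```js
// Sketch of the late-join splitting: buffered chunks are replayed to new
// subscribers first, then live chunks are forwarded until upstream ends.
const { PassThrough } = require('stream');

class SplittableDownload {
  constructor(upstream) {
    this.chunks = [];       // everything received so far (becomes the cache entry)
    this.subscribers = [];  // streams of requestors that joined mid-download
    this.finished = false;

    upstream.on('data', (chunk) => {
      this.chunks.push(chunk);
      for (const sub of this.subscribers) sub.write(chunk);
    });
    upstream.on('end', () => {
      this.finished = true;
      for (const sub of this.subscribers) sub.end();
    });
  }

  // returns a readable stream for a (possibly late-joining) requestor
  subscribe() {
    const out = new PassThrough();
    for (const chunk of this.chunks) out.write(chunk); // backfill what was missed
    if (this.finished) {
      out.end(); // upstream already done, the whole buffer was just replayed
    } else {
      this.subscribers.push(out);
    }
    return out;
  }
}
```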

this sounds dangerously like I know what I'm doing, we'll see if my coding proves that wrong

good news: I did not really know what I was doing!

but now it is done, another biiiiig refactor commit with the new streams architecture git.pixie.town/f0x/synapse-med


Next I probably do want to add some disk caching too, so it's not all memory-based

and Prometheus metrics

and URL previews ofc

synapse-media-proxy serving files well :3
Backed by an actual Synapse here, running on my new NixOS homeserver.
plant: media.pixie.town/_matrix/media

hope I have time to implement metrics soon and then I'll upload an image to some busy Matrix room and see it fetched by a billion other homeservers

lol you can definitely see when I started testing things (aura is the <remote> component, cosmos the <local> server at home)
stats.pixie.town/d/stats/node-

Servers with literally just constant Prometheus traffic have such pleasingly straight network graphs

The stats dashboard is interesting; there's a lot of people clicking the media link (or browsers prefetching it?) from the This Week In Matrix article, I presume

stats.pixie.town/d/rPBvoh6Gk/s

Or I guess scrolling through the TWIM room backlog and their servers fetching it from there; maybe I should collect requesting homeserver names if that's in some header

so, I implemented most of URL previewing yesterday :3

ranty about Synapse 

and fucking hell, knowing how easy that was, I'm so fucking disappointed in how absolutely terrible Synapse's previews are. Even though the API results are named after OpenGraph tags, THEY DON'T ACTUALLY USE OPENGRAPH, but instead do some actually wack parsing, so you get 0 usable info out of tons of sites, whereas those sites serve you everything you need to know on a fucking platter in their OpenGraph tags....

compared to glorious synapse-media-proxy just giving you title + description + thumbnail image (and I haven't special-cased *anything*)
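To illustrate how little is needed when a site actually provides OpenGraph tags, a rough sketch (using cheerio purely as an example HTML parser and the Node 18+ global fetch(); both are my assumptions, not claims about what the proxy uses):

```js
// Rough OpenGraph extraction sketch: most of a usable URL preview is just
// three meta tags.
const cheerio = require('cheerio');

async function previewUrl(url) {
  const html = await (await fetch(url)).text();
  const $ = cheerio.load(html);
  const og = (name) => $(`meta[property="og:${name}"]`).attr('content');

  return {
    title: og('title') ?? $('title').text(), // fall back to the <title> tag
    description: og('description'),
    image: og('image'),                      // thumbnail candidate
  };
}
```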

I'm going to replace Synapse one microservice at a time (no, fuck no, I won't)

synapse-media-proxy update:

URL previewing is finished!

there's now a room at -media-proxy:pixie.town, come say hi :)

- open test instance at media.pixie.town
- metrics stats.pixie.town/d/rPBvoh6Gk/s

TWIM submission for later today as well :)

That recent increase in traffic was because I uploaded this collection of cat pics from my DSLR, each like 5-10 MB

using cat pictures to spice up your media testing

@f0x You're not sending caching headers, or is that intentional?

@erikk haven't added those yet, no; dunno how useful they even are when remote users (the majority) will fetch it through their own media repo, which caches it indefinitely anyways

@erikk Synapse sends Cache-Control: public, max-age=86400, s-maxage=86400 (1 day); I guess I could just copy that
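Copying that would be a one-liner on the download response (Express-style response object assumed, purely as illustration):

```js
// Sketch: mirror Synapse's caching behaviour on media download responses.
res.set('Cache-Control', 'public, max-age=86400, s-maxage=86400'); // cache for 1 day
```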

@f0x it downloads basically instantly.... are you a wizard?

@dumpsterqueer just having it served directly from the cheap Hetzner box helps; it has soo much better internet than my home :D

and it's even cached in RAM there :)

@f0x I don't know what any of this means but it looks cool

@anarchiv it's a complement to my Matrix server, which is hosted at home on not-so-great internet.

This alleviates a lot of the slowness by handling the spikes from image/video downloads on a second, much smaller server which has better internet
