Just wanted to share this: these are my $15 Monoprice headphones, which I bought in ~2014. They have a 1/8th inch audio jack instead of an integrated cable. Both sides of the plastic structure have snapped and been repaired with superglue. The speaker cones inside the headphones have come loose and been repaired. The cable has been replaced.
Today I finally laid these headphones to rest. I detached my Antlion Audio magnetic mic mount from them and reattached it to a new pair of headphones. RIP.
Demo Day at Layer Zero is starting soon! Just streaming some dashcam footage right now while we wait
Also, it goes without saying, but it's ridiculous that if I want this capability, the only way to get it is to bootleg it like this.
Google has finally built technology that can transcribe spoken conversation, but they want to hoard it behind proprietary APIs and services. This bootleg is a small glimpse of what technology could be like if its goal was to provide utility instead of just making money: accessibility technology that actually works!
Ok, ok, maybe that's a bit of a grandiose claim for something like this which is just barely demo-able and still has tons of difficult fundamental problems to overcome... But the point is, this capability would probably be commonplace already if it were open technology. The only reason this is remarkable is because I went through the effort of hacking it together with DACs on both sides, special audio cables, Android UI testing libraries, and heaps of good old-fashioned software duct tape. And despite all that, it still outperforms the "Open" state of the art like "OpenAI" Whisper, while using 1/10th of the energy.
Also, here is the massive alt-text I wrote for the video :P A thorough description of wtf is going on in this video:
Video of a multi-user audio transcription system based on a Pixel 6 phone, whose neural network chip enables the Google Live Transcribe app to convert audio into text without any ability to reach the internet.
The Pixel 6 is connected to a combo charger + microphone adapter. The Linux server has a sound card with its output connected to the microphone adapter's input, mediated by a special attenuator audio cable.
The Linux server runs a web application that connects to a Mumble server, enqueues audio from the conversation, and plays it back one speaker at a time.
The Linux server is also attached to the Android phone via ADB and joins the phone's WiFi hotspot. It runs a "uiautomator" UI test which constantly polls the transcribed-text element of the Google Live Transcribe app and posts that text back to the server over WiFi.
The web application combines the record of who was talking when with the record of what text was displayed when, in order to display the live conversation on a web page, similar to a chat log (there's a rough sketch of that merge step at the end of this post).
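For anyone curious about the uiautomator part, here's a minimal sketch in Kotlin of what that polling loop looks like. It's illustrative only: the Live Transcribe package name, the resource id of the text view, and the server address/endpoint are all placeholders I made up, not the actual values from the mixtape.uiautomator repo linked below.

```kotlin
// Minimal sketch of an instrumentation test that scrapes Live Transcribe's text
// view and POSTs it to the linux server over the phone's hotspot network.
// Package name, resource id, and server URL are placeholders, not real values.
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.platform.app.InstrumentationRegistry
import androidx.test.uiautomator.By
import androidx.test.uiautomator.UiDevice
import org.junit.Test
import org.junit.runner.RunWith
import java.net.HttpURLConnection
import java.net.URL

@RunWith(AndroidJUnit4::class)
class LiveTranscribePoller {

    @Test
    fun pollTranscriptionForever() {
        val device = UiDevice.getInstance(InstrumentationRegistry.getInstrumentation())
        var lastText = ""

        while (true) {
            // Find the view that Live Transcribe renders the running transcript into.
            // "transcribe_text" is a guess; inspect the app with uiautomatorviewer
            // to find the real resource id.
            val transcript = device.findObject(
                By.res("com.example.livetranscribe", "transcribe_text")
            )
            val text = transcript?.text ?: ""

            // Only send when something changed, to keep hotspot traffic small.
            if (text.isNotEmpty() && text != lastText) {
                lastText = text
                post("http://192.168.43.100:8080/transcription", text)
            }
            Thread.sleep(250) // poll a few times per second
        }
    }

    // Fire-and-forget HTTP POST of the raw transcript text to the linux server,
    // which is reachable because it joined the phone's WiFi hotspot.
    private fun post(url: String, body: String) {
        val conn = URL(url).openConnection() as HttpURLConnection
        conn.requestMethod = "POST"
        conn.doOutput = true
        conn.outputStream.use { it.write(body.toByteArray()) }
        conn.responseCode // read the response code so the request actually completes
        conn.disconnect()
    }
}
```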
These are the repos involved:
https://git.sequentialread.com/forest/mixtape-mumble
https://git.sequentialread.com/forest/mixtape/src/branch/main/mixtape.uiautomator
Here's a link to the video on my server: https://picopublish.sequentialread.com/files/mixtape-demo5.mp4
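And here's roughly the idea behind the merge step mentioned in the alt-text, also sketched in Kotlin. The data shapes and the function are invented for illustration, not copied from the mixtape repo: since Live Transcribe shows one ever-growing transcript, each snapshot gets diffed against the previous one, and the new words get attributed to whichever speaker's queued audio was playing at that moment.

```kotlin
// Sketch of stitching "who was talking when" together with "what text appeared
// when" into chat-log lines. Data shapes are invented for illustration.
import java.time.Instant

data class SpeakerTurn(val speaker: String, val start: Instant, val end: Instant)
data class TranscriptSnapshot(val at: Instant, val fullText: String)

fun buildChatLog(
    turns: List<SpeakerTurn>,
    snapshots: List<TranscriptSnapshot>
): List<Pair<String, String>> {
    val log = mutableListOf<Pair<String, String>>()
    var previousText = ""

    for (snapshot in snapshots.sortedBy { it.at }) {
        // Live Transcribe displays the whole running transcript, so diff against
        // the previous snapshot to get only the newly appeared words.
        val newText = snapshot.fullText.removePrefix(previousText).trim()
        previousText = snapshot.fullText
        if (newText.isEmpty()) continue

        // Attribute the new words to whoever's queued audio was playing right then.
        val speaker = turns
            .firstOrNull { !snapshot.at.isBefore(it.start) && !snapshot.at.isAfter(it.end) }
            ?.speaker ?: "unknown"

        // Extend the last chat-log line if the same speaker is still talking,
        // otherwise start a new line.
        if (log.isNotEmpty() && log.last().first == speaker) {
            val (who, text) = log.removeAt(log.lastIndex)
            log.add(who to "$text $newText")
        } else {
            log.add(speaker to newText)
        }
    }
    return log
}
```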
it's kinda hard to hear, @fack. I had my phone sitting on my headphones while recording this and it didn't work out too well
I am a web technologist who is interested in supporting and building enjoyable ways for individuals, organizations, and communities to set up and maintain their own server infrastructure, including the hardware part.
I am currently working full time as an SRE 😫, but I am also heavily involved with Cyberia Computer Club and Layer Zero