Demo Day at Layer Zero is starting soon! Just streaming some dashcam footage right now while we wait.
Also, it goes without saying, but it's ridiculous that if I want this capability, the only way to get it is to bootleg it like this.
Google finally achieved a technology that can transcribe spoken conversation, but they want to hoard it behind proprietary APIs and services. This bootleg is a small glimpse at what technology could be like if its goal was to provide utility instead of just making money: Accessibility technology that actually works!
Ok, ok, maybe that's a bit of a grandiose claim for something like this which is just barely demo-able and still has tons of difficult fundamental problems to overcome... But the point is, this capability would probably be commonplace already if it was open technology. The only reason this is remarkable is because I went through the effort to hack it together with DACs on both sides, special audio cables, Android UI testing libraries, and heaps of good old-fashioned software duct-tape. And despite all that, it still out-performs the "Open" state of the art like "OpenAI" Whisper, while using 1/10th of the energy.
Also, here is the massive alt-text I wrote for the video :P A thorough description of wtf is going on in this video:
Video of a multi-user audio transcription system based on a Pixel 6 phone with a neural network chip that enables the Google Live Transcribe app to convert audio into text without internet access.
The Pixel 6 is connected to a combo charger + microphone adapter. The Linux server has a sound card with its output connected to the microphone adapter's input, mediated by a special attenuator audio cable.
The Linux server is running a web application that connects to a Mumble server, enqueues audio from a conversation, and plays it back one-speaker-at-a-time.
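A rough sketch of the one-speaker-at-a-time idea, in Python. This is not the code from the repos, just an illustration under my own assumptions: audio arrives as interleaved (speaker, chunk) pairs, an utterance ends when the speaker stops transmitting, and finished utterances are played back in the order they began. The `SpeakerQueue` class and its method names are hypothetical.

```python
from collections import deque


class SpeakerQueue:
    """Hypothetical queue that untangles overlapping speakers into whole utterances."""

    def __init__(self):
        self._open = {}        # speaker -> utterance still being received
        self._order = deque()  # all pending utterances, oldest start first

    def push(self, speaker, chunk):
        """Store one chunk of audio, starting a new utterance if needed."""
        if speaker not in self._open:
            utterance = {"speaker": speaker, "chunks": [], "done": False}
            self._open[speaker] = utterance
            self._order.append(utterance)
        self._open[speaker]["chunks"].append(chunk)

    def end(self, speaker):
        """Mark the speaker's current utterance as finished."""
        utterance = self._open.pop(speaker, None)
        if utterance is not None:
            utterance["done"] = True

    def next_utterance(self):
        """Return (speaker, audio) for the oldest utterance if it is finished, else None."""
        if self._order and self._order[0]["done"]:
            utterance = self._order.popleft()
            return utterance["speaker"], b"".join(utterance["chunks"])
        return None
```

Waiting for the oldest utterance to finish keeps the phone from ever hearing two voices at once, at the cost of some extra latency when people talk over each other.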
The Linux server is also attached to the Android phone via ADB. The Linux server joins the Android phone's WiFi Hotspot. The Linux server is running a "uiautomator" UI test which constantly polls the transcribed text element of the Google Live Transcribe app and posts the text to the server via WiFi.
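For a feel of what polling a text element looks like, here is a minimal Python sketch. It swaps in a different technique from the one described above: instead of a uiautomator UI test, it uses `adb shell uiautomator dump` to write the screen's view hierarchy as XML and then reads the `text` attribute of one node. The resource id below is a made-up placeholder, since the real id of the Live Transcribe text element isn't given here.

```python
import subprocess
import xml.etree.ElementTree as ET

# Hypothetical resource id, for illustration only.
TRANSCRIPT_ID = "com.example.transcribe:id/transcript_text"
DUMP_PATH = "/sdcard/window_dump.xml"


def extract_text(hierarchy_xml, resource_id=TRANSCRIPT_ID):
    """Return the text of the first node with the given resource id, or None."""
    root = ET.fromstring(hierarchy_xml)
    for node in root.iter("node"):
        if node.get("resource-id") == resource_id:
            return node.get("text", "")
    return None


def dump_hierarchy():
    """Ask the attached phone for its current view hierarchy (needs adb on PATH)."""
    subprocess.run(
        ["adb", "shell", "uiautomator", "dump", DUMP_PATH],
        check=True,
        capture_output=True,
    )
    result = subprocess.run(
        ["adb", "shell", "cat", DUMP_PATH],
        check=True,
        capture_output=True,
    )
    return result.stdout  # raw XML bytes
```

Calling `extract_text(dump_hierarchy())` in a loop would give a crude poller. A full dump per poll is slow, which is one reason a UI test running against the device is likely the better fit for constant polling.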
The web application synthesizes the data of who was talking when, and what text was displayed when, in order to display the live conversation on a web page, similar to a chat log.
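The "who was talking when" plus "what text was displayed when" merge could look something like this Python sketch. It is a simplification under my own assumptions, not the actual implementation: each snapshot is the full transcript text at a timestamp, new text is whatever got appended since the previous snapshot, and it gets credited to whichever speaker's playback started most recently. The `Playback` type and `attribute_text` function are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Playback:
    speaker: str
    start: float  # seconds, when playback of this speaker's audio began


def attribute_text(playbacks, snapshots):
    """Turn (timestamp, full_text) snapshots into a list of (speaker, text) chat lines."""
    playbacks = sorted(playbacks, key=lambda p: p.start)
    starts = [p.start for p in playbacks]
    chat = []
    previous = ""
    for timestamp, text in sorted(snapshots):
        if text.startswith(previous):
            delta = text[len(previous):]  # only the newly appended part
        else:
            delta = text  # transcript was reset or revised; take it whole
        previous = text
        delta = delta.strip()
        if not delta:
            continue
        # Most recent playback that started at or before this snapshot.
        index = bisect_right(starts, timestamp) - 1
        if index < 0:
            continue  # text showed up before any playback started
        speaker = playbacks[index].speaker
        if chat and chat[-1][0] == speaker:
            chat[-1] = (speaker, chat[-1][1] + " " + delta)
        else:
            chat.append((speaker, delta))
    return chat
```

For example, with alice playing from 0s and bob from 5s, snapshots of "hello" at 1s, "hello there" at 2s and "hello there hi alice" at 6s come out as two chat lines, one per speaker. It ignores transcription lag and the way a live transcript rewrites earlier words, both of which a real version would have to deal with.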
These are the repos involved:
https://git.sequentialread.com/forest/mixtape-mumble
https://git.sequentialread.com/forest/mixtape/src/branch/main/mixtape.uiautomator
Here's a link to the video on my server: https://picopublish.sequentialread.com/files/mixtape-demo5.mp4
It's kinda hard to hear @fack, I had my phone sitting on my headphones while recording this and it didn't work out too well.
We will never accept an invitation to speak to Meta. We are not interested in speaking to Meta. We're not even on Meta's radar, but if they do for some godforsaken reason reach out to us, we will promptly tell Meta (more or less) to fuck off. That's the #GoToSocial promise, baby! Death to capitalism!
Analysis of Samsung FRP Bypass
https://blog-cyber.riskeco.com/en/analysis-of-samsung-frp-bypass/
"During this analysis, we learned that on Samsung devices, we can access a second USB configuration that opens a serial communication, allowing us to send AT commands. Moreover, some combinations of AT commands can lead to the activation of the USB Debugging before setting up the phone and thus without enabling developer options. Finally, using adb, it is possible to modify some settings in order to make the phone believe that the setup following a factory reset is done. This bypass the FRP since the FRP check is made at the end of the setup."
The analysis also comes with a git repo summarising it all into a nice Python script.
My favorite part of the web is that it makes things hackable. Stuff like userscripts or even being able to muck around in devtools makes it so much more empowering than native apps with their "take it or leave it" approach.
Mobile apps make me so mad because they give me no control over my device even with the prospect of "app permissions".
I am a web technologist who is interested in supporting and building enjoyable ways for individuals, organizations, and communities to set up and maintain their own server infrastructure, including the hardware part.
I am currently working full time as an SRE 😫, but I am also heavily involved with Cyberia Computer Club and Layer Zero.