Environmental Audio As A New Machine Sensor
First time poster on Reddit. I am a mad home geek myself, and then I did something crazy. Now I'm trying to make everyone else crazy too. That's the AI world we find ourselves in.
I am pretty good with computers - always have been. I was retired lol… but got chatting with a friend who mentioned an idea about noise complaints: "Wouldn't it be good if you could have an app to simplify noise complaints, e.g. dogs or other disturbances?" The stars aligned. I had finished working for the Aussie Government on big data infrastructure and have a deep history in technical delivery across a rich set of business domains. I know cloud at the enterprise and the microscopic level. Claude Code and vibe coding have arrived. BOOM.
In 6 months, I solo vibe coded a government-security-grade, elastic cloud compute pool with a hyperscale pattern. I designed a multi-model classifier as the compute unit, to host YAMNet, PANNs and BirdNET. I added a spatiotemporal annotation UX, with acquisition through spot recording on any device and upload of any old media. I then vibe-transformed the raw AudioSet classifications (YAMNet, PANNs) with a temporal relationship algorithm that enriches the classification meaning: that is, it is not 5 dogs barking, it is a barking Event.
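To make the "5 barks become one barking Event" idea concrete, here is a minimal sketch of one way temporal grouping could work. It is not the actual algorithm behind the project, which the post doesn't describe. The `Detection` and `Event` types, the `group_into_events` function and the `max_gap` threshold are all hypothetical names made up for illustration. The assumption is that the classifier emits per-window labels with start and end times, and that same-label detections separated by a short gap belong to one event.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One classifier hit on a short audio window (hypothetical shape)."""
    label: str
    start: float  # seconds from the start of the recording
    end: float
    score: float


@dataclass
class Event:
    """A run of same-label detections merged into one occurrence."""
    label: str
    start: float
    end: float
    count: int
    peak_score: float


def group_into_events(detections, max_gap=2.0):
    """Merge same-label detections whose gap is at most max_gap seconds.

    max_gap is an assumed tuning knob, not a value taken from the post.
    """
    events = []
    latest = {}  # label -> most recent Event for that label
    for d in sorted(detections, key=lambda d: d.start):
        current = latest.get(d.label)
        if current is not None and d.start - current.end <= max_gap:
            # Close enough in time: extend the existing event.
            current.end = max(current.end, d.end)
            current.count += 1
            current.peak_score = max(current.peak_score, d.score)
        else:
            # Too far from the last one (or first of its kind): new event.
            current = Event(d.label, d.start, d.end, 1, d.score)
            latest[d.label] = current
            events.append(current)
    return events


if __name__ == "__main__":
    barks = [Detection("Bark", t, t + 1.0, 0.9) for t in (0.0, 1.5, 3.0, 4.5, 6.0)]
    for event in group_into_events(barks):
        print(event)
```

With the five sample barks above, each half a second apart, the function returns a single `Event` covering the whole run with `count=5`. A bark much later in the recording would start a new event.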
I vibed from the cost-per-minute layer that cloud offers for compute units, up to the API, and through into the MCP layer, aka AI. Talk to integrate via Claude, OpenClaw, etc. Talk to classify, and get near real-time webhook notifications on 500+ untapped (AudioSet) environmental signals from the ubiquitous microphone.
A new machine sensor? The pervasiveness of microphones, coupled with ridiculously cheap transformation and AI you can simply talk to for integration.
A new semantic compressor? A 100:1 or greater reduction in audio file size, yet still full of semantic weight. Perfectly sized AI brain food.
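For anyone who wants to sanity-check the size reduction on their own clips, here is a rough sketch of how the ratio could be measured. It does not reproduce the project's numbers. It assumes uncompressed mono PCM at 16 kHz and 16-bit as the raw side and a compact JSON event list as the semantic side. The function names and the event fields are hypothetical.

```python
import json


def pcm_size_bytes(duration_s, sample_rate=16_000, bit_depth=16, channels=1):
    """Size of uncompressed PCM audio (assumed format, not from the post)."""
    return int(duration_s * sample_rate * (bit_depth // 8) * channels)


def semantic_size_bytes(events):
    """Size of the event list serialized as compact UTF-8 JSON."""
    return len(json.dumps(events, separators=(",", ":")).encode("utf-8"))


def reduction_ratio(duration_s, events, **pcm_kwargs):
    """Raw audio bytes divided by semantic bytes."""
    return pcm_size_bytes(duration_s, **pcm_kwargs) / semantic_size_bytes(events)


if __name__ == "__main__":
    # Hypothetical event record for a one-minute clip.
    events = [{"label": "Bark", "start": 0.0, "end": 7.0, "count": 5}]
    print(f"raw bytes:      {pcm_size_bytes(60)}")
    print(f"semantic bytes: {semantic_size_bytes(events)}")
    print(f"ratio:          {reduction_ratio(60, events):.0f}:1")
```

The ratio depends heavily on the baseline. Against compressed audio such as MP3 or Opus it will be far smaller than against raw PCM, and a busy soundscape with many events shrinks it further.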
How I vibed it is also pretty cool. I do know my tech stuff. But one person semantically programming what I would consider the best DevSecOps SDLC chain I have ever seen… that is a story too.
Hopefully you find this as cool as I do. Hopefully everyone can see what I see.
Agentic-empowered, spatiotemporally annotated environmental audio:
https://www.h-ear.world/how-it-works or https://www.h-ear.world/use-cases.
Super interested in people's thoughts. It'll make a nice change from an AI prompt.