Environmental Audio As A New Machine Sensor

First-time poster on Reddit. I'm a mad home geek myself, and then I did something crazy. Now I'm trying to make everyone else crazy too. That's the AI world we find ourselves in.

I am pretty good with computers, always have been. I was retired lol… but got chatting with a friend who floated an idea about noise complaints: "Wouldn't it be good if you could have an app to simplify noise complaints, e.g. dogs or other disturbances?" The stars aligned. I had finished working for the Aussie Government on big data infrastructure, and I have a deep history in technical delivery across a rich set of business domains. I know cloud at the enterprise and the microscopic level. Claude Code and vibe coding had arrived. BOOM.

In 6 months, I solo vibe coded a government-security-grade, elastic cloud compute pool with a hyperscale pattern. I designed a multi-model classifier as the compute unit, hosting YAMNet, PANNs and BirdNET. I added a spatiotemporal annotation UX, with acquisition through spot recording on any device and upload of any old media. I then vibe-transformed the raw AudioSet classifications (YAMNet, PANNs) with a temporal relationship algorithm that enriches their meaning: it is not 5 dogs barking, it is one barking Event.
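To make the "barking Event" idea concrete, here is a minimal sketch of one way to fold per-window classifier hits into events. It is not the actual H-ear algorithm, which isn't described here. The `Detection` and `Event` types, the `group_into_events` function and the 2-second `max_gap` are all assumptions made up for illustration. The only idea it shows is merging same-label detections that sit close together in time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Detection:
    """One classifier hit on a short audio window (hypothetical shape)."""
    label: str
    start: float  # seconds from the start of the recording
    end: float
    score: float


@dataclass
class Event:
    """A run of same-label detections merged into one occurrence."""
    label: str
    start: float
    end: float
    count: int
    peak_score: float


def group_into_events(detections: List[Detection], max_gap: float = 2.0) -> List[Event]:
    """Merge same-label detections separated by at most max_gap seconds.

    max_gap is an assumed tuning knob, not a value taken from the post.
    """
    events: List[Event] = []
    latest: Dict[str, Event] = {}  # most recent event seen for each label
    for d in sorted(detections, key=lambda d: d.start):
        ev = latest.get(d.label)
        if ev is not None and d.start - ev.end <= max_gap:
            # Close enough in time: extend the existing event.
            ev.end = max(ev.end, d.end)
            ev.count += 1
            ev.peak_score = max(ev.peak_score, d.score)
        else:
            # Too far apart (or first of its label): start a new event.
            ev = Event(d.label, d.start, d.end, 1, d.score)
            latest[d.label] = ev
            events.append(ev)
    return events


if __name__ == "__main__":
    barks = [Detection("Bark", t, t + 1.0, 0.9) for t in (0.0, 1.5, 3.0, 4.5, 6.0)]
    for event in group_into_events(barks):
        print(event)
```

Five "Bark" hits half a second apart come out as a single event with `count=5`, and a bark half a minute later would start a new one.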

I vibed everything from the cost-per-minute layer that cloud providers offer for compute units, up through the API, and into the MCP layer, aka AI. Talk to integrate via Claude, OpenClaw, etc. Talk to classify, and get near-real-time webhook notifications on 500+ untapped (AudioSet) environmental signals from the ubiquitous microphone.

A new machine sensor? The pervasiveness of microphones, coupled with ridiculously cheap transforms and AI you can simply talk to for integration.

A new semantic compressor? A 100:1 or greater reduction in file size compared with the audio, yet full of semantic weight. Perfectly sized AI brain food.
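A rough back-of-envelope for the size claim, as a sketch only. It compares one minute of uncompressed 16 kHz, 16-bit mono PCM (the input rate YAMNet expects) with a small JSON list of events. The event records and their field names are invented for illustration, and the baseline is raw PCM rather than whatever compressed format the 100:1 figure assumes, so the printed ratio is not a reproduction of that figure.

```python
import json

SAMPLE_RATE_HZ = 16_000   # mono sample rate YAMNet expects
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CLIP_SECONDS = 60


def raw_pcm_bytes(seconds: int,
                  sample_rate: int = SAMPLE_RATE_HZ,
                  bytes_per_sample: int = BYTES_PER_SAMPLE) -> int:
    """Size in bytes of uncompressed mono PCM audio of the given length."""
    return seconds * sample_rate * bytes_per_sample


# Hypothetical event records for the same minute of audio (illustrative only).
events = [
    {"label": "Bark", "start": 0.0, "end": 7.0, "count": 5},
    {"label": "Speech", "start": 2.0, "end": 3.0, "count": 1},
]

audio_bytes = raw_pcm_bytes(CLIP_SECONDS)
event_bytes = len(json.dumps(events).encode("utf-8"))
ratio = audio_bytes / event_bytes

print(f"{audio_bytes} audio bytes vs {event_bytes} event bytes, roughly {ratio:.0f}:1")
```

The exact ratio depends entirely on how busy the minute is and how the audio is encoded; the point is only that the event list stays tiny while the audio grows linearly with time.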

How I vibed it is also pretty cool. I do know my tech stuff. But one person semantically programming what I would consider the best DevSecOps SDLC chain I have ever seen… that is a story too.

Hopefully you find this as cool as I do. Hopefully everyone can see what I see.

When my webcam grew ears...

Agentic-empowered, spatiotemporally annotated environmental audio

https://www.h-ear.world/how-it-works or https://www.h-ear.world/use-cases.

https://github.com/Badajoz95

Super interested in people's thoughts. It'll make a nice change from an AI prompt.

submitted by /u/Desperate_Chair_3252