
based on the samples, it really seems like anything smaller than 3B is pretty useless.


If you're doing a home lab voice assistant, 1B is nice, because on a 12 GB GPU you can run a moderately competent 7B LLM and two 1B models: one for speech-to-text and one for text-to-speech, plus something small for the wake word monitor. Maybe in a couple of years we can combine all this into a single ~8B model that runs efficiently on a 12 GB GPU. Nvidia doesn't seem very incentivized right now to sell consumer GPUs that can run all this on a single consumer-grade chip when they're making so much money selling commercial-grade 48 GB cards.
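A rough back-of-the-envelope sketch of why this fits in 12 GB, assuming common quantization levels (4-bit for the LLM, fp16 for the small models) and ignoring KV cache and activation overhead, which varies:

```python
# Rule of thumb: VRAM (GB) ~= params (billions) * bits per param / 8,
# since 1B params at 8-bit is roughly 1 GB of weights.
def model_vram_gb(params_billion: float, bits_per_param: int) -> float:
    return params_billion * bits_per_param / 8

llm = model_vram_gb(7, 4)    # 7B LLM quantized to 4-bit  -> 3.5 GB
stt = model_vram_gb(1, 16)   # 1B speech-to-text at fp16  -> 2.0 GB
tts = model_vram_gb(1, 16)   # 1B text-to-speech at fp16  -> 2.0 GB

total = llm + stt + tts
print(f"{total:.1f} GB of weights")  # 7.5 GB of weights
```

That leaves a few GB of headroom on a 12 GB card for KV cache, activations, and the wake word monitor.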


What about the activation word?

Shouldn't there be some hardware module available, similar to how Alexa, Siri and Google do it?

With a ring buffer detecting the word without recording everything?
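The ring-buffer idea above can be sketched in a few lines: keep only the last couple of seconds of audio, run a detector on each incoming frame, and hand the buffer to the speech-to-text model only when the wake word fires. The frame rate, buffer length, and `detect_wake_word` stand-in here are all assumptions for illustration; a real system would use a small keyword-spotting model on short audio frames.

```python
from collections import deque

FRAMES_PER_SECOND = 50   # assumed: e.g. 20 ms audio frames
BUFFER_SECONDS = 2       # assumed: how much audio to retain

class WakeWordBuffer:
    """Hold only the most recent audio; older frames are dropped
    automatically, so nothing is recorded until the wake word fires."""
    def __init__(self):
        self.frames = deque(maxlen=FRAMES_PER_SECOND * BUFFER_SECONDS)

    def push(self, frame):
        self.frames.append(frame)  # oldest frame falls off the end

    def snapshot(self):
        # Handed to speech-to-text only after a detection.
        return list(self.frames)

def detect_wake_word(frame) -> bool:
    # Stand-in for a real keyword-spotting model (hypothetical).
    return frame == "WAKE"

buf = WakeWordBuffer()
captured = None
for frame in ["noise"] * 200 + ["WAKE"]:
    buf.push(frame)
    if detect_wake_word(frame):
        captured = buf.snapshot()

# Only the last BUFFER_SECONDS of audio survive, 100 frames here.
print(len(captured))  # 100
```

The `maxlen` argument to `deque` is what makes this a ring buffer: appends past the limit silently evict the oldest frame, so the process never accumulates a full recording.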




