
based on the samples, it really seems like anything smaller than 3B is pretty useless.


If you're doing a home lab voice assistant, 1B is nice, because on a 12 GB GPU you can run a moderately competent 7B LLM and two 1B models: one for speech-to-text and one for text-to-speech, plus something small for the wake word monitor. Maybe in a couple of years we can combine all this into a single ~8B model that runs efficiently on a 12 GB GPU. Nvidia doesn't seem very incentivized right now to sell consumer GPUs that can run all this on a single consumer-grade chip when they're making so much money selling commercial-grade 48 GB cards.
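A rough back-of-the-envelope sketch of why this fits in 12 GB, assuming common quantization levels (4-bit for the LLM, fp16 for the small models) and ignoring KV cache and activation overhead, which varies:

```python
# Rule of thumb: VRAM (GB) ~= params (billions) * bits per param / 8,
# since 1B params at 8-bit is roughly 1 GB of weights.
def model_vram_gb(params_billion: float, bits_per_param: int) -> float:
    return params_billion * bits_per_param / 8

llm = model_vram_gb(7, 4)    # 7B LLM quantized to 4-bit  -> 3.5 GB
stt = model_vram_gb(1, 16)   # 1B speech-to-text at fp16  -> 2.0 GB
tts = model_vram_gb(1, 16)   # 1B text-to-speech at fp16  -> 2.0 GB

total = llm + stt + tts
print(f"{total:.1f} GB of weights")  # 7.5 GB of weights
```

That leaves a few GB of headroom on a 12 GB card for KV cache, activations, and the wake word monitor.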


What about the activation word?

Shouldn't there be some hardware module available, similar to how Alexa, Siri and Google do it?

With a ring buffer detecting the word without recording everything?
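The ring-buffer idea above can be sketched in a few lines: keep only the last couple of seconds of audio, run a detector on each incoming frame, and hand the buffer to the speech-to-text model only when the wake word fires. The frame rate, buffer length, and `detect_wake_word` stand-in here are all assumptions for illustration; a real system would use a small keyword-spotting model on short audio frames.

```python
from collections import deque

FRAMES_PER_SECOND = 50   # assumed: e.g. 20 ms audio frames
BUFFER_SECONDS = 2       # assumed: how much audio to retain

class WakeWordBuffer:
    """Hold only the most recent audio; older frames are dropped
    automatically, so nothing is recorded until the wake word fires."""
    def __init__(self):
        self.frames = deque(maxlen=FRAMES_PER_SECOND * BUFFER_SECONDS)

    def push(self, frame):
        self.frames.append(frame)  # oldest frame falls off the end

    def snapshot(self):
        # Handed to speech-to-text only after a detection.
        return list(self.frames)

def detect_wake_word(frame) -> bool:
    # Stand-in for a real keyword-spotting model (hypothetical).
    return frame == "WAKE"

buf = WakeWordBuffer()
captured = None
for frame in ["noise"] * 200 + ["WAKE"]:
    buf.push(frame)
    if detect_wake_word(frame):
        captured = buf.snapshot()

# Only the last BUFFER_SECONDS of audio survive, 100 frames here.
print(len(captured))  # 100
```

The `maxlen` argument to `deque` is what makes this a ring buffer: appends past the limit silently evict the oldest frame, so the process never accumulates a full recording.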




