> I get ~12 tps with 16k context

FWIW, with Ollama at its defaults, qwen3:30b-a3b has a 256k context size and does ~27 tokens/sec on pure CPU on a $450 mini PC with an AMD Ryzen 9 8945HS. Unless you need a room heater, that GPU isn't pulling its weight.
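If you want to reproduce that kind of measurement yourself, a rough sketch (assuming a stock Ollama install; `--verbose` makes the CLI print prompt/eval token rates after each response, and `num_ctx` is the Modelfile parameter that controls the context window):

```shell
# Pull the model and run a prompt with timing stats enabled;
# --verbose prints eval rate (tokens/s) after the response.
ollama pull qwen3:30b-a3b
ollama run qwen3:30b-a3b --verbose "Summarize this paragraph: ..."

# To pin a specific context size instead of the default, put
# these two lines in a Modelfile:
#   FROM qwen3:30b-a3b
#   PARAMETER num_ctx 16384
# then build and run the variant:
#   ollama create qwen3-16k -f Modelfile
#   ollama run qwen3-16k --verbose "..."
```

The reported "eval rate" is the generation speed; prompt processing is listed separately, so compare like with like when quoting tokens/sec.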


