You can get a pretty good estimate depending on your memory bandwidth. Too many ...

		syntaxing 7 months ago \| parent \| context \| favorite \| on: Open models by OpenAI You can get a pretty good estimate depending on your memory bandwidth. Too many parameters can change with local models (quantization, fast attention, etc). But the new models are MoE so they’re gonna be pretty fast.