
There’s a hit that comes from “frequency scaling” on AVX-512 and AVX2 instructions on Intel (worse for AVX-512), so fewer total instructions isn’t always worth it. IIRC, AMD doesn’t pay a cost for AVX2, but I don’t know how it works.


They pay a cost for it in the sense of having to clock lower if they run a lot of AVX code, but they advertise their AVX2 clock as their base clock and, more importantly, they can change clocks at a much finer granularity and only have to change after a latency period.

The problem with the AVX2/AVX-512 clocks on Intel is not the fact that they must clock lower to use them (for pure AVX code, running wider but at the lower clock speed is still worth it!), it's that they need to clock lower pre-emptively for any such instructions, and must remain at this lower clock for a while. This means that code that executes a few AVX instructions every now and then, mixed in with a lot of integer code, needs to run the whole program at a lower clock.
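To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It assumes the whole program runs at the reduced clock and that vectorized work shrinks by a fixed factor. The function names and the 0.8 clock ratio and 4x speedup are made-up illustrative values, not measurements of any real chip.

```python
def relative_runtime(vector_fraction, vector_speedup, clock_ratio):
    """Runtime relative to the same program run all-scalar at full clock.

    vector_fraction: share of the scalar cycles that gets vectorized (0..1)
    vector_speedup:  factor by which those cycles shrink when vectorized (> 1)
    clock_ratio:     reduced clock / full clock, applied to the WHOLE program
    All three inputs are hypothetical model parameters.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if vector_speedup <= 1.0 or not 0.0 < clock_ratio <= 1.0:
        raise ValueError("need vector_speedup > 1 and 0 < clock_ratio <= 1")
    # Cycles left after vectorizing, as a share of the original cycle count.
    cycles = (1.0 - vector_fraction) + vector_fraction / vector_speedup
    # Fewer cycles, but every remaining cycle takes longer at the lower clock.
    return cycles / clock_ratio


def break_even_fraction(vector_speedup, clock_ratio):
    """Smallest vectorized share at which the lower clock pays for itself.

    Solves (1 - p) + p / s = r for p, with s = vector_speedup, r = clock_ratio.
    """
    return (1.0 - clock_ratio) / (1.0 - 1.0 / vector_speedup)


if __name__ == "__main__":
    # Assumed values: vector code is 4x faster, clock drops to 80% of full.
    for p in (0.05, 0.30, 1.00):
        print(f"vectorized share {p:.2f}: {relative_runtime(p, 4.0, 0.8):.3f}x runtime")
    print(f"break-even share: {break_even_fraction(4.0, 0.8):.3f}")
```

In this toy model, pure AVX code comes out well ahead, while a program that vectorizes only a few percent of its work ends up slower than if it had never touched AVX at all.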

In contrast, AMD runs their chip power supply from a huge MIM capacitor (mimcap) built into the chip, meaning that they have enough margin to start executing exceptionally power-hungry instructions and only clock down reactively if it's actually needed. They also clock down and up at a much finer granularity, clocking back up immediately after the need to clock down passes.
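A toy simulation of the two policies, to make the difference concrete. This is only a sketch under loud assumptions: work is cut into equal-sized slices, the clock is either "full" or "reduced", and `hold_slices` is a hypothetical stand-in for how long the chip stays at the reduced clock after the last wide-vector slice. The 4.0 GHz and 3.2 GHz figures are invented, and the model ignores any speedup from the vector instructions themselves.

```python
def total_time_ms(trace, full_ghz, reduced_ghz, hold_slices,
                  cycles_per_slice=1_000_000):
    """Total wall time for a trace of work slices, in milliseconds.

    trace:       iterable of bools, True if the slice contains wide-vector work
    hold_slices: slices the clock stays reduced after the last vector slice
                 (0 models clocking back up immediately)
    Every parameter here is a hypothetical model input, not vendor data.
    """
    remaining_hold = 0
    total_ns = 0.0
    for has_vector_work in trace:
        if has_vector_work:
            reduced = True
            remaining_hold = hold_slices  # restart the hold window
        elif remaining_hold > 0:
            reduced = True
            remaining_hold -= 1           # still stuck at the low clock
        else:
            reduced = False
        ghz = reduced_ghz if reduced else full_ghz
        total_ns += cycles_per_slice / ghz  # cycles / (cycles per ns) = ns
    return total_ns / 1e6


if __name__ == "__main__":
    # Mostly integer code with one vector slice out of every ten.
    trace = [i % 10 == 0 for i in range(100)]
    sticky = total_time_ms(trace, 4.0, 3.2, hold_slices=9)
    reactive = total_time_ms(trace, 4.0, 3.2, hold_slices=0)
    print(f"long hold: {sticky:.3f} ms, immediate recovery: {reactive:.3f} ms")
```

With a hold window as long as the gap between vector slices, the whole trace runs at the reduced clock. With no hold, only the vector slices themselves pay the penalty.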


Ryzen also uses clock stretching, where single(?) clock cycles can be lengthened on demand (during power transients), which allows running stably at otherwise marginal clock speeds. I suspect that is part of the reason why Ryzens have so little overclocking headroom: there are only tiny safety margins in their clock frequency.


The rest of you folks make me feel dumb - but nice posts, I'm doing a lot of Googling now...



