
Compared to Zen4, Zen5 has much higher latency between cores on different chiplets. It's possible that a scheduler + app combo could regress on Zen5 for this reason. It sounds basically impossible for single-threaded apps to be impacted because timeslices are really big. Multithreaded apps where threads are communicating constantly could easily run slower if the scheduler sees all the cores as identical and interchangeable.

I don't know much about this topic, but it seems like Windows uses Processor Groups for scheduling[0], and generally tries to fit each NUMA node into 1 Processor Group (as long as it has at most 64 cores in it). Since the issue here is latency between chiplets and no NUMA is involved, all the cores go in the same Processor Group.

[0]: https://learn.microsoft.com/en-us/windows/win32/procthread/p...



This sounds like a trivial fix: put the two chiplets into separate processor groups, because that's effectively what they are.

This feature was originally about non-uniform memory access (NUMA), but effectively it is "core-to-socket" mapping. If a processor has chiplets on it, then it's effectively sockets-within-sockets. The software needs just a minor update to consider the chiplets to be the scheduling boundary instead of the AM5 socket.
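To make the "chiplet as scheduling boundary" idea concrete, here is a rough sketch in Python. It is purely illustrative: the core-to-chiplet layout is a made-up 16-core example, and `cores_by_chiplet` and `pick_cores` are hypothetical helpers, not anything Windows or Linux actually exposes.

```python
from collections import defaultdict


def cores_by_chiplet(core_to_chiplet):
    """Group logical core IDs by the chiplet they sit on."""
    groups = defaultdict(list)
    for core, chiplet in sorted(core_to_chiplet.items()):
        groups[chiplet].append(core)
    return dict(groups)


def pick_cores(core_to_chiplet, n_threads):
    """Pick cores for n_threads, filling one chiplet before spilling over."""
    groups = cores_by_chiplet(core_to_chiplet)
    chosen = []
    # Visit the largest chiplet first so small jobs stay on a single chiplet.
    for chiplet in sorted(groups, key=lambda c: (-len(groups[c]), c)):
        for core in groups[chiplet]:
            if len(chosen) == n_threads:
                return chosen
            chosen.append(core)
    return chosen


# Assumed layout: cores 0-7 on chiplet 0, cores 8-15 on chiplet 1.
layout = {core: core // 8 for core in range(16)}
```

With this layout, a 4-thread job lands entirely on chiplet 0, and only a job wider than 8 threads crosses the chiplet boundary. On Linux, the resulting list could be handed to `os.sched_setaffinity(0, cores)` to pin the current process.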


Windows 'processor groups' aren't at all similar to Linux NUMA-aware scheduling, which is the proper method regardless. There are 64 bits that represent all cores on the system; setting those bits defines which cores a process may run on. 'Processor groups' are a hack that keeps the same bitmask they originally used.
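For anyone who hasn't run into the 64-bit limit, a minimal sketch of why a single mask stops working past 64 cores. The helper names are hypothetical, and `group_affinity` assumes every group holds a full 64 cores, which real systems need not do. The (group, mask) pair is loosely modelled on the Win32 `GROUP_AFFINITY` structure.

```python
MASK_BITS = 64  # width of the classic affinity mask


def affinity_mask(cores):
    """Return a mask with one bit set per allowed core index."""
    mask = 0
    for core in cores:
        if not 0 <= core < MASK_BITS:
            raise ValueError(f"core {core} does not fit in a {MASK_BITS}-bit mask")
        mask |= 1 << core
    return mask


def group_affinity(core):
    """Map a system-wide core index to a (group, mask) pair.

    Assumption: groups are filled in order, 64 cores each.
    """
    group, bit = divmod(core, MASK_BITS)
    return group, 1 << bit
```

Cores 0 to 3 give a mask of `0xF`, but core 70 cannot be expressed at all in one mask. It only becomes addressable as bit 6 of group 1.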

Nowadays Windows can and will schedule across processor groups as per https://learn.microsoft.com/en-us/windows/win32/procthread/p...


Locking every process onto a random chiplet by default doesn't sound like a great plan either.


AFAIK it doesn’t lock them, it just preferentially co-schedules things into a socket.


My understanding was that a thread is only eligible to be scheduled in a single processor group at any given time, and that Windows will not change the group. Is that wrong?


That WAS correct. They changed that after realising that a process on a 96-core processor had fewer cores available to it than on a 64-core processor, since processor groups split cores evenly (96 cores become two groups of 48).
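The arithmetic behind that, as a small sketch. This is a simplified model of the even split described in this comment, not Windows' actual grouping algorithm (which also takes NUMA nodes into account), and `split_into_groups` is a hypothetical name.

```python
import math

MAX_GROUP_SIZE = 64  # a processor group holds at most 64 logical processors


def split_into_groups(n_cores):
    """Split n_cores into the fewest near-equal groups of at most 64 cores."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    n_groups = math.ceil(n_cores / MAX_GROUP_SIZE)
    base, extra = divmod(n_cores, n_groups)
    # Spread any remainder one core at a time over the first groups.
    return [base + (1 if i < extra else 0) for i in range(n_groups)]
```

Under this model, 64 cores fit in one group of 64, while 96 cores become two groups of 48, so a process confined to one group sees 48 cores instead of 64.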



