
Lock-free is not the same as wait-free.

> Web application servers will often hit the wall on something like a shared cache or the session-state store. Unless 100% of the shared data uses efficient lock-free algorithms, then given enough threads eventually some mutex somewhere will be the limit.

Without strong LL/SC (which exists nowhere at useful block sizes), it's difficult if not impossible to implement complex data structures, particularly compound data structures (lockless doesn't compose well), that can't block somewhere. Many sophisticated lock-free and even nominally wait-free data structures implicitly rely on dynamic memory allocation and complex garbage collection strategies, which just hides the ball.

If you have shared resource contention, usually something is going to block somewhere. And the more you scale up, the more your optimization hacks to minimize real-world blocking break down; hence the long history of multithreaded software hitting limits at 2, 4, 8, 16, 32, 64, 128, etc. CPUs.

The way around this is to architect your application at a higher level to avoid or mitigate resource contention. How you do that is extremely context dependent, though "share memory by communicating" is a rule of thumb intended to steer you in a less failure-prone direction while keeping your options open. If you go in planning to just drop lockless implementations of basic data structures everywhere, you're going to fail hard, often because the hit to pipelining, etc., from using atomic operations in your critical path imposes huge costs upfront.
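One way to read "share memory by communicating" in Java terms is a sketch like the following (my own illustration, not from the comment): a single owner thread holds the mutable map, and other threads send it requests over a queue instead of touching the map directly, so no lock on the data structure itself is ever contended.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;

public class OwnerThreadMap {
    // A request carries a key plus a future the owner completes with the reply.
    record Get(String key, CompletableFuture<Integer> reply) {}

    private final BlockingQueue<Get> inbox = new ArrayBlockingQueue<>(1024);
    private final Map<String, Integer> data = new HashMap<>(); // touched by the owner thread only

    public OwnerThreadMap() {
        Thread owner = new Thread(() -> {
            data.put("answer", 42); // owner initializes its private state
            try {
                while (true) {
                    Get req = inbox.take(); // blocks on the queue, not on a data-structure mutex
                    req.reply().complete(data.get(req.key()));
                }
            } catch (InterruptedException ignored) {
                // shut down quietly
            }
        });
        owner.setDaemon(true);
        owner.start();
    }

    public Integer get(String key) throws InterruptedException {
        CompletableFuture<Integer> reply = new CompletableFuture<>();
        inbox.put(new Get(key, reply)); // communicate instead of sharing
        return reply.join();
    }
}
```

Contention moves from "every thread fights over the map's internals" to "threads enqueue messages," which is a far easier pattern to reason about and scale, at the cost of per-request queue traffic.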



> lockless doesn't compose well

Are you familiar with any resources on this? I've noticed this when interviewing candidates and asking follow-ups about how to make an LRU cache thread-safe. Many, many candidates reach for converting their HashMap into a ConcurrentHashMap, which effectively buys them nothing, for the reason you raised.

Seems like a missing abstraction that makes it hard to compose these structures, and therefore to construct complex ones. Another thing for the research list, regardless. Or maybe it is naturally the case that compositions of lockless structures don't scale well?

I still need to read GP's article from 1024cores, which seems to potentially get into this as well.


I'm not an expert in this area, I've just read enough literature on software transactional memory to understand some of the fundamental limitations of modern hardware architectures.

This is one of the few areas where a subscription to the ACM Digital Archive is priceless. It's been several years since I went down the rabbit hole, and many new data structures have since been published. If there's a decent self-contained solution, there's a good chance it's been published there. Any particular paper is likely floating around on the Internet, but efficiently sifting through the literature benefits from friction-free access to all the citations.


If you have a DOI, sci-hub can generally find it for you. I consider paywalls mostly a solved problem at this point.


Another interesting article here is "MCS locks and qspinlocks": https://lwn.net/Articles/590243/


Can you elaborate on the HashMap? I have used java.util.concurrent.ConcurrentHashMap for caches on machines with up to 16 cores, and I am impressed by the implementation every time I use it: easy to use, and the performance is excellent. I've never had a chance to use it on a 128-core machine, and maybe my experience is limited, but I think that for the majority of use cases ConcurrentHashMap is a solid choice for caches.


Unless I'm misunderstanding terribly, the issue is that an LRU cache is composed of both an ordered list of cache entries and the map of the data itself. Converting just the map to a thread-safe implementation doesn't actually fix the thread-safety issues.
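To make that concrete, here's a deliberately broken sketch (my own illustration): the map is thread-safe on its own, but the recency list is not, and nothing makes the pair of updates atomic.

```java
import java.util.LinkedList;
import java.util.concurrent.ConcurrentHashMap;

public class RacyLruCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final LinkedList<K> recency = new LinkedList<>(); // NOT thread-safe
    private final int capacity;

    public RacyLruCache(int capacity) { this.capacity = capacity; }

    public void put(K key, V value) {
        map.put(key, value);     // safe on its own
        recency.remove(key);     // but another thread can interleave between
        recency.addFirst(key);   // these steps and corrupt the list
        if (map.size() > capacity) {
            K evicted = recency.removeLast(); // may evict the wrong key, or throw
            map.remove(evicted);
        }
    }

    public V get(K key) {
        V v = map.get(key);
        if (v != null) {
            recency.remove(key);   // same problem: the map read and the
            recency.addFirst(key); // list update are not one atomic operation
        }
        return v;
    }
}
```

Fixing it means either one lock guarding both structures (which is essentially what a synchronized access-ordered LinkedHashMap gives you) or a far more intricate concurrent design, which is the composition problem in miniature.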


> Seems like a missing abstraction that makes it hard to compose these structures, and therefore construct complex structures. Another thing to the research list, irrespective. Or maybe it is naturally the case that the implementations of composition for lockless doesn't scale well?

Would transactional memory make lock-free algorithms compose better, or just race/conflict more?


The vast majority of STMs use lock-based transaction management. There are lock-free designs, but these tend to perform worse because they generate a lot of memory bus traffic. Unfortunately, lock-free algorithms can easily perform worse than their lock-based counterparts. There are often simpler alternatives that avoid lock contention on hot paths (like map reads) while still taking advantage of locking where contention is tolerable (like map writes).
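One such simpler alternative, sketched below under the assumption that reads vastly outnumber writes: readers see an immutable snapshot through a volatile reference and never lock, while writers serialize on one lock and publish a fresh copy (the same idea as CopyOnWriteArrayList, applied to a map).

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotMap<K, V> {
    private volatile Map<K, V> snapshot = Map.of(); // immutable, safely published
    private final Object writeLock = new Object();

    public V get(K key) {
        return snapshot.get(key); // hot path: a volatile read, no lock
    }

    public void put(K key, V value) {
        synchronized (writeLock) { // writers contend only with each other
            Map<K, V> next = new HashMap<>(snapshot);
            next.put(key, value);
            snapshot = Map.copyOf(next); // publish a new immutable snapshot
        }
    }
}
```

Writes are O(n) copies, so this only pays off for read-heavy workloads, but the read side is as contention-free as it gets without any lock-free cleverness.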



