
Fun stuff. It's amusing that the definition of a distributed system used, "Where a computer that you never heard of can bring your system down," is actually one of Leslie Lamport's more famous quotes.

When I joined Sun in '86 I thought it was the pinnacle of technological excellence to be a kernel programmer, and I joined the Systems Group, the notional center of the Sun universe, in 1987. However I discovered that the primary reason you had to be picky about kernel programmers was that their bogus pointer references crashed the machine (as they occurred in kernel mode with full privileges), but then I discovered that network programmers could crash the whole world with their bugs. So clearly they must be in a pantheon above kernel programmers. :-)

The author has come to discover that in the network world things can die anywhere, and this makes reasoning about such systems very complicated. Having been a part of the RPC and CORBA evolution I keenly felt the challenges of making APIs that "looked" like function calls to a programmer but took place across a network fabric and thus introduced error conditions that couldn't exist in locally called routines (the inability to return from the function due to a network partition, to take a simple example).
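
To make that concrete, here's a minimal sketch in Go (against a made-up quotes.example.com service, nothing to do with Sun RPC or CORBA specifically) of a call that reads like an ordinary function but can fail in ways no local routine ever could:

    // Minimal sketch: a "function call" whose failure modes only exist
    // because a network sits in the middle. The service address is made up.
    package main

    import (
        "context"
        "errors"
        "fmt"
        "net"
        "time"
    )

    func lookupRemote(ctx context.Context, symbol string) (float64, error) {
        var d net.Dialer
        conn, err := d.DialContext(ctx, "tcp", "quotes.example.com:7000") // hypothetical service
        if err != nil {
            // The call may never even have started on the other side.
            return 0, fmt.Errorf("remote call failed: %w", err)
        }
        defer conn.Close()
        // ... write the request, read the reply ...
        return 0, errors.New("reply lost: peer or network went away mid-call")
    }

    func main() {
        ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
        defer cancel()
        if _, err := lookupRemote(ctx, "ACME"); err != nil {
            // A local routine has no equivalent of "it may have executed
            // over there, but we'll never hear back".
            fmt.Println(err)
        }
    }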

Lamport's work in this space is brilliant and inspired. Network systems can be analysed and reasoned about as physical systems, even though they exhibit discontinuities when considered as simple algorithms. The value here is to realize that a large number of physical systems tolerate a tremendous amount of randomness and continue to work as intended (windmills for example) while many algorithms only work consistently given a set of key invariants.

I gave a talk inspired by Dr. Lamport's work titled 'Java as Newtonian Physics', which was a call to action to create a set of invariants, in the spirit of physical laws, that would govern the behavior and capabilities of distributed systems. It was way ahead of its time (AOL dialup connections were still a thing) but much of the same inspiration (presumably from Lamport) made it into the Google Spanner project.

As with many things, at a surface level many people learn an API which does something under the covers across the network, but having come up through their education thinking of everything as an API, they don't fundamentally grasp the notion of distributed computation. Then at some point in their experience there will be that 'ah ha' moment when suddenly everything they know is wrong, which really means they suddenly see a bigger picture of things. It makes distributed systems questions in interviews an excellent litmus test for understanding where people are in their journey.



I've never seen an RPC system that I really liked. The closest to a model of distributed computing that gets me from 'a' to 'b' without going terminally insane is anything based on message passing. Even though there is significant overhead, I figure that by the time you go distributed and the target of your RPC call or message lives on the other side of a barrier with unknown latency, that overhead is probably low compared to the penalties you'll be hit with anyway.

So then the trick becomes to make sure that a message contains a payload that is 'worth it'.

Making the assumption that any message may not make it to its destination and that confirmations may be lost (akin to your return example) is still challenging but I find it easier to reason about than in the RPC analogy.
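
As a toy sketch of that reasoning (in Go, with a lossy in-process channel standing in for the network, so purely illustrative): both the message and the confirmation can be dropped, the sender retries until it hears back, and the receiver therefore has to tolerate duplicates.

    // Toy sketch: at-least-once delivery over a lossy "link", where both the
    // message and the confirmation can be dropped. The channel is a stand-in
    // for the network; the shape of the reasoning is what matters.
    package main

    import (
        "fmt"
        "math/rand"
        "time"
    )

    type msg struct {
        id      int
        payload string
        ack     chan int // the confirmation path, which may also be "lost"
    }

    func lossyReceiver(in <-chan msg) {
        for m := range in {
            if rand.Intn(3) == 0 {
                continue // message dropped on the way in
            }
            // Process m.payload here (idempotently, since it may arrive twice).
            if rand.Intn(3) == 0 {
                continue // confirmation lost on the way back
            }
            m.ack <- m.id
        }
    }

    func sendReliably(out chan<- msg, id int, payload string) {
        ack := make(chan int, 1)
        for {
            out <- msg{id: id, payload: payload, ack: ack}
            select {
            case <-ack:
                return // confirmed
            case <-time.After(100 * time.Millisecond):
                // No confirmation: retry. We cannot tell whether the message
                // or the ack was lost, so the receiver must tolerate duplicates.
            }
        }
    }

    func main() {
        link := make(chan msg)
        go lossyReceiver(link)
        for i := 1; i <= 3; i++ {
            sendReliably(link, i, fmt.Sprintf("payload %d", i))
            fmt.Println("message", i, "confirmed")
        }
    }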

I love that Lamport quote :)

A nasty side effect of all this network business is that what looks like a function call can activate an immense cascade of work behind the scenes, gethostbyname (ok, getaddrinfo) is a nice example of such a function. On the surface it's a pretty easily understood affair but by the time you're done and you get your results back you've likely triggered millions of cycles on 'machines that you've never heard of'.
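
For instance, in Go the equivalent is a single net.LookupHost call; depending on the platform it goes through Go's own resolver or the system's getaddrinfo, and may consult /etc/hosts, local caches and remote DNS servers along the way:

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // One innocuous-looking line; behind it, potentially several
        // machines you've never heard of.
        addrs, err := net.LookupHost("www.example.com")
        if err != nil {
            fmt.Println("lookup failed:", err)
            return
        }
        fmt.Println(addrs)
    }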


"I've never seen an RPC system that I really liked."

I must admit I've never seen a message passing system that I really liked either :-) Mind you, that's possibly because of time spent making stuff work in environments where someone made the decision "you shall use message passing for all inter-system communication" even when it wasn't always the best option.

These days my practical test for a remote API is whether I can stand using it through cURL - if I can happily do stuff from the command line then the chances are that code to do stuff won't be too insane.


I liked QNX, and currently I'm playing around with Erlang. (Erlang has tons of warts but it gets enough of the moving parts just right that I find it interesting.)


One does not often hear about the warts of Erlang. Which ones would you name?


Recently I was talking with a guy doing CRDT research. His background was in something CPU-design related. I had always considered a CPU a Newton/Turing ideal machine, so I was surprised to learn that it feels more like a distributed system. Due to high clock frequencies, events that happen in one part of the CPU are unknown to other parts for quite a while, i.e. so many ticks later that those parts have to act semi-independently.


Hi, author here. :-) Thank you for the fun anecdote and kind words.

Hopefully we can collectively get better at addressing these problems.


Absolutely there is more fun to be had. I clearly remember that sort of "ah ha" moment when I figured out that data structures could be computation. That took me from a loop that could not operate fast enough on the data, to one where the data set had some precomputation done on it and the loop only had to 'finish' it for various conditions, and was plenty fast. Suddenly large vistas of "wow" open up. The posting from Julia's blog about how computers are really fast was the same sort of experience for her. Suddenly a new understanding, the world shifts, and now you have a whole bunch of new insight to throw at problems. We can't help but get better at addressing problems.
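
(Not the actual problem from that story, which isn't spelled out here, but prefix sums are a minimal illustration of the idea in Go: precompute once, and the query loop only has to 'finish' the work.)

    // "Data structures as computation": precompute prefix sums once, and a
    // range-sum query becomes a single subtraction instead of re-summing.
    package main

    import "fmt"

    func main() {
        data := []int{3, 1, 4, 1, 5, 9, 2, 6}

        // Precomputation: prefix[i] = sum of data[0..i-1].
        prefix := make([]int, len(data)+1)
        for i, v := range data {
            prefix[i+1] = prefix[i] + v
        }

        // Each query is now O(1) instead of a loop over the range.
        rangeSum := func(lo, hi int) int { return prefix[hi] - prefix[lo] }
        fmt.Println(rangeSum(2, 6)) // 4+1+5+9 = 19
    }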

I believe it was Leslie but it might have been Butler Lampson who mentioned you could stomp on a bunch of ants and the colony still worked fine. Ants are a great example of a durable distributed system that is robust in the face of massive amounts of damage. When you start thinking about computers like that it makes you realize you can build 100% uptime systems after all. The implementation of that property (individual machines are junk, collectively they are unstoppable) was done really well inside Google's infrastructure. They got to watch it in action when a colo facility they had clusters in caught fire.



