Monday, September 10, 2007

3.5 Rules in Distributed Algorithm Design

In highly concurrent distributed computing their are all kinds of algorithms one can learn. One can read books like Concurrent and Distributed Programming in Java and Distributed Systems: Principles and Paradigms. One can learn how to write scalable servers by reading things like these papers on SEDA. But when creating a distributed algorithm their are 3 simple Goals/Rules you need to follow. While these rules were developed with the Terracotta approach in mind they are mostly applicable to any distributed computing approach:

  • Use algorithms that allow for maximum concurrency
    • Seems obvious but design your algorithm to allow maximum concurrency. In a single JVM concurrency is important. In a distributed environment where lock acquire and release is almost certainly more expensive it is that much more important
    • Use read locks where possible. If you can have multiple readers look at something, duh, don't stop them.
    • Try strategies like striping, lock on fine grained objects etc (this is some of what concurrent hash maps do).
  • Minimize chatter
    • Many really cool concurrent algorithms are less good for distributed computing because of the amount of chatter (data that needs to be communicated between nodes).
    • Algorithms that require too much cross node book keeping are a problem. Networks are slow and relatively thin pipes. Chattiness plays into that weakness and can also eat cpu.
  • Take advantage of locality of reference
    • Use algorithms that partition well
    • Use algorithms that can mostly or entirely act on local data.
    • Scale-out architectures don't have all the data everywhere. They rely on various levels of caching both near and far for optimal performance. When hitting data try to hit data in the same node where possible.
Rule 3.5 exists as well. Remember not to guess at the performance of something. Test it, time it and automate as much as possible of those tests and timings.

This has been a public service announcement.

1 comment:

Jon Harrop said...

I'd like to see objective and quantitative comparisons of the trade-offs involved in the different kinds of concurrency provided by languages like Java, OCaml, Erlang and F#.