Tuesday, August 18, 2009

Welcome EHCache Community

I'm excited to be welcoming Greg Luck and the EHCache community to the Terracotta family. EHCache is an extremely useful/usable product and nearly ubiquitous in the caching space. Greg has spent years solving the important real world problems associated with building highly performant applications. The Terracotta Dev team is very much looking forward to helping accelerate EHCache's development as well as provide the best possible integration with the Terracotta product family.

EHCache will remain under the Apache 2 license and we have created the beginnings of a new website at www.ehcache.org. Greg will continue to drive EHCache's vision and direction, as well as being highly involved in it's development. He will also be instrumental in helping Terracotta to define and build out our caching strategy as a whole. His vision, as well as the EHCache community's help are essential in allowing us to together take these products to the next level.

We see a great future of product offerings for your desktop app, on your servers and in your cloud solving the scale/performance problems of today, tomorrow and beyond.

Wednesday, August 12, 2009

Distributed Data Structures: ConcurrentDistributedMap

Concurrent Distributed Data Structures?

Many challenges exist when developing a high scale multi-node application. Our goal at Terracotta is to take on those challenges in ways that remove them from the plate of those architecting and developing applications and place them squarely on our shoulders.

In order to accomplish such a lofty goal we first had to create some core pieces of infrastructure on which many higher order abstractions could be built. One such "piece" is our ConcurrentDistributedMap. This data structure is a fundemental piece of our Distributed Cache, our Hibernate product and our Web Sessions product and is also available for use in custom solutions for those using Terracotta as a platform.

Challenges and Tradeoffs

Developing a data structure that is Distributed as well as Concurrent and Coherent has very different trade-offs from developing for a single JVM. If one took a standard concurrent data structure like ConcurrentHashMap and just clustered it "as is" one would likely run into performance and memory efficiency issues. Even a really cool concurrent data structure like Cliff Click's Non Blocking Hash Map would not do well if one used the algorithms without thought in a coherent cluster.

The challenge is that the trade-offs change when you add the latency of a network and data locality in the middle of the game. In normal concurrent data structures you care about:

- How long you hold locks
- How much is locked while you hold it.
- CPU usage
- Memory Usage and Object creation

In the clustered case you add the following:

Lock locality - Is the lock you need already held on the local machine or do you need to go get it over the network. If you need to go get it how long does that take. While a little of the question of "How long does it take to get the lock" exists on a multi-cpu single machine it's not nearly to the same degree.

Data locality - Is the data I need already local or do I need to go get it. If I need to get it how long does that take

Data change rate - How much clustered data am I changing and how long does it take to send it around? Also, do I send it around?

Data size - In a clustered world one often uses data structures that don't fit entirely in a single node. One has to take pains to control the size and amount of the data in each JVM for efficiency.

There are other implementation specific/point in time issues like number of locks and their cost but those can mostly be optimized away at the platform level.

Single JVM ConcurrentHashMap

ConcurrentHashMap adds concurrency by collecting groups of entries into segments. Those segments are grouped together both from a lock perspective, they share a lock, and from a physical space perspective, all entries in a segment are generally in one collection. In a single JVM the only risk of sharing a lock between the entries is that one can contend on the in-memory speed look-ups. This is a very effective way to handle large numbers of threads making highly contended gets and puts to the map. If one runs into contention with this kind of data structures one can just up the number of segments in the Map.

Concurrent Map In A Clustered World

In a clustered world problems occur with a data structure like this. First, getting a lock or an object can be either in-memory speed or take many times in-memory speed depending on whether it has recently been accessed locally. In some cases this is no problem and in some cases it's pretty bad. It's also a space issue. If a segment is brought in as a whole and it's entries are in that segment strictly because of it's hashCode then the natural partitioning of the app's usage won't help save space by only loading the entries needed locally. Instead it will load the needed objects and anything else in it's segments. This elimenates the benefits of any natural or forced locality that occurs in a multi-node application.

Use-Case Analysis

In order to highlight some of the pro's and con's of CHM (ConcurrentHashMap) I'm going to vet it against a few use-cases.

Use-case 1 - An 8 node app sharing a clustered ConcurrentHashMap

All the data in the map is read only and it's used in all nodes evenly and the data fits entirely in a single JVM's heap.

GOOD NEWS! you will be fine with a regular clustered ConcurrentHashMap. Lets look at why.

1) All data will be loaded everywhere so unnecessary faulting (the act of pulling a data item into a node) won't be happening
2) All locks will be read locks and will be local everywhere so your latency will be nice and low (Due to greedy locks)
3) Won't have contention on the segments because reads are pretty much concurrent

Use-case 2 - The same as use-case 1 but now the map data is bigger than memory and you have a sticky load balancer.

Some good and some bad:

1) Since data is batched into segments by hash code and your load balancer hashes on something completely different than your map hashes on you will end up loading data into each node that is not needed. This is a result of the ConcurrentHashMap segmenting strategy.

2) Locks will still be fine because it's all read and read locks are very concurrent so segment contention won't be an issue.

So the memory manager may be doing unnecessary work and whether you will be in trouble depends on how big the ConcurrentHashMap is

Use-case 3 - Same as use-case 2 with the exception that now we are doing 50 percent writes. Something similar to caching conversations.

1) Still have the above problem of loading unneeded batches
2) But now, due to the writes, you are also maintaining the state of the objects that have unnecessarily poor locality in all the nodes where they don't belong.
3) Now you have a locking problem. While writing an entry to a segment you are blocking people in other nodes from reading or writing to that segment adding some serious latency. Plus the locks are getting pulled around to different nodes because even though your load balancer provides locality it is on a different dimension that of the internals of the map and is therefore not helpful.

Reviewing the problems highlighted by use case 3:

- Lock hopping leading to slow lock retrieval
- Lock contention due to grouping of multiple unrelated entries with locks.
- Faulting and Memory wasting due to unfortunate segmenting of data
- Broadcasting of changes or invalidations to nodes that shouldn't care

What did we do?

We built a specialty highly concurrent map tuned for distribution and the above challenges call ConcurrentDistributedMap.

Instead of breaking things down into segments for locking we lock on the individual keys in the map. This gives the same correctness guarantees while giving the maximum concurrency. This drastically reduces lock hopping and contention and provides in-memory lock speeds most of the time.

The segments go away completely. Key Value pairs are managed on an individual basis so no unnecessary faulting occurs.

Broadcasting and invalidation:
The above, plus an efficient memory manager means that values are only faulted into nodes where they are used. Since those values aren't in all nodes anymore invalidation and or broadcasting of changes for those entries is no longer needed.

This data structure takes excellent advantage of any natural partitioning that may occur at the application level.


Building a fast, coherent, concurrent, distributed data-structure requires thinking about an extended set of concerns. However, if one pays attention to the issues it is possible to create a highly useful solution. To learn more check out the ConcurrentDistributedMap described above.

Additional Reading:

For more information on Terracotta's distributed data structures one can always look here: