Sunday, September 25, 2011

New Home...

Hey readers. Just a heads up that as of today I'm doing my posting over at http://scaleaholic.blogspot.com. Check it!

Monday, August 01, 2011

What Is Terracotta?

One of the biggest challenges in software is telling people what your software is in a way that helps them make decisions about it. This challenge can often be as difficult as designing and building the software itself. The Terracotta team recently hooked up with the gang from Epipheo Studios to create a two-minute video to do just that. We spent a lot of time thinking and refining in order to get a clear, succinct message. I think it came out quite well.


Check it out:


Monday, July 11, 2011

Easy Java Performance Tuning: Manage Objects By Size On The Java Runtime Heap


The Problem?

For those of you who use caching with things like Hibernate, Spring or anything else, you know what a pain performance tuning can be. The tools you have available for tuning, resource management and avoiding OOMEs boil down to counts and age controls. These are not really resource management controls at all. They are data freshness controls and should be treated as such.

How do you even begin to figure out how many entries to allow in each Hibernate cache when you have a hundred of them? What if things change and objects get bigger or smaller? What if you change your heap size and/or usage patterns?

You've Gotta Try This...

With Ehcache 2.5 Beta1 you can just specify a percentage of heap, or a heap size in bytes, that your graphs of Java objects are allowed to use. This is accomplished by passing a simple size description into a Cache or CacheManager. All objects held by the cache will borrow their space from the cache's or manager's specified pool. Entries get evicted from the cache as space runs low, without any intervention from the developer. This can be done at the cache tier level whether it's on heap, disk or BigMemory. It's another way to reduce tuning and improve performance while avoiding OOMEs. Learn more here.




Whether you do it in config or in code, it's just a one-line change:
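Here's a minimal sketch of what that looks like in ehcache.xml, based on the 2.5 beta docs (the cache names and sizes are made up for illustration):

    <!-- Manager-level pool: caching as a whole may use 30% of the JVM heap. -->
    <ehcache maxBytesLocalHeap="30%">
      <!-- No explicit size: this cache borrows from the manager's pool. -->
      <cache name="userCache"/>
      <!-- This cache claims its own fixed slice instead. -->
      <cache name="pageCache" maxBytesLocalHeap="64M"/>
    </ehcache>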



A Few More Details...

It works on any 1.5 or 1.6 JVM (tested on Oracle, JRockit and IBM JVMs) and doesn't require any object serialization for the on-heap management.

Learn More:


Try it out and give us lots of feedback on the Terracotta Forums:


Wednesday, April 20, 2011

Local Caching++

Ok, so you've built your application in Java. You've used all the usual tools. Tomcat, Spring, Ehcache, Quartz, etc. Or maybe you went the JRuby, Grails or Scala route. You test your new application or hand it off to run in production and it's too slow. This is just a single node application at this point. It services 20-100 users. It's churning and burning the database, creating and recreating the same Web Pages, Users and other relevant data. You want to start caching locally to solve your latency and throughput problems. Then, upon a more detailed look at your application, you get scared.

You find your application has:
  • 40 DB tables in Hibernate that can be cached
  • A web cache
  • A user session cache
Then it hits you. Caching is easy but cache tuning is hard.

What Makes Cache Tuning Hard?

In conversations with hundreds of cache users, a small handful of difficult-to-work-through challenges come up when applying caching to an application:

  • Hibernate/Lots of caches - When using Hibernate you often end up with as many as 100 tables in your DB. How do you balance a fixed amount of resources (heap/BigMemory) across 100 caches?
  • Indirect knobs/Bytes vs Count/TTL - In local Java caching the control points are almost always measured in number of entries and time to live. But wait a minute! When I start the JVM I don't say how many objects the heap can hold and for how long. I say how many bytes of memory the heap can use.
  • Who Tunes and When? - At some companies the desire is to have the "Application Administrator" do the tuning. At others it's the "Developer." They have different understandings of the application. The developer can tune by knowledge of the application. The app admin can only tune based on what's happening when the application is running.
These are the challenges we are working to solve in the next version of Ehcache. While it's early days on the dev side we would love feedback on our approaches. You can get a sense of how it's going to work from the doc on ehcache.org.

What We Are Building

Greg, the dev team and I spent a bunch of time pondering the above problems over the last few years. We felt that with two key improvements to how people tune we could address most of the above (and a few more items hit the rest):

  • Tune from the top - Define max resource usage for the whole cache manager and then optionally define it for the individual caches underneath it as needed. So if you have a hundred caches you can start with, "Give these 100 caches N amount of Heap/OffHeap." Then monitor and see if any specific caches need special attention.
  • Tune the constrained resource, Bytes - TTL is a cache freshness concern, not a resource management concern. Max entry count does not directly map to available heap resources. So we are adding "bytes"-based tuning (see the sketch after this list). This eliminates the mistake-prone process of trying to control resources by TTL/TTI/count and hoping you get it right. Instead you say, "I want to allow caching to use 30 percent of heap." We take it from there.
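
Here's a minimal programmatic sketch of that top-down, bytes-based style, assuming the fluent configuration API described in the Ehcache 2.5 docs (the cache names and sizes are invented for illustration):

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.Configuration;
    import net.sf.ehcache.config.MemoryUnit;

    public class TopDownTuning {
        public static void main(String[] args) {
            // One pool for the whole manager: caching gets 256 MB of heap, total.
            Configuration config = new Configuration()
                    .maxBytesLocalHeap(256, MemoryUnit.MEGABYTES)
                    // No explicit size: this cache borrows from the manager pool.
                    .cache(new CacheConfiguration().name("userCache"))
                    // A hot cache can still claim its own dedicated slice.
                    .cache(new CacheConfiguration()
                            .name("pageCache")
                            .maxBytesLocalHeap(64, MemoryUnit.MEGABYTES));

            CacheManager manager = new CacheManager(config);
            Cache users = manager.getCache("userCache");
            users.put(new Element("alice", "someUserData"));
            // Eviction is now driven by bytes used, not entry counts or TTLs.
            manager.shutdown();
        }
    }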

Wrapping Up

With those two key improvements a developer or admin is now directly turning the knob (size of cache in bytes) that maps to the resources available in the JVM, and doing it at a global level or a local level as needed, avoiding hard-to-tune individual cache constraints.

This will work with all of our JVM level cache tiers (onHeap, BigMemory, Disk).

When you put those features together with other items coming in the next major release, like entry and cache pinning and a snapshotting bootstrapper for cache warming, we feel this will be a very powerful release.

Please check out the new docs and give us feedback by commenting on this blog or posting to the Ehcache forums.

Help us make Ehcache as easy to use and powerful as we possibly can.

Saturday, April 16, 2011

Where To Buy Your Apple Gear?

Most people who know me think I'm a bit of an Apple products junkie. I can't deny it. I'm a big fan of Apple products (I'm typing this blog on my 11.6-inch MacBook Air) and the way they package and reuse great ideas across devices. Being such a fan(boy) of the Apple ecosystem, I'm regularly asked for advice from would-be purchasers when they want to pick up an Apple product. I'm writing this blog to provide that purchase advice for my friends and family, as well as for my readers who can't ask me verbally.


"Speed of Acquisition" vs "Help During Purchase" vs "Total Cost"

The decision-making process for where to buy Apple products usually comes down to three major questions: how do you rate "Speed of Acquisition" vs "Help During Purchase" vs "Total Cost"? (See the diagram for the big picture on this issue.) Once you know the answer based on your keen sense of self-awareness, you can weight the various purchase options appropriately. For best results, Be Honest With Yourself!

NOTE: Many people think after-sale help is somehow related to the purchase decision. IT IS NOT! The Apple Store Genius Bar will always help you no matter where you bought the device.


What's In The Cost?

When buying an Apple product there are 4 components to the cost of the device:
  1. The price tag of the device itself
  2. State sales tax
  3. Shipping and Handling
  4. What you can get thrown into the deal
Take all four points into consideration when looking at the price. You can often save over a hundred dollars in sales tax just by ordering from an out-of-state online retailer (for example, 8% tax on a $1,299 MacBook is about $104). In addition, many of these online retailers ship for free, and they usually have the best discounts off the price tag as well. Check out the AppleInsider.com price guide. It will tell you where to get the best deal on any given Apple device and is kept very up to date. It's where I look, and you should too. Another thing to consider: on some items MacMall and MacConnection are actually flexible and may throw in things like printers and bags if you call.


The Need For Speed?

Some people are impatient, some people need a computer right away, and others just like the experience of being first. These are all speed questions. In the speed world there are two categories of Apple products: "Hot and Constrained" and "Generally Available."

Hot and Constrained

If we are talking about "Hot and Constrained," usually the fastest way to get a product is buying it from Apple, either through their online store or at an Apple retail location. They stock themselves first. You may have to wait in line, but it's often your best, and sometimes your only, bet for acquiring new, hot Apple devices.

General Availability

For "general availability" devices "fastest" is broken down again into two categories. Fastest online (Generally Amazon) and fastest brick and mortar (Generally Apple's stores). Amazon is generally the best place to buy anything online as far as fulfillment (How fast they get it to you, amount in stock, ease of using store, return policies) and Apple products are no exception. Amazon's prices aren't the absolute bottom but they are pretty darn good.

When buying from a retail store, Apple's is second to none. They get you in and out fast, have all the information you need, make it easy to find what you want, and have tons of stock of everything Apple. The only downsides are the sales tax and no discounts to speak of.


Help During Purchasing

This one doesn't apply to me, as when it comes to Apple products I tend to know what I want. It's a hobby of mine to monitor Apple's products and product direction. I use product and business learnings from Apple for inspiration in my job building performance and scale software (probably a good blog topic as well: "How I Apply My Apple Learnings To My Business"). For the rest of the world, aka "normal people," who spend precious free time doing things like traveling, dating and hanging out with friends, help might be required. Those people should probably lean towards either the retail Apple Stores or the online Apple product specialists (MacMall and MacConnection), where they know everything about the devices and can help you make good decisions. For those who are like me, stick to "Total Cost" and "Speed" in your decision-making process.


Conclusions

When looking to buy an Apple device one must evaluate a number of options. You should evaluate those options against a keen sense of self-awareness: Can you wait? Are you cost conscious? Do you need help? By answering those questions and comparing the results against the "Apple Product Purchasing Guide" diagram below, the AppleInsider price guide and the above criteria, you should be able to make the best possible decision for you.

Please let me know if you have any feedback about this blog. How can it be more helpful? Where did I get it wrong?



Wednesday, April 13, 2011

Please Strengthen My Weak LinkedIn Links...

State Of LinkedIn

I've had a LinkedIn account for many years. I find it an excellent tool for keeping tabs on ex-coworkers, recruiting new ones, and watching job trends. The problem I have is that those links/connections in LinkedIn can be weak. It doesn't take those connections to the next level by helping me manage the relationships. It doesn't help me manage and monitor the strength of my connections.

What Does Manage and Monitor Relationships Mean?

At a macro level I can keep track of my first-level connections and second-level connections. I can traverse these to help me find people to hire, ask for advice and even look for jobs myself. It even does a ton of stuff to help me find new connections by making suggestions, forming groups, etc. But at a micro level it does nothing to help me improve and make value judgments about those relationships. For example, I have a relationship with Joe, the CEO of SomeCompany.com. By looking at that link I can't tell if it's a strong relationship or a weak one (monitoring). I also can't set goals for improving my relationships and keep track of those goals (managing). It just gives me a tree/graph of the relationships as if they were all the same. Further, when I'm traversing through my relationships and into my connections' relationships, I can't tell how strongly they are connected to their connections.

What I Want...

What I really want is for LinkedIn to help me manage and improve my relationships as well. It would all be much more powerful if I could apply a rating to each relationship, say on a 1-10 scale, monitor the relationship's progress, and assign goals for each relationship.

So instead of a graph like this:



I could have a graph like this, where the thickness of the lines indicates the current quality of each relationship:

That would be a great start. Then I would like it to keep a history of the relationship as well. It could show me how the relationship is progressing over time. Say it started at a 10 but I haven't talked to Joe in 3 years and now I rate it a 4. It could represent it with little arrows or color showing deterioration.

What about goals for a relationship? I could set a goal of 8 and do a query on all relationships that are not at the level I wish they were. It could remind me to make a quick contact with the person in question. It could even auto-deteriorate a relationship over time if I don't contact the person.

Summing Up...

LinkedIn is great. I'm more connected than ever. But... I want to monitor and manage my links to improve my connectedness to certain people, and I'm hoping LinkedIn can help me.

Thursday, March 24, 2011

"What" "When" and "Where" ... Quartz Scheduler 2.0 Goes GA

What is Quartz?

Quartz is The Lightweight, Open Source, Enterprise-class Job Scheduler for Java. For years Quartz has put the "When" in Java applications via its full-featured scheduling capabilities. As a result, from Spring to JBoss, Quartz is embedded in just about every major Java product. Quartz is extremely robust and full featured, providing things like HA, transaction support and all the precise guarantees one needs to ensure reliable job execution.

Terracotta has made a number of incremental improvements since taking over stewardship of Quartz. We've been evolving Quartz one step at a time while collecting the features the Quartz community was really interested in.

Quartz 2.0 is the realization of those user requested features!

"What" "When" and "Where"

Let's step back a bit and review where Quartz fits in the Terracotta world view. As most people know, Terracotta is focused on adding snap-in performance to JVM-based applications through scale-out (Terracotta Server Array), scale-up (BigMemory) and speed-up (Ehcache). We do this while maintaining our goals of simplicity, predictability and density (can your datastore hold 2 billion entries per node in-memory?) throughout our product line. We view this data scaling layer as solving the "What," or the data problem, of an application. But when going from 1 to many nodes, that's just the first of two major scale points.

Up until now, Quartz Scheduler has been solely focused on the "When" part of code execution: making sure code execution happens exactly "When" it's supposed to. It has been pretty much ubiquitous in that space, with literally hundreds of thousands of users. But while it scaled out with JDBC and Terracotta, it gave precious little control over where jobs execute.

With Quartz Scheduler 2.0 we've now added that "Where." This is about where code executes. Instead of randomly choosing a node for job execution in a scaled-out architecture you can now create constraints that help make the decision based on "Where" it should be executed. You can do this based on:
  • Resource Availability - CPU available, Memory Available, custom constraints
  • Ehcache's data locality - Bring the work to where the data is
  • Static allocation - Just decide where it goes
This gives Terracotta the ability to snap in and handle both of the major scale-out points.

What's new in Quartz 2.0

The simple answer is A LOT!
  • Easy-to-use Fluent API - Quartz 2.0 has a new, easy-to-use fluent interface that hides the complexity of building out the description of your jobs behind a simple description of what you want to happen and when (see the sketch after this list). I wrote a short blog about this when it was in beta.
  • Quartz "Where" - Constraint based system for controlling where jobs execute based on things like CPU and Memory usage, OS, and Ehcache data locality
  • Quartz Manager - A Flash-based GUI console for managing and monitoring your scheduler in production.
  • Batching - Helps improve a scheduler's throughput by allowing one to make trade-offs between perfectly timed execution and the benefits of batching.
  • Tons of bug fixes and features - Lots of long-requested features. Check out the link for the list.
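
To make the fluent API concrete, here's a minimal sketch (the job class, names and schedule are invented for illustration):

    import static org.quartz.JobBuilder.newJob;
    import static org.quartz.SimpleScheduleBuilder.simpleSchedule;
    import static org.quartz.TriggerBuilder.newTrigger;

    import org.quartz.Job;
    import org.quartz.JobDetail;
    import org.quartz.JobExecutionContext;
    import org.quartz.Scheduler;
    import org.quartz.Trigger;
    import org.quartz.impl.StdSchedulerFactory;

    public class FluentExample {
        // The "what": a trivial job to execute.
        public static class HelloJob implements Job {
            public void execute(JobExecutionContext context) {
                System.out.println("Hello from Quartz 2.0!");
            }
        }

        public static void main(String[] args) throws Exception {
            Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
            scheduler.start();

            // Describe what you want to happen...
            JobDetail job = newJob(HelloJob.class)
                    .withIdentity("helloJob", "examples")
                    .build();

            // ...and when you want it to happen: now, then every 10 seconds.
            Trigger trigger = newTrigger()
                    .withIdentity("everyTenSeconds", "examples")
                    .startNow()
                    .withSchedule(simpleSchedule()
                            .withIntervalInSeconds(10)
                            .repeatForever())
                    .build();

            scheduler.scheduleJob(job, trigger);
        }
    }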

Quartz 2.0 Is Now GA

Quartz 2.0 makes a big leap in usability, power, visibility and management and it's NOW GA. We did this while maintaining Quartz's reliability and predictability. We focused on making sure existing users would have an easy time moving forward so check out the migration guide if you need help. It really is the next generation of Quartz. Give it a try and give us feedback as we are constantly working to make things better!

More Reading



Monday, February 14, 2011

Quick 5:41 Intro To Ehcache Search (Now GA)

Ehcache 2.4 went GA today. Still lightweight (under 1MB) and still backward compatible with 1.x, so there's no reason not to give it a try. Below are the highlights and a short video on getting started with Ehcache Search:
  • Search - Brand new search API. Allows one to get beyond key-based lookup of objects (Check out this sample and the sketch after this list)
  • Local Transactions - Fast optimistic concurrency without the need for a TransactionManager (Check out this sample)
  • Bigger BigMemory (ee) - 2 billion entries, 1.3 million TPS, extreme predictability for meeting SLAs
  • Bigger Disk Store (ee) - Swap your Ehcache to disk. Grow to hundreds of gigs with no on heap footprint
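Here's a minimal sketch of the Search API, assuming a hypothetical "people" cache declared searchable in ehcache.xml with an "age" search attribute:

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.search.Attribute;
    import net.sf.ehcache.search.Query;
    import net.sf.ehcache.search.Result;
    import net.sf.ehcache.search.Results;

    public class SearchSketch {
        public static void main(String[] args) {
            // Assumes ehcache.xml declares: <cache name="people" ...>
            //   <searchable><searchAttribute name="age"/></searchable></cache>
            CacheManager manager = CacheManager.create();
            Cache people = manager.getCache("people");

            Attribute<Integer> age = people.getSearchAttribute("age");

            // Criteria-based lookup instead of a key-based get().
            Query query = people.createQuery()
                    .addCriteria(age.gt(30))
                    .includeKeys()
                    .includeValues()
                    .end();

            Results results = query.execute();
            for (Result result : results.all()) {
                System.out.println(result.getKey() + " -> " + result.getValue());
            }
            manager.shutdown();
        }
    }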
In the past I've enjoyed short videos that teach me something. This is my first try at doing one to benefit others. I've created this short 5-minute, 41-second video to get people started with Ehcache Search...


Wednesday, January 26, 2011

Ehcache At 2 Billion...

What's Up With Ehcache 2.4


Ehcache is the de facto caching standard for Java that everyone uses (500,000+ production deployments; the majority of enterprise Java applications). Ehcache 2.4 is coming out soon and includes capabilities that will make it even easier to use and more powerful, while still maintaining its light weight.


The highlights include:

  • Search - Quickly find entries based on the criteria of your choosing. String matching, dates, ranges, sums, averages etc.
  • Fast local transactions - Improved JTA performance, plus a new non-JTA transaction API for user-level control
  • Even more capacity and performance


What I've been Testing


I've written before about BigMemory for Enterprise Ehcache and how it solves the problem of long, unpredictable GC pauses in Java. The first release of BigMemory was… well, big. In Enterprise Ehcache 2.4, BigMemory has gotten even bigger.


Using the Enterprise Ehcache Big Memory Pounder I was able to show that Enterprise Ehcache 2.4 now easily handles:


  • Entry Count: > 2 billion entries (I reached 2 billion on the hardware I had; with bigger hardware, I could probably have gone much higher).
  • Throughput: 1.3 million operations per second (symmetric read and write; CPU bound)
  • SLA/Predictability: No GC pauses and a predictable 38-42 ops/thread/millisecond throughout the test
  • Data Size: 1-350 GB in-memory cache (again, I was limited by the hardware I had; with more RAM, I could probably have gone much higher)
  • Flexible Efficient Entry Sizes: The cache can now dynamically handle very large (10-100 MB) and very small entries (just a few bytes) together more efficiently with no tuning (This test used small entries in order to fit as many entries as possible into the memory I had. I also ran tests with fewer entries in order to validate wide ranging sizes)
  • Tuning: All tests were done with NO TUNING. Right out of the box.


Here's the hardware and software stack I used for my testing:


Cisco UCS C250 Server

Dual Intel X5670 2.93 GHz CPUs

384 GB RAM ( 8 GB x 48)

Red Hat Enterprise Linux 5.4

Sun JDK 1.6_22


For this test, all of the data was in memory.


A Bit About Ehcache BigMemory


BigMemory is 100% pure Java and in-process with a Java application. No magic or special JVMs (it works on IBM and JRockit as well). The cache data is safely hidden away from Java GC, and the pauses that occur with large heaps, by instead storing data in a BigMemory off-heap store.
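
As a rough sketch of what turning it on looks like (an enterprise feature; the attribute names follow the BigMemory docs, and the cache name and sizes are made up):

    <!-- In ehcache.xml: let this cache overflow from heap into off-heap memory. -->
    <cache name="bigCache"
           maxElementsInMemory="10000"
           overflowToOffHeap="true"
           maxMemoryOffHeap="4g"/>

The JVM also needs direct-memory headroom for the off-heap store, e.g. -XX:MaxDirectMemorySize=5g on the command line.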


Embedding


BigMemory got its start as a component in the Terracotta Server Array, and as a result it is particularly well suited for embedding. Its performance characteristics and no-tuning approach improve "The Out Of The Box Experience" and save money on support by removing the tuning required of users and the problems caused by GC pauses.


You may be thinking...


"I don't have 2 billion entries in my caches?"


That's OK. Ehcache is a lightweight core library (under 1MB) for caching that's ubiquitous and easy to use. When it's needed, Ehcache lets you scale up and out to billions of entries and terabytes of data. It does so at a manageable server density, without changing code or architecture, and without a bunch of tuning and learning. This protects not only your knowledge investment but your code investment.


More about BigMemory for Enterprise Ehcache:


http://terracotta.org/bigmemory

BigMemory Whitepaper

BigMemory Documentation


More about the 2010 Ehcache user survey:


Ehcache User Survey Whitepaper