Thursday, January 7th, 2010
Cache, cache, cache! (Part 3: What to cache?)
Happy New Year, everyone! I hope you enjoyed your holidays!
Let's start this year with the third part of my little series of thoughts about caching. After my small memcached intro and my thoughts about caching architectures, I now focus on the data you should consider caching in your web application.
 Cache HTML
Obviously the biggest performance win you can achieve comes from caching the whole output of your web application: a simple reverse proxy scenario. This works very well for mostly static pages, but for highly dynamic and user-specific content it is not an option: there is no advantage in caching a web page that becomes obsolete the very next moment.
Probably the best way to solve this dilemma is to implement a so-called partial-page cache: let your application cache just those portions of the page that are worth caching, and leave the rest dynamic.
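To make the idea a bit more concrete, here is a minimal sketch in Python. It is not tied to any framework: the dict-based cache is only a stand-in for a memcached client, and all names (cached_fragment, render_top10, render_page, the key "fragment:top10", the 60-second lifetime) are made up for illustration. The user-specific greeting is rendered on every request, while the expensive but shared fragment comes from the cache.

```python
import time

# Stand-in for a memcached client (assumption): key -> (expiry time, value).
_cache = {}


def cache_get(key):
    """Return the cached value, or None if it is missing or expired."""
    entry = _cache.get(key)
    if entry is None:
        return None
    expires_at, value = entry
    if time.time() >= expires_at:
        del _cache[key]
        return None
    return value


def cache_set(key, value, ttl):
    """Store a value for ttl seconds."""
    _cache[key] = (time.time() + ttl, value)


def cached_fragment(key, ttl, render):
    """Return a cached HTML fragment, rendering it only on a cache miss."""
    html = cache_get(key)
    if html is None:
        html = render()
        cache_set(key, html, ttl)
    return html


# Counter used only to show how often the expensive part really runs.
render_calls = {"top10": 0}


def render_top10():
    """Hypothetical expensive fragment that is identical for all users."""
    render_calls["top10"] += 1
    items = ["alpha", "beta", "gamma"]
    return "<ol>" + "".join(f"<li>{item}</li>" for item in items) + "</ol>"


def render_page(username):
    """Assemble a page from one dynamic part and one cached fragment."""
    greeting = f"<p>Hello, {username}!</p>"  # user-specific, never cached
    top10 = cached_fragment("fragment:top10", 60, render_top10)
    return greeting + top10
```

Calling render_page("alice") and then render_page("bob") produces two different pages, but render_top10 runs only once within the 60 seconds.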
It's very important that you implement the partial-page cache in one of the topmost layers of your application, probably exactly the layer that software architects call the presentation layer. Sure, this is likely to break your framework architecture, but to quote chapter 40 of the Tao Te Ching:
The movement of the Tao
By contraries proceeds;
And weakness marks the course
Of Tao's mighty deeds.
But seriously: if you have to stay within the boundaries of a framework, Ajax is a good way to bypass these restrictions, and it helps to implement such a cache in a restricted architecture. But be aware that this will raise the number of HTTP requests hitting your frontend web servers.
An effective caching strategy will always mess up your beautifully designed software architecture. Having just one (central) caching layer looks great in system diagrams, and it's better than no cache at all, but it's definitely not the end of the road.
 Cache complex data structures
If you don't want to break your framework architecture, or you don't like the idea of caching HTML at all (and I totally understand your point), you should consider caching other, lower-level but still complex, data structures.
Some examples of suitable data structures:
- user profiles
- friends lists
- current user lists
- list of locations, branches, countries, languages, ...
- top 10 (whatever) lists
- public statistical data
The main challenge lies in identifying the most suitable data structures. This is no easy task, and it strongly depends on the kind of web application you run or plan to run. Avoid caching simple data sets, like row-level data from the database. Don't think row-level.
That's the best advice to keep in mind. (Note to self: I need to put this on a t-shirt. I found this phrase in Memcached Internals, a wonderful article inspired by a talk by Brian Aker and Alan Kasindorf.)
At first glance, Ajax may seem an obvious technology to combine with such a cache. But please be aware that moving application logic from the server side to the client side is always a dangerous undertaking, which can easily compromise the security of your application.
Which allows me to end this post with another quote from Laozi (Tao Te Ching, chapter 63):
All difficult things in the world
are sure to arise from a previous state
in which they were easy.