One of the often overlooked but critical components of a functioning web application is its performance. Not only does it have a dramatic impact on the user experience of a front-end user, it also has implications for the reliability and cost-effectiveness of the application’s deployment. This article covers how to use different caching techniques effectively to improve the performance of a Python / Django web application.
Why is caching important?
For several reasons:
- A better user experience. Frustration levels decrease and conversion rates increase as page performance improves.
- SEO. Google and other search engines give preferential ranking to websites that have a lower average response time.
- Cost-effective infrastructure. Utilizing caching to reduce server workload can let you build out a less costly, more reliable web application.
- Slow performance can cause outages. This can’t be stressed enough; if enough requests are stalled in line waiting for a slow page to load, you can have an outage. Load balancers will
start responding with 503 messages and your website will be down.
It almost never makes sense not to cache. In most cases, an invalidation strategy can be created that makes caching transparent. However, some workloads (write heavy and read light) may incur more overhead from caching and invalidation than savings.
Types of cache
There are several different types of caches, at different layers, that can be utilized in a Python / Django web application. Although the list below is not comprehensive, it does cover the most
common caching techniques used.
- Upstream caches: These caches intercept the HTTP request before it is sent to the application server and attempt to service the request from cache before sending it to the back end. Some
upstream caches are sophisticated enough to run complicated logic to determine which pages can be cached and which can’t, or even return different versions of caches depending upon
certain user conditions. Common upstream caches include Varnish, Squid and hosted CDNs like Akamai.
- Downstream caches: Typically a middleware that intercepts a request before it is routed further into the application and attempts to serve it from cache. Can be very powerful in that
it’s done in application code, so very sophisticated caching logic can be implemented. Its downside is that it ties up an application server process to serve a cached page.
- ORM Cache: This is a cache that stores the result of queries made to the ORM and tries to serve them from cache whenever possible. The primary purpose of an ORM cache is to
relieve query pressure from the database. Examples include johnny cache and cachalot.
- Template fragment caching: Caches a chunk of rendered HTML. Built into Django core.
- Manual: The built-in caching primitives exposed in Django’s caching API. Can be used to cache expensive, repeated computations.
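As a rough illustration of the manual style in the last bullet, the sketch below builds a get-or-compute helper on top of a tiny in-process cache. The `TTLCache` class, `get_or_compute` helper and `expensive_report` function are hypothetical names invented for this example; in a real project you would call Django’s cache API (`cache.get` and `cache.set`) against a shared backend rather than a local dictionary.

```python
import time


class TTLCache:
    """A tiny in-process cache with per-key expiry (illustrative stand-in only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value, timeout):
        self._store[key] = (self._clock() + timeout, value)


_MISSING = object()  # sentinel so that a cached None still counts as a hit


def get_or_compute(cache, key, compute, timeout=300):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key, _MISSING)
    if value is _MISSING:
        value = compute()
        cache.set(key, value, timeout)
    return value


calls = []


def expensive_report():
    # Hypothetical expensive computation; records each real invocation.
    calls.append(1)
    return sum(n * n for n in range(1000))


cache = TTLCache()
first = get_or_compute(cache, "report", expensive_report)
second = get_or_compute(cache, "report", expensive_report)  # served from cache
```

The second call never reaches `expensive_report`, which is the whole point: the computation runs once per timeout window instead of once per request.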
Caching in layers
Although one may think that selecting one or two of the caching techniques above would be enough to ensure a web application remains performant, most of the time, it’s not. For several reasons (detailed below), you should always implement as many caching layers as you can in an application:
- Caches fail: If an entire cache layer fails (memcached, varnish), you don’t want your entire website to go down as a result. Caching in layers is more fault-tolerant.
- Caches miss: When a cache key expires or is invalidated, or a caching layer is restarted, it passes the request through to the backend to generate a new copy of the key. If this operation is not fast enough or never
finishes due to stress, your website will fail. Caching in layers alleviates this problem by making a cache miss at one layer still hit caches on other layers.
- Some workloads miss caches: Let’s assume a web crawler is crawling your site (a good thing, in most cases). Let’s also assume you have a menu with lots of categories on your site. If
that menu is expensive to generate, and you only have Varnish as a front end cache (with a 5 minute or hour long cache time), you may have lots of cache misses as the crawler traverses
your site. If you cache that menu with fragment caching, however, you reduce the rendering time on those other pages; a problem Varnish doesn’t solve out of the box.
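The fall-through behaviour described above can be sketched in a few lines. This is a simplification under stated assumptions: real layers (an upstream cache, fragment cache, ORM cache) store different kinds of objects, whereas here every layer is a hypothetical `DictLayer` holding the same key, and `LayerDown` is an invented exception standing in for a connection failure.

```python
class LayerDown(Exception):
    """Raised by a layer that is unavailable (hypothetical failure signal)."""


class DictLayer:
    """An in-memory cache layer; `up` can be flipped to simulate an outage."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

    def get(self, key):
        if not self.up:
            raise LayerDown(self.name)
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise LayerDown(self.name)
        self.data[key] = value


def layered_get(layers, key, generate):
    """Check each layer in order; on a full miss, generate and backfill."""
    missed = []
    for layer in layers:
        try:
            value = layer.get(key)
        except LayerDown:
            continue  # a failed layer is skipped, not fatal
        if value is not None:
            break
        missed.append(layer)
    else:
        value = generate()  # every layer missed or failed
    # Backfill the layers that missed so the next request hits earlier.
    for layer in missed:
        try:
            layer.set(key, value)
        except LayerDown:
            pass
    return value


page_cache = DictLayer("page")
fragment_cache = DictLayer("fragment")
renders = []


def render_menu():
    renders.append(1)  # counts how often the expensive render really runs
    return "<ul>menu</ul>"


layers = [page_cache, fragment_cache]
layered_get(layers, "menu", render_menu)         # full miss: renders, backfills
page_cache.up = False                            # simulate the first layer failing
menu = layered_get(layers, "menu", render_menu)  # served by the second layer
```

Even with the first layer down, the menu is rendered only once; the second layer absorbs the request instead of the backend.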
Handling user-specific content with upstream caches
Now that we’ve covered the general caching strategies and how they should be employed, we’ll do a deeper dive on some of the subtleties you’ll encounter as you implement layered caching.
The first of these is user-specific content and upstream caches. As users hit your site, you want to make sure they have a good cache hit rate. You also want to make sure they aren’t shown other users’ information (from a cached page view). There are a few different ways that you can bypass cache and still maintain a good hit rate for these types of user-specific content on cached pages:
- Edge side includes: A special HTML tag that tells the upstream cache to fetch a particular piece of a page in a separate request. The outer shell of the page (everything but a user’s name, for example) can be pulled from cache and the portion of the page that is user specific can be pulled via a separate request to a page that isn’t allowed to be cached. The upstream cache will insert the user-specific markup into the outer cached template and return the whole response back to the user.
- Varying on cookie: The upstream cache stores a cached version of the page based upon the user’s cookies. Anonymous users with no session cookie all get a cached version of the
page. This is an effective technique for sites that have a high percentage of anonymous traffic.
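To make the edge side include flow in the first bullet more concrete, here is a minimal sketch of the assembly step an upstream cache performs. It is not how Varnish or any CDN is implemented; the `assemble` and `fetch_for` functions and the `/fragments/greeting` path are assumptions made up for the example. Only the `<esi:include src="..."/>` tag itself comes from the ESI language.

```python
import re

# Matches self-closing include tags such as <esi:include src="/path"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(cached_shell, fetch_fragment):
    """Replace each ESI include tag in the cached shell with a fetched fragment."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), cached_shell)


# The outer page is cacheable for everyone; only the greeting is per-user.
shell = (
    '<html><body><esi:include src="/fragments/greeting"/>'
    "<p>Catalog</p></body></html>"
)


def fetch_for(username):
    """Build a hypothetical uncached backend fetch bound to one user."""

    def fetch_fragment(path):
        if path == "/fragments/greeting":
            return "<span>Hello, %s</span>" % username
        return ""

    return fetch_fragment


page_for_alice = assemble(shell, fetch_for("alice"))
page_for_bob = assemble(shell, fetch_for("bob"))
```

Both users share the same cached shell, but each receives a complete page containing only their own greeting.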
Each of the techniques above has both positive and negative aspects.
Edge side includes:
- Positives: Best user experience. The user gets a complete set of markup on the first page load.
- Negatives: If several parts of the page differ between users, each one needs its own ESI tag, and each of those tags generates a request to the backend. ESI is also a complex mental model; although the backend is hit with multiple requests, only a single set of markup is sent to the user. Finally, local development has to be done with an upstream cache present for full testability.
Ajax (fetching the user-specific content from the browser in a separate request):
- Positives: Easier to understand and develop than ESI (everyone’s done Ajax). You can combine the information you need to render the page into a single JSON Ajax request. Local development doesn’t need an upstream cache installed to test completely.
Varying on cookie:
- Positives: No development required. Only hits the backend with a single request. No upstream cache needed for local development.
- Negatives: High cache miss rate when traffic isn’t mostly anonymous.
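The trade-off of varying on cookie is easiest to see in how the cache key is built. The sketch below is an assumption-laden illustration in Python, not upstream cache configuration: `page_cache_key` is a hypothetical helper, and it varies only on the session cookie (`sessionid` is Django’s default session cookie name) while ignoring every other cookie.

```python
import hashlib
from http.cookies import SimpleCookie

SESSION_COOKIE = "sessionid"  # Django's default session cookie name


def page_cache_key(path, cookie_header):
    """Build a page cache key that varies only on the session cookie."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    session = cookies.get(SESSION_COOKIE)
    if session is None:
        # Anonymous: other cookies are ignored, so all anonymous
        # visitors share one cached copy of the page.
        return "page:%s:anon" % path
    digest = hashlib.sha256(session.value.encode("utf-8")).hexdigest()[:16]
    return "page:%s:user:%s" % (path, digest)


anonymous_a = page_cache_key("/products/", "theme=dark; tracking=abc123")
anonymous_b = page_cache_key("/products/", "")
logged_in_1 = page_cache_key("/products/", "sessionid=aaa111; theme=dark")
logged_in_2 = page_cache_key("/products/", "sessionid=bbb222")
```

Anonymous visitors collapse onto a single key, while every logged-in session gets its own, which is why the hit rate drops as the share of logged-in traffic grows.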
A quick note about upstream and downstream caching
Nowadays, it really doesn’t make sense to use a downstream (application-code) cache. Upstream caches have feature-rich configuration languages that can handle most scenarios (and if not, just send the appropriate cache headers from the back end). They also have a lot of benefits that downstream caches can’t provide, including shielding an application server from bursts of requests, a grace period on expired caches, and edge side includes. Downstream caches tend to suffer from a dog pile effect (many users requesting the same cache key at the same time and bogging down performance), cached pages still tie up application processes, and this form of caching uses more memory in your caching cluster.
In Django, the ORM provides a great layer in which you can inject caching mechanisms across all ORM calls to a database server. Two projects currently exist that do this for you: johnny cache and cachalot. These use a clever mechanism to reduce query overhead on your database server:
- Whenever a write is made to any table, a cache key is set with both the table name and the time of the write. This marks the table as “dirty” for any cached query made before this write.
- Any time a SQL query meets certain conditions (is cacheable), a hash is made of the query, and the query’s result, the time it was generated, and the database tables it depends upon are cached under that hash.
- When a query is about to be run, cache is checked first to see if a cached result exists for that query.
- If a cached result exists, the time the result was generated is compared against all of the tables that the result depends on and the last time they were written to. If the tables have not been modified since the query was cached, the cached result is used and the database backend is not queried. If a table has been modified, the cached result is not used and the query is sent to the database for processing.
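The four steps above can be sketched as a small class. This is a simplified model of the idea, not the actual code of johnny cache or cachalot: `TableStampedQueryCache` and its methods are hypothetical names, the caller supplies the list of tables by hand, and a logical counter replaces wall-clock write times so that two events can never share a timestamp.

```python
import hashlib
import itertools


class TableStampedQueryCache:
    """Query cache invalidated by per-table write stamps (simplified sketch)."""

    def __init__(self):
        self._tick = itertools.count(1)  # logical clock; avoids timestamp ties
        self._table_written = {}         # table name -> tick of last write
        self._results = {}               # query hash -> (tick, tables, rows)

    def mark_written(self, table):
        """Record a write so older cached queries on this table become dirty."""
        self._table_written[table] = next(self._tick)

    def fetch(self, sql, tables, run_query):
        key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
        cached = self._results.get(key)
        if cached is not None:
            cached_at, cached_tables, rows = cached
            # Fresh only if no dependent table was written after caching.
            if all(self._table_written.get(t, 0) < cached_at for t in cached_tables):
                return rows
        rows = run_query(sql)
        self._results[key] = (next(self._tick), tuple(tables), rows)
        return rows


executed = []


def run_query(sql):
    executed.append(sql)  # stand-in for a real database round trip
    return [("widget", 3)]


orm_cache = TableStampedQueryCache()
sql = "SELECT name, qty FROM stock WHERE qty > 0"
orm_cache.fetch(sql, ["stock"], run_query)   # miss: hits the database
orm_cache.fetch(sql, ["stock"], run_query)   # hit: served from cache
orm_cache.mark_written("orders")             # unrelated table: still a hit
orm_cache.fetch(sql, ["stock"], run_query)
orm_cache.mark_written("stock")              # dependent table: now dirty
orm_cache.fetch(sql, ["stock"], run_query)   # miss again
```

Four fetches result in only two database round trips, and a write to an unrelated table does not disturb the cached result.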
What’s nice about ORM caching is that it’s plug and play. Simply enabling the ORM caching middleware gains you the performance benefit without any further coding (other than testing). The only thing to keep in mind when using a caching engine like this is the additional overhead you incur for tables that are write-heavy; it’s best to “blacklist” these tables so the ORM cache doesn’t attempt to cache them. Another benefit of this type of caching is that its expiration is not time-based; if your data never changes, the cache never expires. Certain workloads can benefit greatly from this type of caching.
One thing to keep in mind when using an ORM cache is to make sure that your queries can be cached. Sometimes you can make minor modifications to your application to make queries cacheable that have little to no user impact:
- Round timestamps to the nearest 5 minutes. If you’re checking for something to have expired, most of the time it’s OK if that window is 5 minutes off. This will help like requests in the same 5 minute window to share a cache key, instead of generating a new query because of a difference of seconds or even milliseconds.
- Don’t use SQL operators that cannot be cached if you can avoid them (NOW(), for instance).
- Don’t use random sorting in SQL; not only is it bad for caching, it’s also bad for performance on large result sets.
- Optimize expressions whenever possible so cache keys overlap. For example, if SQL is written as X + 5 > 10 in one place and X > 5 in another, the two equivalent conditions will generate two different cache keys.
- Use model methods so SQL queries are the same. If you re-implement SQL queries in different views, minor differences in how the query is generated may cause cache misses. Sharing model methods also has the side benefit of improving your code quality.
- Avoid user-defined functions and stored procedures.
- Avoid using subqueries (this also adds a performance benefit in most cases).
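The timestamp advice in the first bullet is simple to implement. Here is one possible sketch; `floor_to_window` is a hypothetical helper name, and it floors to the start of the window rather than rounding to the nearest boundary, so that every request inside the same 5-minute window produces an identical query parameter.

```python
from datetime import datetime, timedelta, timezone


def floor_to_window(moment, minutes=5):
    """Floor a datetime to the start of its N-minute window."""
    window = timedelta(minutes=minutes)
    epoch = datetime(1970, 1, 1, tzinfo=moment.tzinfo)
    return moment - ((moment - epoch) % window)


early = datetime(2024, 3, 1, 12, 2, 17, 480000, tzinfo=timezone.utc)
late = datetime(2024, 3, 1, 12, 4, 59, tzinfo=timezone.utc)
next_window = datetime(2024, 3, 1, 12, 5, 0, tzinfo=timezone.utc)

# Both requests would pass 12:00:00 as the cutoff, so the SQL is identical.
cutoff_early = floor_to_window(early)
cutoff_late = floor_to_window(late)
```

Passing the floored value as a query parameter, instead of calling NOW() in SQL or using the raw current time, is what lets the two requests share a cache key.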
Template fragment caching
Template fragment caching is the ability to cache a chunk of a rendered Django template, given a key and optional vary parameters, using the built-in {% cache %} template tag.
This can be useful for several reasons:
- Expensive template rendering can be cached between pages so crawlers don’t have a negative impact on performance
- Pages that render the same item more than once on a page (a buy box, for instance) can share the cached versions between the first and second renders.
- Your upstream cache may have a lower cache time (5 minutes, for example) than needed for some content. You can selectively wrap content that can live with a higher cache time in template fragment caching.
A few common use cases where this can be useful include a shared navigation between pages, or a shared header or footer. The django cache tag supports multiple vary parameters as well, which gives you plenty of flexibility when setting up fragment caching. Although it’s good to cache in layers, it doesn’t really make sense to fragment cache entire pages; an upstream cache is more effective for this.
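To show how a fragment name and its vary parameters combine into distinct cache entries, here is a small sketch in plain Python. It mimics the idea behind fragment caching rather than Django’s implementation: `fragment_key`, `cached_fragment` and `render_nav` are hypothetical names, the store is a plain dictionary, and expiry is left out for brevity.

```python
import hashlib


def fragment_key(fragment_name, vary_on=()):
    """Derive a cache key from a fragment name and its vary parameters."""
    hasher = hashlib.sha256()
    for value in vary_on:
        hasher.update(str(value).encode("utf-8"))
        hasher.update(b":")  # separator so ("ab", "c") differs from ("a", "bc")
    return "fragment.%s.%s" % (fragment_name, hasher.hexdigest())


def cached_fragment(store, fragment_name, vary_on, render):
    """Return the cached markup for a fragment, rendering it only on a miss."""
    key = fragment_key(fragment_name, vary_on)
    if key not in store:
        store[key] = render()
    return store[key]


store = {}
renders = []


def render_nav(language):
    renders.append(language)  # records each real (expensive) render
    return "<nav lang='%s'>...</nav>" % language


# Four page views in two languages trigger only two renders.
for language in ["en", "en", "fr", "en"]:
    cached_fragment(store, "main_nav", [language], lambda: render_nav(language))
```

Each distinct combination of vary parameters gets its own entry, so a shared navigation can be cached once per language (or per any other parameter you vary on) and reused across pages.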
Once you implement template fragment caching, ORM caching, and even native caching techniques in your application, it’s important to make sure that you use the correct cache backend to support your workload. Django supports several; we’ve listed a few here with comments on the positives and negatives of each. This list is not comprehensive, but it does cover some of the common backends you may see.
- Memcached: The gold standard for caching. An in-memory service that can return keys at a very fast rate. Not a good choice if your keys are very large in size.
- Redis: A good alternative to Memcached when you want to cache very large keys (for example, large chunks of rendered JSON for an API).
- DynamoDB: Another good alternative to Memcached when you want to cache very large keys. Also scales very well with little IT overhead.
- Localmem: Only use for local testing; don’t go into production with this cache type.
- Database: It’s rare that you’ll find a use case where the database caching makes sense. It may be useful for local testing, but otherwise, avoid.
- File system: Can be a trap. Although reading and writing files can be faster than making SQL queries, it has some pitfalls. Each cache is local to the application server (not shared), and if you have a lot of cache keys, you can theoretically hit the file system’s limit on the number of files allowed.
- Dummy: A great backend to use for local testing when you want your data changes to be made immediately without caching. Be warned: permanently using dummy caching locally can hide bugs from you until they hit an environment where caching is enabled.
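One way to act on the list above is to pick the backend per environment in your settings. The sketch below is only an example: the `cache_settings` function and the environment names are hypothetical, the Memcached address is a placeholder, and the dotted backend paths are the ones shipped with recent Django releases (the `PyMemcacheCache` backend, for instance, arrived in Django 3.2), so check the documentation for the version you run.

```python
def cache_settings(environment):
    """Return a Django CACHES dict for a named environment (names are hypothetical)."""
    if environment == "production":
        return {
            "default": {
                "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
                "LOCATION": "127.0.0.1:11211",  # placeholder address
            }
        }
    if environment == "test":
        # Real caching behaviour, but private to the test process.
        return {
            "default": {
                "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
            }
        }
    # Local development: caching disabled so data changes show immediately.
    return {
        "default": {
            "BACKEND": "django.core.cache.backends.dummy.DummyCache",
        }
    }


# In settings.py this value would be assigned to the CACHES setting.
CACHES = cache_settings("production")
```

Running the test suite against a real (if local) cache rather than the dummy backend is one way to catch the hidden bugs mentioned in the last bullet before they reach production.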
Content delivery networks
Your eyes may glaze over a bit as you read this section, and I don’t blame you. CDNs are easy to use and have been around for a long time; you probably already know what you’re going to read here. Nevertheless, let’s review why serving your media from a CDN is beneficial to your application’s performance:
- Application server response time and perceived user response time (window loaded event) can differ. If you ignore one in favor of optimizing the other, you can leave the user with a poor experience on your website.
- You should always use a CDN to deliver media. You don’t want to tie up the request pools in your caching servers or your application servers to serve static files. Those pools demand lots of memory in an application server, and using them to serve media is not a good use for them.
- Be sure to combine and minify static files where appropriate to reduce the size and number of network requests a browser has to make to download a page.
An interesting problem you’ll run into when setting up a CDN is how to invalidate caches when those static files have changed. Although you can run a purge with your CDN, it’s often slow and expensive, and definitely not easy to automate. Another option is to use version strings appended to the end of files (either in the filename itself or as a GET parameter) to let the user know when a new version of the file should be downloaded. Some libraries do this automatically for you with MD5 hashing of the files. Be warned: this can be slow, especially if the files have to be read off of a remote file hosting service like S3. Sometimes simplest is best (appending a version string in a GET parameter manually).
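Both versioning styles mentioned above (in the filename or as a GET parameter) can be derived from a hash of the file contents. The following is a minimal sketch using MD5 as the text describes; the function names and the sample CSS are made up for illustration, and the file bytes are passed in directly so the example does not depend on any storage service.

```python
import hashlib
import os.path


def content_version(data, length=8):
    """Short hex digest of a file's bytes, used as its version string."""
    return hashlib.md5(data).hexdigest()[:length]


def versioned_filename(path, data):
    """Embed the version in the filename: site.css -> site.<hash>.css"""
    root, extension = os.path.splitext(path)
    return "%s.%s%s" % (root, content_version(data), extension)


def versioned_query(path, data):
    """Append the version as a GET parameter: site.css?v=<hash>"""
    return "%s?v=%s" % (path, content_version(data))


old_css = b"body { color: black; }"
new_css = b"body { color: navy; }"

old_url = versioned_query("/static/site.css", old_css)
new_url = versioned_query("/static/site.css", new_css)  # differs from old_url
```

Because the URL changes whenever the contents change, browsers and the CDN treat the new file as a new resource and no purge is needed. The cost is the one the text warns about: every file has to be read in full to compute its hash.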
Problems caching doesn’t solve
In my experience, developers can take a poorly-written site that isn’t performing well, slap some caching in front of it, load test the caching server and call it a day. This is a recipe for disaster. As we mentioned above, caches fail, caches miss, and even in a perfect scenario, you need to be able to generate a cache key faster than it expires. Here is a more detailed list of problems caching doesn’t solve:
- Writes: writes automatically bypass cache. Sites with heavy write workloads and light read workloads may not benefit from caching. This problem is compounded by the nature of database replication requiring vertical scaling in some cases; caching won’t help you here.
- Slow queries: If a query takes longer (say, 7 minutes) than the cache time (say, 5 minutes), caching isn’t going to help you much. Further, a hard cache miss will still stall the user until the page can be generated; you don’t want the first viewer of a page to have to deal with a 30 second page load time. Finally, slow queries can dog pile and cause resource contention on your database server.
- Third party calls: although you may be able to cache some third party calls, some are not cacheable; they either require user-specific data or have unique tokens in their responses.
- Stupidity: Caching a call to a php bulletin board on your homepage will not make your homepage 100% reliable. That bulletin board will go down, that cache key will miss and your users will stall. Don’t ask me how I know this.
Probably the most important part of implementing a caching solution is to be sure to test it. There are lots of gotchas involved:
- Test with all layers of cache disabled except one, and verify that each layer of caching is working as expected. A caching layer higher up the stream can cover up for a failed layer in cache testing.
- Test with a real workload. Simply fetching the homepage over and over again thousands of times a second doesn’t represent a real workload. Each application server should be hit with a mix of slow, medium and fast running requests to make sure both the caching and application server performance are up to par.
- Be sure your QA environment tests with concurrent user accesses. If a single person is QA testing the site, they probably won’t notice if their user-specific data is cached across all users.
Here is a list of some of the performance testing tools / techniques that you can use to verify your caching:
- ab (Apache Bench): The simplest to use. Works really well for generating load on a backend quickly with little setup time.
- JMeter: A Java-based tool with a GUI that can run test cases, graph results, and even run a series of web requests as a transaction and log the performance of each request and the total time.
- Memcached / Varnish statistics: Monitor your hit rate to make sure a high percentage of requests are being served from cache. Also monitor your eviction rate; if it is high, you might not have enough space allocated to your cache pool.
- New Relic: A third-party tool that can monitor performance with detailed traces.
- Hire a third party: If the project is large enough, hire a third party to test your application’s performance. For high-impact websites, it’s better to do this than wait for user load to uncover potential performance issues.
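For the statistics bullet above, the two numbers worth watching can be computed from the counters Memcached reports. The sketch below assumes you have already collected the output of the `stats` command into a dictionary; `cache_health` is a hypothetical helper, the counter names (`get_hits`, `get_misses`, `evictions`, `total_items`) are Memcached’s own, and the sample values are invented.

```python
def cache_health(stats):
    """Compute hit and eviction rates from memcached-style counters."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    evictions = int(stats["evictions"])
    stored = int(stats["total_items"])
    lookups = hits + misses
    return {
        # Share of lookups answered from cache.
        "hit_rate": hits / lookups if lookups else 0.0,
        # Share of stored items that were pushed out to make room.
        "evictions_per_item_stored": evictions / stored if stored else 0.0,
    }


# Invented sample counters; the stats protocol returns values as strings.
sample = {
    "get_hits": "9200",
    "get_misses": "800",
    "evictions": "150",
    "total_items": "3000",
}
health = cache_health(sample)
```

What counts as a healthy value depends on the workload, so it is more useful to track these ratios over time and investigate sudden changes than to aim for a fixed threshold.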
Another look at Varnish
While we’re talking about caching, it’s probably worth taking a deeper dive into some of the features that a particular upstream cache, Varnish, offers:
- VCL: VCL is a fully-functional programming language that you can use inside of a varnish configuration file to perform sophisticated caching logic. Examples include modifying the user request to strip out irrelevant cookies or routing to different backend server pools depending upon server status or request path.
- Load balancing and failover: Varnish is also a fully-functional load balancer with failover. You can even assign weighted priorities to backends, and there are different request queues implemented (default is round robin).
- Grace period: If an expired cache value is still around, sometimes you want to serve it while you fetch new content so the user doesn’t have to wait for the new cache key to be generated. Varnish calls this period a “grace” period. It’s also useful if your backend goes down; you can configure Varnish to continue serving stale pages until your backend has recovered.
- Dog-piling: Varnish combines like requests to the backend so that requests do not dog pile. For instance, if your homepage cache has expired and 500 users simultaneously visit your homepage, Varnish only sends one request to the backend and serves it to all 500 users once it has finished.
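The grace period idea can be modelled in a few lines of Python. This is an illustration of the concept only, not of how Varnish implements it, and it covers just the backend-failure case (serving a stale copy when the refetch fails), not background refresh. `GraceCache`, `fetch_homepage` and the injected clock are all hypothetical.

```python
import time


class GraceCache:
    """Serve a stale entry during a grace window when refetching fails."""

    def __init__(self, ttl, grace, clock=time.monotonic):
        self.ttl = ttl
        self.grace = grace
        self._clock = clock
        self._store = {}  # key -> (stored_at, value)

    def get(self, key, fetch):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1], "fresh"
        try:
            value = fetch()
        except Exception:
            # Backend failed: fall back to the stale copy if within grace.
            if entry is not None and now - entry[0] < self.ttl + self.grace:
                return entry[1], "stale"
            raise
        self._store[key] = (now, value)
        return value, "fetched"


now = [0.0]           # fake clock so the example is deterministic
backend_up = [True]


def fetch_homepage():
    if not backend_up[0]:
        raise ConnectionError("backend unavailable")
    return "<h1>Home</h1>"


pages = GraceCache(ttl=300, grace=3600, clock=lambda: now[0])
first = pages.get("/", fetch_homepage)   # fetched from the backend
now[0] = 400.0                           # past the TTL, inside the grace window
backend_up[0] = False
second = pages.get("/", fetch_homepage)  # stale copy instead of an error
```

Once the grace window has also elapsed, the error propagates to the caller, which mirrors the point made earlier: caching buys time during an outage but does not replace a working backend.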
There are several factors that go into a performant web application, and one of the most important ones is caching. There are a variety of caching techniques and tools that you can utilize to get the most out of your web application and to provide a user with a fast, responsive experience. If you have any questions or need any help with a project of your own, be sure to contact us!