Stop. Caching. Static. Files.

I maintain a Varnish Configuration Repository that everyone can commit to. It's meant to be a good base for any kind of Varnish implementation, and I use it as my personal Varnish boilerplate. I have always disliked configurations such as this one:

# Remove all cookies for static files
if (req.url ~ "^[^?]*\.(css|jpg|js|gif|png|xml|flv|gz|txt|...)(\?.*)?$") {
  unset req.http.Cookie;
  return (lookup);
}

The snippet above strips all cookies from requests for static files, causing them to be cached by default. Why do I dislike this? Because it makes people cheer at their 99% cache hit rate while their limited memory suffers extra cache evictions. The only time you should ever cache static files is when you have memory to spare.

Static files do not cause load.

Sure, they cause disk access. Not just for reading the file, but for writing its entry to the webserver logs as well (if you have not excluded static content from your server logs). And your webserver needs to send the file back to the client.

But in all likelihood, your OS already has your most frequently requested static files in its buffer cache, meaning no disk access is required to serve them. And your webserver should simply be good at serving static files; if it isn't, consider switching (to, say, lighttpd or nginx).
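If you'd rather not spend any Varnish memory on them at all, you can do the opposite of the snippet above and hand static requests straight to the backend. A minimal sketch, using the same Varnish 3 style syntax and a trimmed version of the extension list:

# Let the webserver serve static files itself instead of caching them;
# the OS buffer cache makes these requests cheap anyway.
sub vcl_recv {
  if (req.url ~ "^[^?]*\.(css|jpg|js|gif|png|xml|flv|gz|txt)(\?.*)?$") {
    return (pass);
  }
}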

You're staring blindly at the cache hit rate

If a default config caches all static files without exception, you'll see high cache hit rates. You (and your client) will be happy. And you'll be misled into thinking the performance bottlenecks have been solved. It's not the static files that cause your webserver load, it's the PHP/Ruby/Perl/... scripts that hit your database and call external APIs. Those are the requests that need to be cached, not a static file that isn't causing any CPU load.
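If anything, that is where the cookie stripping belongs. A minimal sketch (the URL patterns below are placeholders, assuming those pages render the same for every anonymous visitor; adjust them to your own site):

sub vcl_recv {
  # Placeholder example: cache the expensive, anonymous dynamic pages.
  # "/" and "/blog" stand in for whatever your slow PHP pages are.
  if (req.request == "GET" && !req.http.Authorization &&
      (req.url == "/" || req.url ~ "^/blog")) {
    unset req.http.Cookie;
    return (lookup);
  }
}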

Proceed with spare memory

Therefore, focus should go to the dynamic pages causing server load. Those are the cached requests that actually make a difference. Then, if you have spare memory, open up some static file requests to be cached. And keep monitoring your cache size, because if static files cause heavy PHP pages to be evicted from memory, you will only hurt your performance further.
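One careful way of opening things up is to give static files only a short TTL, so they can never crowd out the expensive pages for long. A sketch, again in Varnish 3 syntax, with an arbitrary ten-minute TTL:

sub vcl_fetch {
  # With memory to spare: cache static files, but only briefly, so they
  # can't push long-lived dynamic responses out of the cache.
  if (req.url ~ "\.(css|js|jpg|gif|png)(\?.*)?$") {
    set beresp.ttl = 10m;
    return (deliver);
  }
}

The number to watch while doing this is varnishstat's n_lru_nuked counter: if it keeps climbing, objects are being evicted to make room and your cache is too small for what you're asking of it.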

Nearly no one has the capacity to store all their static content in a Varnish memory cache. Don't sacrifice the cached result of a 500ms PHP script for a 5ms static file request. In the repository I've commented every entry that causes static files to be cached, explaining this, and I hope people don't blindly copy the config templates but read through them and think it over.


4 comments on “Stop. Caching. Static. Files.”
  1. Well … although not caching static resources in Varnish makes sense (in that case it would be even better not to use Varnish at all for static files and have Apache serve them from static.domain.be), stripping cookies from images/CSS/JS, and especially allowing those to be cached at the browser level, is an absolute must from a page load/rendering performance perspective. Static files might not generate CPU load, but they do generate a shit-load of network traffic.

    So the question is: don’t your changes impact page-load performance in the browser?

  2. michaelb says:

    I’m not too familiar with Varnish, but why limit yourself to one cache (I’m talking parallel here)? Simply maintain one cache for purely static resources and another for the rest. That would give you two sets of metrics, and I doubt there would be any significant overhead. (A rough sketch of this two-store setup follows after the comments.)

    “But in all likelihood, your OS already has your most frequently requested static files in its buffer cache, meaning no disk access is required to serve them.” If you’re caching them elsewhere in memory and reading from there, they won’t be accessed from disk frequently, so they should get evicted from that cache or never be added to it at all. On the surface both methods look the same; however, the disk cache cannot make assertions about the future usage pattern of the files, other than guesses based on previous access. The application, on the other hand, can in some cases make assertions about what the future usage pattern will be, which could lead to earlier evictions or to resources being cached sooner, also benefiting the first group of requests.

  3. One question: does distributing the images and static content from a subdomain help the page load faster? I’m using WordPress, and the plugin I use to cache content on almost all my blogs is Total Cache. Does it cache static files, and if so, how can I disable that? Sorry, I’m not a geek, and thanks for the tip.

    • Bert-Jan says:

      Yes, that helps a lot, because those subdomains don’t have any cookies (session, etc.) to begin with and are therefore a prime target for Varnish caching.
      Keep in mind, though, that when you use Google Analytics, its cookies are set for the entire domain, including subdomains. Use setDomainName(‘www.domain.com’) to force it to only set cookies for the www part. It’ll leave the subdomains alone, your browser won’t have any cookies there, and Varnish will start caching stuff without you even having to change its default config! (No cookie stripping needed.)
      Also, static subdomains increase page speed because browsers will request more resources in parallel.
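A side note on the parallel-cache idea from comment 2: Varnish can be started with multiple named storage backends, and objects can be steered between them from VCL. A rough sketch, assuming Varnish 3 and made-up sizes:

# varnishd started with two named stores, for example:
#   varnishd ... -s dynamic=malloc,2G -s static=malloc,256M

sub vcl_fetch {
  # Steer static objects into their own, smaller store so they cannot
  # evict the expensive dynamic responses.
  if (req.url ~ "\.(css|js|jpg|gif|png)(\?.*)?$") {
    set beresp.storage_hint = "static";
  }
}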
