Caching a glimpse of the future

Randall Naar
Aug 25, 2020

The importance of web caching and how old http responses can mirror the http responses of the future.

Introduction

Have you ever wondered why a slow webpage always seems to load right when you close or refresh the tab? I’m sure there’s a very explainable answer to that phenomenon but I unfortunately won’t be writing about that today. What I will be writing about is why the page loads so fast after the refresh, even when it seems to take forever the first time around! It’s all thanks to web caching, a lesser known concept that plays a vital role in how the modern web works today.

The definition of web caching can be inferred from the original definition of caching: “to store away in hiding or for future use.” So what exactly are we storing away for “future use” here? Nothing but the most important part of the web: http responses! Why should your browser make a new request for google.com every time you type it into the address bar when most of the time the server will respond with the same Google homepage? Luckily someone else saw how wasteful it was for the browser to repeatedly ask for the same content and proposed a whole section in the http 1.1 RFC (RFC 2616) that championed the caching and re-use of these http responses.

Why Web Caching Works

The basis for the existence of an http cache revolves around the stateless nature of the web and how some http requests will cause the same http responses to be given almost every time. Looking at the architecture of modern websites, you might debate this though. And you’d be right to question how applications like Facebook and Twitter would take advantage of this refresh “speed boost” when those platforms have data that changes all the time.

Truth be told, there are some difficulties caching dynamic content (as will be discussed later), but in any given web site there is typically more to its caching scheme than just the data-filled html page itself. In just one web page there could be dozens of images, scripts, and stylesheets retrieved and utilized before the user even sets eyes on the rendered webpage. Because not all of these resources need to be altered with the change of data (how many times does the image for the Facebook logo change?), some of these resources can be cached.

In fact, a developer focused on page render time might abandon server-side rendering of data altogether in favor of client-side Ajax requests. This is so the html pages and accompanying scripts (that will eventually retrieve the data) can remain static, making them quicker to serve over the network and eligible for browser caching. Serving the static html pages and stylesheets without data allows an empty page to be rendered quicker, shortening the time between typing the url into your browser window and seeing the page (there is a much deeper debate about whether one should use server-side rendering or client-side retrieval, but I’ll leave that argument for a later post).

Understanding the limitations of what we can do, let’s look more into the problem of caching dynamic data and how we can deal with data that’s constantly changing. If you think about it, most websites fall into this category, where the information offered on their webpages is bound to change. How many websites have kept their webpages the same since their first launch? From this perspective, the amount of ‘static’ data is actually very small.

If all data is bound to change at some point, how can we cache data in a way to make sure that it’ll get updated when changes are made? The answer is cache invalidation.

“There are only two hard things in Computer Science: cache invalidation and naming things.”

— Phil Karlton

Cache invalidation is the process of removing things from the cache. Referencing the quote above, cache invalidation is typically hard because there’s no surefire way to determine when something stored is no longer valid. Luckily, http caching has a specification that explains how a cache can be invalidated, so developers only have to work within that framework to decide when to invalidate the cache. In lay terms, the http 1.1 RFC says things should be removed from the cache based on time-based expiration dates sent with http responses, or at the request of the client making the http requests.

Caching In Practice

To keep things simple, http web caching was designed to only cache the responses of requests with “safe” request methods. This means that request methods that don’t modify data on the origin server (such as “GET” and “HEAD”) can have their responses cached. Unsafe request methods that do modify data on the origin server (such as “POST”, “PUT”, and “DELETE”) generally can’t have their responses cached.

Additionally, successful responses to unsafe requests should invalidate all cached content related to the uri associated with the request. This is because the data at that endpoint isn’t guaranteed to be the same as when it was originally cached. Nevertheless, in the case where safe http methods are used it’s always best to specify the desired behavior of a web cache using an http header labeled ‘Cache-Control’ and its accompanying set of http header values called cache-directives.
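The safe/unsafe rule described above can be sketched as a toy in-memory cache. This is only an illustration under stated assumptions: the class name ToyCache, the fetch callback, and the (status, body) response shape are all made up for this example, and a real browser cache also takes headers and expiration into account.

```python
from typing import Callable, Dict, Tuple

SAFE_METHODS = {"GET", "HEAD"}  # methods that don't modify data on the origin server

Response = Tuple[int, str]  # hypothetical shape: (status code, body)


class ToyCache:
    """A toy cache: stores responses to safe requests, invalidates on unsafe ones."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Response] = {}

    def handle(self, method: str, uri: str,
               fetch: Callable[[str, str], Response]) -> Response:
        method = method.upper()
        if method in SAFE_METHODS:
            key = (method, uri)
            if key not in self._store:
                # Cache miss: ask the origin server and remember the answer.
                self._store[key] = fetch(method, uri)
            return self._store[key]

        # Unsafe method: always goes to the origin server, never cached.
        response = fetch(method, uri)
        status = response[0]
        if 200 <= status < 400:
            # A successful unsafe request invalidates everything cached for this uri.
            for key in [k for k in self._store if k[1] == uri]:
                del self._store[key]
        return response
```

With this sketch, two GET requests for the same uri reach the origin only once, while a successful POST to that uri forces the next GET to be fetched again.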

These directives are really targeted towards the browser and allow client-side code to communicate whether an http response should be cached. Just as the code making an http request can say whether it would like the response cached on the user’s browser, an origin server can also specify a desire for browser caching by adding ‘Cache-Control’ headers to its responses. A cache directive placed on the header of a request need not be replicated in the response, so caching behavior can be overridden by the server’s own ‘Cache-Control’ header.

So, now that you know what cache directives are, the next question should be “how do we use them?”

The typical format for a ‘Cache-Control’ http header looks like this:

"Cache-Control" ":" 1#cache-directive

I know it looks like I’m messing with you, but this is actually how it’s defined in the RFC! Early RFCs used a special notation to specify string patterns, and the notation used here is an augmented form of BNF (it’s elaborated on in RFC 822 and in section 2.1 of RFC 2616). To save you the trouble of deciphering what 1#cache-directive means, the ‘#’ means there must be a comma-separated list and the ‘1’ to the left of the ‘#’ means there should be at least 1 cache directive in the list.
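To make the “comma separated list” idea concrete, here is a minimal sketch of splitting a Cache-Control value into its directives. The function name parse_cache_control is an assumption for this example, and the parsing is deliberately naive: it would mis-split a quoted argument that itself contains a comma (such as a quoted list of field names).

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive-name: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty list elements such as a trailing comma
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalise to lower case.
        name = name.strip().lower()
        # Directives without "=" (like no-cache) carry no argument.
        directives[name] = value.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("max-age=60, no-cache"))
# {'max-age': '60', 'no-cache': None}
```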

The RFC specification for cache directives looks like this:

cache-request-directive =
"no-cache" ; Section 14.9.1
| "no-store" ; Section 14.9.2
| "max-age" "=" delta-seconds ; Section 14.9.3, 14.9.4
| "max-stale" [ "=" delta-seconds ] ; Section 14.9.3
| "min-fresh" "=" delta-seconds ; Section 14.9.3
| "no-transform" ; Section 14.9.5
| "only-if-cached" ; Section 14.9.4
| cache-extension ; Section 14.9.6

cache-response-directive =
"public" ; Section 14.9.1
| "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1
| "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1
| "no-store" ; Section 14.9.2
| "no-transform" ; Section 14.9.5
| "must-revalidate" ; Section 14.9.4
| "proxy-revalidate" ; Section 14.9.4
| "max-age" "=" delta-seconds ; Section 14.9.3
| "s-maxage" "=" delta-seconds ; Section 14.9.3
| cache-extension ; Section 14.9.6

cache-extension = token [ "=" ( token | quoted-string ) ]

There are quite a few on that list, along with lots of other notation-specific symbols, but I won’t be explaining all of them right now since I only want to focus on a few cache-directives: “max-age”, “max-stale”, and “no-cache”. The “max-age” directive sets the lifetime of the cached response, the “max-stale” directive sets an acceptable window of time in which responses older than that lifetime can still be used, and the “no-cache” directive requires a cached response to be revalidated with the origin server before it is reused. Before we dive deeper into what those mean I want to clarify a few things first.

A value listed under ‘cache-request-directive’ is sent with a request, carrying caching information from your client-side code to the browser. A value listed under ‘cache-response-directive’ is sent with a response, carrying caching information from the server to your browser.

For any user-agent to implement a web cache, it should have some sort of mechanism to record both the time a response came in and a stored response’s age relative to the response’s generation time.
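As a rough sketch of that bookkeeping, a cache entry could keep the local time the response arrived plus whatever age the response already had on arrival (for instance, from an ‘Age’ header added by an intermediate cache). The CacheEntry class and its field names are assumptions for illustration, and the age calculation is a simplification of the one in RFC 2616, which also corrects for clock skew and network delay.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    body: str
    received_at: float       # local clock reading (in seconds) when the response arrived
    age_on_arrival: int = 0  # seconds the response had already aged before reaching us

    def current_age(self, now: float) -> float:
        """Age on arrival plus the time the entry has spent in this cache."""
        return self.age_on_arrival + (now - self.received_at)


entry = CacheEntry(body="<html>...</html>", received_at=1000.0, age_on_arrival=5)
print(entry.current_age(now=1030.0))  # 35.0
```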

With that, I’ll explain the first of our cache-directives, ‘max-age’. The ‘max-age’ cache directive is set on the http request header in the following format (where ‘x’ is a non-negative whole number representing time in seconds):

Cache-Control: max-age=x

This header is used to tell the browser from the client side (and also from the server side) how long the http response for that request should be stored. Alternatively, a server can tell the browser how long it should cache a response by using a completely different header (*not* a cache-directive) named ‘Expires’.

We say a cached response is “fresh” if the generation time summed with the age (the number of seconds since the response was generated) is earlier than the time in the ‘Expires’ header sent with the response, or if the age is less than the max-age value on the response. If neither the ‘Expires’ header nor “Cache-Control: max-age=x” is set, the cache can try to determine freshness in other ways, using header fields like ‘Last-Modified’.
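The freshness check can be sketched in a few lines. The function names here are assumptions for the example; the rule that max-age wins when both it and ‘Expires’ are present comes from RFC 2616, and the heuristic fallback based on ‘Last-Modified’ is left out.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(date_header: str,
                       max_age: Optional[int] = None,
                       expires_header: Optional[str] = None) -> Optional[int]:
    """Return how many seconds the response stays fresh, or None if unspecified."""
    if max_age is not None:
        return max_age  # max-age overrides Expires when both are present
    if expires_header is not None:
        expires = parsedate_to_datetime(expires_header)
        generated = parsedate_to_datetime(date_header)
        return int((expires - generated).total_seconds())
    return None  # no explicit expiration; a real cache may fall back to heuristics


def is_fresh(age_seconds: float, lifetime: Optional[int]) -> bool:
    """A response is fresh while its age is below its freshness lifetime."""
    return lifetime is not None and age_seconds < lifetime


lifetime = freshness_lifetime(
    "Tue, 25 Aug 2020 12:00:00 GMT",
    expires_header="Tue, 25 Aug 2020 12:10:00 GMT",
)
print(lifetime)                 # 600
print(is_fresh(120, lifetime))  # True
```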

The second of our cache-directives is ‘max-stale’. The ‘max-stale’ cache directive is set on the http request header in the following format (where ‘x’ is a number representing seconds):

Cache-Control: max-stale=x

This cache directive can only be sent from the client.

‘Stale’ responses are those that outlive their expiration. The ‘max-stale’ cache-directive says expired responses (determined by using the ‘Expires’ header or the ‘max-age’ cache-directive) can be used for up to ‘x’ seconds after expiration, pretty much at the discretion of the browser. Even then, a ‘Warning’ header field with the value 110 (“Response is stale”) should be attached to the response.

Under certain circumstances (for example, when the origin server can’t be reached), a cache can technically serve stale content even without a ‘max-stale’ cache directive, as long as it hasn’t been explicitly told not to. If max-stale is placed on the request, though, the cache can serve a stale response even when the server is not down. Another benefit of using max-stale is that it limits how stale a response can be when a fresh one cannot be obtained.
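Here is a small sketch of how a cache might decide whether a stored response is acceptable for a request carrying max-stale. The function name and the dictionary shape for the request’s directives (directive name mapped to its argument, or None when it has none) are assumptions for this example. Response directives such as ‘must-revalidate’, which forbid serving stale content, are not handled here.

```python
from typing import Dict, Optional


def can_use_cached(age: int, lifetime: int,
                   request_directives: Dict[str, Optional[str]]) -> bool:
    """Decide whether a stored response may be served without contacting the server."""
    if age < lifetime:
        return True  # still fresh, max-stale is irrelevant
    if "max-stale" not in request_directives:
        return False  # stale, and the client did not opt in to stale responses
    limit = request_directives["max-stale"]
    if limit is None:
        return True  # bare max-stale: the client accepts any amount of staleness
    staleness = age - lifetime  # seconds past expiration
    return staleness <= int(limit)


print(can_use_cached(90, 60, {"max-stale": "30"}))  # True
print(can_use_cached(91, 60, {"max-stale": "30"}))  # False
```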

Finally, the ‘no-cache’ cache-directive looks like this:

Cache-Control: no-cache

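This directive can be set by the client or the server. Despite its name, it doesn’t stop the response from being stored: it tells the cache that a stored response must not be reused without first revalidating it with the origin server. To keep a response from being stored at all, use the ‘no-store’ directive instead.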

What Happens When Cached Http Responses Expire?

One thing that could happen is the response is just removed from cache. An alternative to that is having the response undergo ‘revalidation’.

In cases where a response needs to get revalidated, an http client will send ‘pre-condition’ header-fields with its request. These ask the server to send a 304 Not Modified status code (with no body) if the cached copy is still valid, or to process the request normally and send a full response if it isn’t. All cached responses of that resource must have their max-age updated upon receiving a response with a new max-age cache directive and a 304 status code.
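The revalidation exchange can be sketched without any real network traffic, using two of the precondition headers, ‘If-None-Match’ and ‘If-Modified-Since’. Both function names and the dictionary layout for a cached response are assumptions for this example; the validators they rely on (‘ETag’ and ‘Last-Modified’) are standard response headers.

```python
from typing import Dict, Optional


def conditional_headers(cached_headers: Dict[str, str]) -> Dict[str, str]:
    """Build precondition headers from the validators stored with a cached response."""
    headers: Dict[str, str] = {}
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return headers


def apply_revalidation(cached: dict, status: int,
                       new_headers: Dict[str, str],
                       new_body: Optional[str] = None) -> dict:
    """Update the cache entry based on the server's answer to a conditional request."""
    if status == 304:
        # Not Modified: keep the stored body, refresh headers such as Cache-Control.
        return {"headers": {**cached["headers"], **new_headers}, "body": cached["body"]}
    # Anything else: the server sent a full response, which replaces the old entry.
    return {"headers": new_headers, "body": new_body}


cached = {"headers": {"ETag": '"abc"', "Cache-Control": "max-age=60"}, "body": "hello"}
print(conditional_headers(cached["headers"]))  # {'If-None-Match': '"abc"'}
```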

The preconditions that can be set deal with sending headers like ‘If-Modified-Since’ or ‘If-Unmodified-Since’, ‘If-None-Match’, ‘If-Match’, or ‘If-Range’. More information might be written up about these headers in a future blog post, but for now I think I’ve packed enough information to get you started on your quest to learn about http caching.

I hope you’ve learned a thing or two about browser caching! To learn more, I recommend taking a look at section 13 of RFC 2616. Also feel free to leave a comment suggesting what other content I should write about in the future. Thanks for reading, until next time!
