Browser Caching and Related HTTP Headers

Surjendu Pal

Caching is an important mechanism for improving performance and optimizing resource use in any application. A cache is a software or hardware component that stores data temporarily so that it can be accessed faster in the future. The data can be anything serializable (images, HTML, CSS, JavaScript, strings, etc.). There are many types of caching strategies; browser caching is one of them.

The browser cache is a small database on the client side that stores the content (images, scripts, HTML, CSS files) downloaded by a web page. For Chrome on Windows, this data looks like this:

Without a cache, every request the browser makes for content goes straight to the web server. With browser caching in place, the browser checks its cache first. If it finds the data in the cache and the data is still valid, it serves the data directly from the cache, and the request never reaches the web server.
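The lookup described above can be sketched roughly as follows. This is a simplified illustration, not how any real browser is implemented: `TinyCache`, `fetch_from_server`, and the single fixed lifetime are hypothetical names and assumptions made up for the example.

```python
import time


class TinyCache:
    """A toy in-memory cache keyed by URL (hypothetical, for illustration only)."""

    def __init__(self, lifetime_seconds, clock=time.time):
        self.lifetime_seconds = lifetime_seconds  # assumed: one lifetime for all entries
        self.clock = clock                        # injectable so the behavior can be tested
        self.entries = {}                         # url -> (stored_at, body)

    def get(self, url, fetch_from_server):
        """Return (body, source), where source is "cache" or "server"."""
        entry = self.entries.get(url)
        if entry is not None:
            stored_at, body = entry
            # The cached copy is used only while it is still valid.
            if self.clock() - stored_at < self.lifetime_seconds:
                return body, "cache"
        # Cache miss or expired entry: go to the server and store the result.
        body = fetch_from_server(url)
        self.entries[url] = (self.clock(), body)
        return body, "server"
```

The first call for a URL reaches the server, and later calls within the lifetime are answered from the cache without calling `fetch_from_server` at all.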

Browser caching is controlled through HTTP headers. The headers instruct the web browser when to cache a resource and when not to, how long to keep it cached, and when to validate it.

For example, this request for an image is served from the browser cache:

Caching Headers

HTTP headers give the browser its caching instructions: when to cache and when not to, how long a particular piece of data stays cached, and when to validate the cached data.

Cache-Control

This header was introduced in HTTP/1.1 and is the modern way to control caching. It accepts several different directives, depending on how you want the browser to behave. These are some of the most common Cache-Control directives.

  • public

    This tells the browser that the response can be cached by the browser and by any intermediate cache (CDNs, proxy servers). It is generally used for static, non-sensitive content such as scripts and CSS files.

  • private

    Marking a response as private tells intermediaries (CDNs, proxies) not to cache it; it may be stored only in the client's browser. This is used for client-specific, sensitive data.

  • no-cache

    This one is tricky! no-cache does not mean "do not cache the data". Instead, it tells the browser not to use the cached copy straight away: it must first validate the content with the server, and only if the content is unchanged may it serve the cached copy.

  • max-age

    This sets how long, in seconds, the content stays valid before the client needs to revalidate it. Note that this is a relative lifetime, counted from the time the response was generated, not an absolute date.

  • s-maxage

    This works the same way as max-age, but it applies only to shared intermediate caches (CDNs, proxies), where it takes precedence over max-age.

  • must-revalidate

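    This directive tells the cache that once the content has become stale (for example, after max-age has elapsed), it must revalidate the content with the server before using it again instead of serving the stale copy. This matters when a network interruption occurs: without this directive, a cache may fall back to serving stale content when it cannot reach the server.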

  • proxy-revalidate

    Works the same way as must-revalidate, but applies only to intermediate servers.

  • no-transform

    Tells intermediaries such as proxies not to transform the content received from the server in any way. This is handy when a proxy would otherwise recompress or convert content such as images.

Several directives can be combined in a single Cache-Control header, separated by commas. For example: Cache-Control: public, max-age=600
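To make the directives more concrete, here is a minimal sketch of how a client might read a combined Cache-Control value and decide whether a cached copy can be reused without contacting the server. The function names are hypothetical, only no-cache and max-age are considered, and the parser simply splits on commas, which ignores quoted values that a full implementation would need to handle.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control value into a dict of directive -> value (or True)."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalize them.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives


def can_serve_without_revalidation(directives, age_seconds):
    """Return True if a cached response of the given age may be reused as-is."""
    if "no-cache" in directives:
        return False  # always validate with the server first
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # no explicit lifetime; this sketch plays it safe
    return age_seconds < int(max_age)
```

With `public, max-age=600`, a copy that is 30 seconds old can be reused directly, while a copy that is 600 seconds old or more needs to be revalidated.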

ETag

This is a response header that identifies a specific version of a resource. Whenever the resource changes, its ETag value changes too. Typically, when serving a file, the ETag is a hash of the file's content, so if the file changes, the hash changes as well.
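As a rough sketch of the idea, the snippet below derives an ETag from a hash of the content and compares it with the value a client sends back in the standard If-None-Match request header. The function names, the choice of SHA-256, and the 16-character truncation are assumptions for illustration; real servers may build ETags differently (for example from file size and modification time), and a complete implementation would also handle lists of ETags and weak validators.

```python
import hashlib
from typing import Optional


def make_etag(content: bytes) -> str:
    """Build an ETag from a hash of the content (one common approach)."""
    digest = hashlib.sha256(content).hexdigest()[:16]
    return f'"{digest}"'  # ETag values are quoted strings


def respond(content: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body) for a request that may carry If-None-Match."""
    etag = make_etag(content)
    if if_none_match is not None and if_none_match == etag:
        # The client's copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content
```

A client that sends back the ETag it received earlier gets a 304 with an empty body as long as the content is unchanged, and a full 200 response with a new ETag once the content changes.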

This is how the browser learns that the content has been modified: it sends the ETag it has stored back to the server in an If-None-Match request header, and the server replies with 304 Not Modified if the value still matches. Some build systems also put a hash value in each static file's name on every build. That way, the next time the browser requests the content bundle, it can tell which files have been modified and which have not. This saves a lot of bandwidth. ETags are enabled by default in some servers, such as Nginx and Apache.
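The file-name hashing that build systems do can be sketched like this. Many build tools derive the value from the file's content rather than generating it at random, and this example assumes that approach; `fingerprinted_name` and the 8-character length are hypothetical choices, not the behavior of any particular tool.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Because the name changes only when the content changes, unchanged files keep their old URLs and can stay in the browser cache, while modified files get new URLs and are downloaded again.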

Expires

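This is a fairly old header, largely superseded by Cache-Control: max-age in modern systems. It was introduced in HTTP/1.0 and defines the absolute expiration time of the content. If a response contains both Expires and Cache-Control: max-age, max-age takes precedence.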
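For illustration, an Expires value is an absolute HTTP date in GMT. The sketch below builds one and checks it using Python's standard `email.utils` helpers; the function names and the 600-second lifetime used in the example are assumptions.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_header(now: datetime, lifetime_seconds: int) -> str:
    """Format an absolute expiry time as an HTTP date for the Expires header."""
    # usegmt=True requires a timezone-aware datetime in UTC.
    return format_datetime(now + timedelta(seconds=lifetime_seconds), usegmt=True)


def is_expired(expires_value: str, now: datetime) -> bool:
    """Return True once the absolute time in the Expires header has been reached."""
    return now >= parsedate_to_datetime(expires_value)


now = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
print(expires_header(now, 600))  # Wed, 21 Oct 2015 07:38:00 GMT
```

Because the value is an absolute time, it depends on the client's and server's clocks agreeing, which is one reason a relative lifetime such as max-age is usually preferred.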

Pragma

This header was also introduced in HTTP/1.0 and is now outdated. It is mostly used for backwards compatibility with HTTP/1.0 caches. Pragma: no-cache works the same way as Cache-Control: no-cache.

Last-Modified

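This header gives the date and time at which the origin server believes the content was last modified. The browser can send this value back in an If-Modified-Since request header to check whether its cached copy is still current.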
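A minimal sketch of date-based validation follows, assuming the client sends the stored date back in the standard If-Modified-Since request header. The function names are hypothetical, and the modification time is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(mtime: datetime) -> str:
    """Format a modification time as an HTTP date for the Last-Modified header."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def status_for(if_modified_since, mtime: datetime) -> int:
    """Return 304 if the resource has not changed since the given date, else 200."""
    if if_modified_since is None:
        return 200
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if mtime.replace(microsecond=0) <= since:
        return 304
    return 200
```

The one-second resolution means two changes within the same second cannot be told apart this way, which is one case where a content-based validator such as ETag is more precise.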

These are some of the most commonly used HTTP headers for browser caching. You can read more about them here 👉

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Expires

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Last-Modified

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Pragma

