5. HTTP Caching
Web pages often contain content that remains unchanged for long periods of time. For example, an image containing a company logo may be used without modification for many years. It is wasteful in terms of bandwidth and round trips to repeatedly download images or other content that is not regularly updated.
HTTP supports caching so that content can be stored locally by the browser and reused when required. Of course, some types of data such as stock prices and weather forecasts are frequently changed and it is important that the browser does not display stale versions of these resources. By carefully controlling caching, it is possible to reuse static content and prevent the storage of dynamic data.
Browser caching is controlled by the use of the Cache-Control, Last-Modified and Expires response headers.
5.1 Preventing Caching
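Servers set the Cache-Control response header to no-cache to tell the browser not to reuse a stored copy of the content without first checking with the server (the stricter no-store value forbids storing it at all):
Cache-Control: no-cache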
The Pragma header is also often used to stop caching by HTTP 1.0 proxies, as they do not support the Cache-Control header:
Pragma: no-cache
For a full description of Cache-Control and the other values it supports please consult the HTTP 1.1 specification.
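As a minimal sketch of the two headers described above, the following Python helper builds the response headers a server might send to prevent caching. The function name `no_cache_headers` is hypothetical and is not tied to any particular web framework.

```python
def no_cache_headers():
    """Return response headers that ask caches not to reuse the response."""
    return {
        "Cache-Control": "no-cache",  # understood by HTTP 1.1 browsers and proxies
        "Pragma": "no-cache",         # fallback for HTTP 1.0 proxies
    }


# Print the headers in the form they would appear in a response message.
for name, value in no_cache_headers().items():
    print(f"{name}: {value}")
```

In a real application these values would be attached to the response object of whichever server framework is in use.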
5.2 Allowing Caching
The Cache-Control header can be set to one of the following values to allow caching:
|(not set)||If the Cache-Control header is not set, then any cache may store the content.|
|private||The content is intended for use by a single user and should only be cached locally in the browser.|
|public||The content may be cached in public caches (e.g. shared proxies) and private browser caches.|
If the browser is to make effective use of cached content, two extra pieces of information should be supplied. The first is the modification date/time of the content. The server supplies this in the Last-Modified response header:
Last-Modified: Wed, 25 Feb 2015 12:00:00 GMT
The browser keeps this value with the cached entry so that it can check the server for changes when a page is first visited in a browser session or the user requests a page update (e.g. presses F5 in IE).
The second piece of information is the expiration date, which is specified with the Expires header:
Expires: Thu, 25 Feb 2016 12:00:00 GMT
If a cached entry has a valid expiration date, the browser can reuse the content without having to contact the server at all when a page or site is revisited. This greatly reduces the number of network round trips for frequently visited pages. For example, the Google logo is set to expire in one year and will only be downloaded during that year on your first visit to google.com or if you have emptied your browser cache. If Google ever wants to change the image, it can use a different image file name or path.
The HTTP specification recommends using expiration dates no more than one year in the future. An equivalent to setting the Expires header is to use a max-age value with the Cache-Control header. Often this is easier than calculating the expiration date, as you specify the cache expiration as a delta from the current time in seconds. For example, this header sets the cache expiration to 31536000 seconds, or one year, in the future:
Cache-Control: max-age=31536000
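The relationship between Expires and max-age can be illustrated with a short Python sketch. It assumes the response is generated at the Last-Modified time used in the example above; the helper name `expiry_headers` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def expiry_headers(now, max_age=ONE_YEAR_SECONDS):
    """Build equivalent Cache-Control and Expires headers for a given lifetime."""
    expires = now + timedelta(seconds=max_age)
    return {
        # Lifetime as a delta in seconds from the time of the response.
        "Cache-Control": f"max-age={max_age}",
        # The same lifetime expressed as an absolute HTTP date.
        "Expires": format_datetime(expires, usegmt=True),
    }


# Assumed response time, taken from the Last-Modified example in this section.
now = datetime(2015, 2, 25, 12, 0, 0, tzinfo=timezone.utc)
headers = expiry_headers(now)
print(headers["Cache-Control"])  # max-age=31536000
print(headers["Expires"])        # Thu, 25 Feb 2016 12:00:00 GMT
```

The max-age form avoids the date arithmetic and formatting entirely, which is why it is often the easier option.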
5.3 Cache Validation and the 304 Response
There are a number of situations in which Internet Explorer needs to check whether a cached entry is valid:
- The cached entry has no expiration date and the content is being accessed for the first time in a browser session
- The cached entry has an expiration date but it has expired
- The user has requested a page update by clicking the Refresh button or pressing F5
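The three situations listed above can be expressed as a small decision function. This is only a sketch of the rules as described here, not IE's actual implementation; the `CacheEntry` type and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CacheEntry:
    expires: Optional[datetime]  # None when the entry has no expiration date
    seen_this_session: bool      # True once accessed in the current browser session


def needs_validation(entry, now, user_refreshed):
    """Return True when the browser should check the entry with the server."""
    if user_refreshed:
        # The user clicked Refresh or pressed F5.
        return True
    if entry.expires is None:
        # No expiration date: check on first access in a browser session.
        return not entry.seen_this_session
    # An expiration date exists: check only once it has passed.
    return now >= entry.expires


now = datetime(2015, 2, 26, tzinfo=timezone.utc)
entry = CacheEntry(expires=datetime(2016, 2, 25, tzinfo=timezone.utc),
                   seen_this_session=False)
print(needs_validation(entry, now, user_refreshed=False))  # False
```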
If the cached entry has a last modification date, IE sends it in the If-Modified-Since header of a GET request message:
GET /images/logo.gif HTTP/1.1
Accept: */*
Referer: http://www.google.com/
Accept-Encoding: gzip, deflate
If-Modified-Since: Wed, 25 Feb 2015 17:42:04 GMT
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: www.google.com
The server checks the If-Modified-Since header and responds accordingly. If the content has not been changed since the date/time specified, it replies with a status code of 304 and a response message that just contains headers:
HTTP/1.1 304 Not Modified
Content-Type: text/html
Date: Thu, 26 Feb 2015 10:00:04 GMT
The response downloads quickly because it contains no content, and it causes IE to read the data it requires from the cache. In effect, it is like a redirection to the local browser cache (see 7. Redirection).
If the requested object has actually changed since the date/time in the If-Modified-Since header, the server responds with a status code of 200 and supplies the modified version of the resource.
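The server side of this exchange can be sketched in Python as follows. The `respond` function and its parameters are hypothetical; a real server would also set response headers and handle other validators, which are omitted here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond(if_modified_since, last_modified, body):
    """Return (status, body) for a GET that may carry If-Modified-Since."""
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full response
        if since is not None:
            if since.tzinfo is None:
                # Assumption: treat a date without zone information as UTC.
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, b""  # not modified: headers only, no content
    return 200, body  # changed, or no usable date: send the resource


last_modified = datetime(2015, 2, 25, 17, 42, 4, tzinfo=timezone.utc)
status, _ = respond("Wed, 25 Feb 2015 17:42:04 GMT", last_modified, b"GIF89a")
print(status)  # 304
```

Falling back to a full 200 response when the date cannot be parsed is a conservative choice: the browser may download more than necessary, but it never keeps stale content.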
The images in this example demonstrate different levels of caching. It is worth trying the following actions in Internet Explorer to investigate how well these images are cached:
- Using the back and forward buttons
- Refreshing the page
- Opening a new instance of IE to create a new browser session and re-visiting this page
|Image A:||This image is never cached and is always downloaded, even when the back/forward buttons are used.|
|Image B:||This image can be cached but has no expiration or modification date. Therefore it is always downloaded when the page is first visited in a new browser session or if the user refreshes the page.|
|Image C:||This image can be cached and has a modification date but no expiration date. Therefore it is always checked but not downloaded when the page is first visited in a browser session or if the user refreshes the page.|
|Image D:||This image can be cached and has an expiration date set to one year in the future using max-age. The browser can reuse the image in a new browser session without having to send any request to the server. It can always be re-read from cache unless the cache is cleared, or the user requests a forced update with Ctrl + F5.|
Using HttpWatch with Example 5
To view the HTTP headers discussed on this page:
- Open HttpWatch by right-clicking on the web page and selecting HttpWatch from the context menu
- Click on Record to start logging requests in HttpWatch
- Optional: You can add a filter to capture only the images in example 5 by adding a 'URL contains' condition with the value "caching/image".
- Try refreshing this page, using the back/forward buttons and starting a new browser session by opening a new instance of IE
- You can see the effect of a request on the browser cache by selecting the Cache tab. Also, the values of the Cache-Control, Expires, Last-Modified and If-Modified-Since headers can be seen on the Headers tab.
- If an entry is read from the cache and no request is sent to the server, the Result column will show (Cache) and the Size column will show zero.