
Lies, damned lies and caching

Posted: October 2, 2009 at 6:47 pm

One of the most excellent aspects of HTTP and the underpinning REST architecture is the distinction between idempotent and non-idempotent operations. Idempotence roughly means that an operation yields the same result whether it is performed once or several times. This behaviour forms the basis of caching: if the result is the same, you can work with a copy of the result for all subsequent requests. This excellent property of the web also has its downsides, as the Law of Preservation of Complexity demands that any goodness in technology is paid for with increased complexity somewhere else. In the case of caching, the complexity lies in determining when to cache and when not to cache.

The website that is the topic of this entry is one of the largest banking websites in Europe, and it is based on IIS 5, which doesn’t do a lot out of the box caching-wise. The problem is that if your web server doesn’t supply any information about the validity of the objects it serves, you will find out fairly quickly that many parties to the conversation cache each object quite aggressively. The browser is one of these parties, but so are the many caching proxies used by the target infrastructure or your ISP without you knowing it. Once that has happened, the only way of forcing all intermediate parties to relinquish their hold on the page objects is to change the names of the objects (that is, their filenames).

A better approach is to use cache control to specify the time-to-live (TTL) for each page object. This way the browser and all intermediate parties know how to treat the page objects. When the TTL for a given object has expired, the browser (or intermediate proxy) will ask the target server whether the object is still valid. If so, the web server responds with a 304 Not Modified instead of retransmitting the page object, saving valuable bandwidth.
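
The decision described above can be sketched in a few lines of VBScript. This is only an illustration of the freshness and revalidation logic, not code from the website: the function names CacheDecision and OriginResponse and the sample values are made up for the example, and a real cache takes more headers into account than just max-age.

```vbscript
Option Explicit

' Hypothetical helper: what a browser or proxy does with a stored copy.
' AgeInSeconds is how long ago the copy was fetched,
' MaxAge is the max-age value from the Cache-Control header.
Function CacheDecision(AgeInSeconds, MaxAge)
	If AgeInSeconds < MaxAge Then
		CacheDecision = "fresh: serve the stored copy without contacting the server"
	Else
		CacheDecision = "stale: revalidate with a conditional GET (If-Modified-Since)"
	End If
End Function

' Hypothetical helper: what the web server answers to a conditional GET.
' LastModified is the timestamp of the object on the server,
' IfModifiedSince is the timestamp of the copy held by the cache.
Function OriginResponse(LastModified, IfModifiedSince)
	If LastModified <= IfModifiedSince Then
		OriginResponse = "304 Not Modified"
	Else
		OriginResponse = "200 OK"
	End If
End Function

' A copy that is 10 minutes old with a TTL of 2 hours is still fresh
WScript.Echo CacheDecision(600, 7200)
' A copy that is 2.5 hours old with the same TTL has to be revalidated
WScript.Echo CacheDecision(9000, 7200)
' The object was not changed after the cache fetched it, so no retransmission
WScript.Echo OriginResponse(DateSerial(2009, 10, 1), DateSerial(2009, 10, 2))
```

The 304 response carries headers only, which is where the bandwidth saving comes from.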

However, as indicated, the IIS 5 web server doesn’t have a lot of easily accessible cache settings. There are some third-party modules that you can use for this, but we’re talking about a website with upwards of 2 million page views per day, and you can’t just slap in any ISAPI filter you’d like and trust that everything will turn out right (not to mention all the test work involved when introducing an additional component into your infrastructure). One of the few components that I did use on the website (totally unrelated to the subject) was ISAPI Rewrite from Helicon Tech, and it never failed us under the most extreme loads. I’m usually loath to promote vendors, but these guys offer amazing value for money and gave us first-class support even though we paid a bargain price for their software.

OK, enough with the product talk; back to the subject of applying a caching policy to IIS. After reading up on the subject I decided to create a VBScript script for injecting the proper caching instructions into the IIS Metabase. This way the support team could apply the same policy to all machines without the risk of manual error. The script is based on various scrapings found around the web:

Option Explicit
Call Main

Sub Main
	Dim RootDir, CSSDir, JSDir, CSSFile, JSFile, ServerName, SiteNumber, oArgs, iArgNum
	' ServerName is default at localhost but can be changed
	' via the -s "SERVERNAME" switch
	ServerName = "localhost"
	' SiteNumber is set at 1, the default website in IIS
	' this can be changed via the -i "SITENUMBER" switch
	SiteNumber = "1"
	Set oArgs = WScript.Arguments
	iArgNum = 0
	While iArgNum < oArgs.Count
		' The argument is lowercased first, so the Case labels must be lowercase too
		Select Case LCase(oArgs(iArgNum))
			Case "-s","--servername":
				iArgNum = iArgNum + 1
				ServerName = oArgs(iArgNum)
			Case "-i","--sitenumber":
				iArgNum = iArgNum + 1
				SiteNumber = oArgs(iArgNum)
			Case "-?","--help":
				Call DisplayUsage
				WScript.Quit(1)
			Case Else:
				WScript.Echo "Unknown argument " & oArgs(iArgNum)
				Call DisplayUsage
				WScript.Quit(1)
		End Select
		iArgNum = iArgNum + 1
	Wend
	WScript.Echo "ServerName: " & ServerName
	WScript.Echo "SiteNumber: " & SiteNumber
	WScript.Echo "Default caching policy for the entire site is caching for 2 hours (7200 seconds)"
	Set RootDir = GetRootDir(ServerName, SiteNumber)
	Call SetVDirCacheability(RootDir, 7200)
	WScript.Echo "Caching policy for /images is caching for 1 month (2592000 seconds)"
	Call GetVDirAndSetCacheability("images", ServerName, SiteNumber, 2592000)
	WScript.Echo "Caching policy for /css is caching for 1 week (604800 seconds)"
	Set CSSDir = GetVDirAndSetCacheability("css", ServerName, SiteNumber, 604800)
	WScript.Echo "Caching policy for /javascript is caching for 1 week (604800 seconds)"
	Set JSDir = GetVDirAndSetCacheability("javascript", ServerName, SiteNumber, 604800)
	WScript.Echo "/css/super.css may only be cached for 1 hour (3600 seconds)! Exception policy will be applied ..."
	On Error Resume Next
	Set CSSFile = GetObject("IIS://" & ServerName & "/w3svc/" & SiteNumber & "/root/css/super.css")
	If (Err.Number <> 0) Then
		WScript.Echo "/css/super.css file does not exist yet in metabase, status code " & Err.Number
		Err.Clear
		Set CSSFile = CSSDir.Create("IIsWebFile", "super.css")
		If (Err.Number <> 0) Then
			Wscript.Echo "/css/super.css object in metabase could not be created, error code " & Err.Number
			Wscript.Quit
		Else
			WScript.Echo "/css/super.css object created in metabase"
		End If
		CSSFile.SetInfo
	Else
		WScript.Echo "/css/super.css object already exists in metabase"
	End If
	Call SetCacheability(CSSFile, 3600, "must-revalidate")
	WScript.Echo "/javascript/include.js may only be cached for 1 hour (3600 seconds)! Exception policy will be applied ..."
	On Error Resume Next
	Set JSFile = GetObject("IIS://" & ServerName & "/w3svc/" & SiteNumber & "/root/javascript/include.js")
	If (Err.Number <> 0) Then
		WScript.Echo "/javascript/include.js file does not exist yet in metabase, status code " & Err.Number
		Err.Clear
		Set JSFile = JSDir.Create("IIsWebFile", "include.js")
		If (Err.Number <> 0) Then
			Wscript.Echo "/javascript/include.js object in metabase could not be created, error code " & Err.Number
			Wscript.Quit
		Else
			WScript.Echo "/javascript/include.js object created in metabase"
		End If
		JSFile.SetInfo
	Else
		WScript.Echo "/javascript/include.js object already exists in metabase"
	End If
	Call SetCacheability(JSFile, 3600, "must-revalidate")
	WScript.Echo ""
	WScript.Echo ""
	Call SetHTMLCharSet(ServerName, SiteNumber)
End Sub

Function GetRootDir(ServerName, SiteNumber)
	Dim RootDir
	On Error Resume Next
	Set RootDir = GetObject("IIS://" & ServerName & "/w3svc/" & SiteNumber & "/root")
	If (Err.Number <> 0) Then
		WScript.Echo "Root dir does not exist for site index " & SiteNumber & ", error code " & Err.Number
		WScript.Quit
	Else
		WScript.Echo "Root dir found for site index " & SiteNumber
	End If
	Set GetRootDir = RootDir
End Function

Function SetVDirCacheability(IISObject, CacheTimeInSeconds)
	Call SetCacheability(IISObject, CacheTimeInSeconds, "must-revalidate")
End Function

Function SetCacheability(IISObject, CacheTimeInSeconds, CacheDirective)
	On Error Resume Next
	IISObject.CacheControlCustom = "max-age=" & CacheTimeInSeconds & "," & CacheDirective
	If (Err.Number <> 0) Then
		WScript.Echo "CacheControlCustom property set failed, error code: " & Err.Number
		WScript.Quit
	Else
		WScript.Echo "CacheControlCustom property set: max-age=" & CacheTimeInSeconds & "," & CacheDirective
	End If
	IISObject.SetInfo
End Function

Function GetVDirEvenIfNotExists(DirName, ServerName, SiteNumber)
	Dim Dir
	Dim RootDir
	On Error Resume Next
	Set Dir = GetObject("IIS://" & ServerName & "/w3svc/" & SiteNumber & "/root/" & DirName)
	If (Err.Number <> 0) Then
		WScript.Echo "VDir " & DirName & " does not exist yet"
		' Reset the error state so the checks below only see errors from Create
		Err.Clear
		Set RootDir = GetRootDir(ServerName, SiteNumber)
		Set Dir = RootDir.Create("IIsWebVirtualDir", DirName)
		If (Err.Number <> 0) Then
			WScript.Echo "VDir " & DirName & " could not be created, error code " & Err.Number
			WScript.Quit
		Else
			WScript.Echo "VDir " & DirName & " created"
		End If
		Dir.SetInfo
	Else
		WScript.Echo "VDir " & DirName & " already exists"
	End If
	Set GetVDirEvenIfNotExists = Dir
End Function

Function GetVDirAndSetCacheability(DirName, ServerName, SiteNumber, CacheTimeInSeconds)
	Dim Dir
	Set Dir = GetVDirEvenIfNotExists(DirName, ServerName, SiteNumber)
	Call SetVDirCacheability(Dir, CacheTimeInSeconds)
	Set GetVDirAndSetCacheability = Dir
End Function

Sub SetHTMLCharSet(ServerName, SiteNumber)
	Dim strExtension, strMimeType
	strExtension = ".htm"
	strMimeType = "text/html; charset=utf-8"
	AddTypeToIIS ServerName, strExtension, strMimeType
	strExtension = ".html"
	strMimeType = "text/html; charset=utf-8"
	AddTypeToIIS ServerName, strExtension, strMimeType
End Sub

Sub AddTypeToIIS(ServerName, varMimeExt, varMimeTyp)
	Dim boolFound, intCount, intMimeMap, objMimeMap, varMimeMap, i, MMItem, aMimeMapNew()
	Const ADS_PROPERTY_UPDATE = 2
	' Bind to the server-wide MIME map and read the current mappings
	Set objMimeMap = GetObject("IIS://" & ServerName & "/MimeMap")
	varMimeMap = objMimeMap.GetEx("MimeMap")
	' Copy every mapping except the one for varMimeExt into a new array
	i = 0
	For Each MMItem in varMimeMap
		If MMItem.Extension <> varMimeExt Then
			Redim Preserve aMimeMapNew(i)
			Set aMimeMapNew(i) = CreateObject("MimeMap")
			aMimeMapNew(i).Extension = MMItem.Extension
			aMimeMapNew(i).MimeType = MMItem.MimeType
			i = i + 1
		Else
			WScript.Echo "Mime type " & MMItem.Extension & " excluded"
		End If
	Next
	' i is the number of copied mappings and therefore the index of the next free slot
	' (using i instead of UBound also works when nothing was copied)
	intMimeMap = i
	' Append the new mapping for varMimeExt
	ReDim Preserve aMimeMapNew(intMimeMap)
	Set aMimeMapNew(intMimeMap) = CreateObject("MimeMap")
	aMimeMapNew(intMimeMap).Extension = varMimeExt
	aMimeMapNew(intMimeMap).MimeType = varMimeTyp
	' Write the complete map back to the metabase in one go
	objMimeMap.PutEx ADS_PROPERTY_UPDATE, "MimeMap", aMimeMapNew
	objMimeMap.SetInfo
End Sub

Sub DisplayUsage
	WScript.Echo "Usage: iis_caching_policy.vbs [-s|--ServerName ""SERVERNAME""] [-i|--SiteNumber ""SITENUMBER""] [-?|--Help]"
	WScript.Echo ""
	WScript.Echo "SERVERNAME   Optional - The name of the server, default is localhost"
	WScript.Echo "SITENUMBER   Optional - The site number of the affected site, default is 1"
	WScript.Echo ""
	WScript.Echo "Example: iis_caching_policy.vbs -s tracker -i 3"
	WScript.Quit (1)
End Sub
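
After running the script it is worth checking what the server actually sends. The snippet below is a rough sketch of such a check and was not part of the original tooling: it assumes MSXML is installed on the machine running it, and the host name and paths are placeholders based on the script's defaults.

```vbscript
Option Explicit

Dim Http, Paths, Path, Url
' Assumed paths: the two exception files the policy script configures
Paths = Array("/css/super.css", "/javascript/include.js")
Set Http = CreateObject("MSXML2.ServerXMLHTTP")
For Each Path In Paths
	' "localhost" is the policy script's default server name
	Url = "http://localhost" & Path
	' HEAD returns the response headers without the body
	Http.open "HEAD", Url, False
	Http.send
	WScript.Echo Url & " -> " & Http.status & " " & Http.statusText
	WScript.Echo "  Cache-Control: " & Http.getResponseHeader("Cache-Control")
Next
```

If the policy was applied, the Cache-Control value for both files should be the max-age=3600,must-revalidate string that SetCacheability writes to the metabase.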

If you look closely at the script (not too closely, it’s not a major work of art) you can see there’s an exception for two files: super.css and include.js. The reason for this is that all pages are published as static pages from a Content Management System. Static pages deliver the optimum trade-off between speed and processing power. The problem with static pages, idempotent behaviour and caching, however, is that if you do your work very well, hardly any updates will reach the customer unless you change filenames. That is rather unwieldy for a large website, as it would require republishing the entire website every time you want to deliver new JavaScript or CSS files.

The trick I applied was to have a “super” CSS file that includes all other CSS files. The super CSS file has a very short TTL (1 hour), as the VBScript shows, so you can update the included CSS files by changing their filenames (and the references to them in the super CSS file). Clients notice the updated CSS files once their copy of the super CSS file has expired, at most an hour later, without you having to republish all web pages. The same trick applies to the JavaScript files.
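
To make the trick concrete, here is a minimal sketch of how a super CSS file could be generated. It assumes the including is done with CSS @import rules, and the versioned filenames are invented for the example. On the real site the file came out of the CMS publication, which is not shown here.

```vbscript
Option Explicit

Dim Sheets, Sheet, Fso, Out
' Hypothetical versioned stylesheets; a new version gets a new filename
Sheets = Array("layout-v12.css", "forms-v7.css", "print-v3.css")
Set Fso = CreateObject("Scripting.FileSystemObject")
' Second argument True: overwrite an existing super.css
Set Out = Fso.CreateTextFile("super.css", True)
For Each Sheet In Sheets
	' Each line becomes e.g. @import url("layout-v12.css");
	Out.WriteLine "@import url(""" & Sheet & """);"
Next
Out.Close
WScript.Echo "super.css written with " & (UBound(Sheets) + 1) & " imports"
```

The pages themselves only ever reference super.css, so they never have to change. The versioned files can keep their long TTL because a changed file always arrives under a new name.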

Oh, by the way, this stuff is a breeze to set up in Apache, but sometimes you have to work with the stuff at hand.
