Memory-sensitive Caching for CF

After my introduction-to-caching post, Doug Hughes asked whether CFCs could be wrapped in soft references to create a memory-sensitive cache for use in CFML. I answered that it should be possible in theory, but that an implementation would have to handle a few design issues that commonly come up when dealing with soft references.

Having said that, I sat down to write my own implementation to fill the dull hours between CFUNITED sessions…

Here’s softcache.cfc, a soft reference cache that you can drop into your applications. I have not tested it extensively, which is not to say that I haven’t tested it at all! If you find any bugs, leave a comment here and I’ll see what I can do about them. I’ve tried to comment the code as thoroughly as possible, but if any particular point is unclear, leave a comment and I’ll try to explain it further.

Disclaimer: I spend most of my time writing Java code - if the CFML I write could be optimized further, do let me know.

Keep in mind that the only thing controlling the cache size is the garbage collector. Depending on its implementation, it may decide to be aggressive and free up cache memory even when there’s plenty of free memory in the JVM; or it may be nice and let the cache grow until there really is a memory crisis. Either way, the cache will not run amok and eat all the JVM memory: the JVM guarantees that softly reachable objects are cleared before it throws an OutOfMemoryError.

… with some exceptions… well, of course, there had to be exceptions!

My implementation calls reap() to clean up dead keys (keys whose values have been garbage collected) only when a get() call hits a dead key, or when the cache size is checked via getSize(). If, for some reason, nothing ever tries to retrieve a dead key and getSize() is never called, the set of dead keys can grow without bound and eventually cause an out-of-memory condition. This is a far-fetched edge case, but do keep it in mind, just in case.
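To make the reaping behaviour concrete, here is a minimal Java sketch of the same idea. It is not the softcache.cfc source: the class name SoftCacheSketch, the Entry helper and the use of a ReferenceQueue are my assumptions for illustration, and the real component may well track dead keys differently.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; NOT the softcache.cfc source. Not thread-safe.
class SoftCacheSketch<K, V> {

    // A soft reference that remembers its key, so the dead key can be found later.
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> entries = new HashMap<>();
    // The garbage collector enqueues references here after clearing them.
    private final ReferenceQueue<V> cleared = new ReferenceQueue<>();

    public void put(K key, V value) {
        entries.put(key, new Entry<>(key, value, cleared));
    }

    public V get(K key) {
        Entry<K, V> entry = entries.get(key);
        if (entry == null) {
            return null;                  // never cached
        }
        V value = entry.get();
        if (value == null) {              // dead key: the value was collected
            entries.remove(key, entry);
            reap();                       // sweep any other dead keys as well
        }
        return value;
    }

    public int getSize() {
        reap();                           // count only after dropping dead keys
        return entries.size();
    }

    // Drain the queue and drop every key whose value has been collected.
    @SuppressWarnings("unchecked")
    private void reap() {
        Reference<? extends V> ref;
        while ((ref = cleared.poll()) != null) {
            Entry<K, V> dead = (Entry<K, V>) ref;
            // The two-argument remove leaves a newer value under the same key alone.
            entries.remove(dead.key, dead);
        }
    }
}
```

Note that put() never reaps in this sketch, which is exactly why a cache that is only ever written to can accumulate dead keys.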

The more serious case where the cache could grow without bound is when data is pushed into the cache faster than the garbage collector can clean it up. In the Java world this is sometimes referred to as object churn, and there’s not much that can be done about it without adding constraints to the cache which may affect performance.

If you need a really safe cache, wrap all calls to the cache in cflock tags over whatever scope is appropriate. This ensures that put() calls are locked out while the cache is being reaped, so the cache cannot keep growing during a reap.
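For readers who think in Java, the closest analogue to wrapping every call in cflock is to guard every method with the same monitor. The sketch below is only an illustration under that assumption: LockedSoftCacheSketch is a made-up name, and it reaps by scanning the map rather than by whatever softcache.cfc does internally.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; NOT the softcache.cfc source.
class LockedSoftCacheSketch {
    private final Map<String, SoftReference<Object>> entries = new HashMap<>();

    // All three methods synchronize on the same monitor (this), so a put()
    // can never interleave with the sweep inside getSize().
    public synchronized void put(String key, Object value) {
        entries.put(key, new SoftReference<>(value));
    }

    public synchronized Object get(String key) {
        SoftReference<Object> ref = entries.get(key);
        if (ref == null) {
            return null;                 // never cached
        }
        Object value = ref.get();
        if (value == null) {
            entries.remove(key);         // dead key: the value was collected
        }
        return value;
    }

    public synchronized int getSize() {
        // Reap by scanning: drop every entry whose value has been collected.
        entries.values().removeIf(ref -> ref.get() == null);
        return entries.size();
    }
}
```

The price is that every caller waits while a sweep runs, which is the performance constraint mentioned earlier.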

Take it for a spin! Here’s the test code I wrote - you may want to push the loop iterations and string length up or down depending on the box you run this on. loadCache.cfm loads the cache with data, with the size of the data growing as the loop rolls out. checkCache.cfm tells you how many items there are in the cache, and the JVM free and total memory. Keep running checkCache.cfm while loadCache.cfm is executing to see how the cache is behaving vis-a-vis JVM free/total memory.

—loadCache.cfm—

<cfset server.cache = createObject("component", "softcache")>

<cfset bigstring = "xxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxl">

<!--- Each pass caches the current string and then grows it, so later entries are larger --->
<cfloop from="1" to="20000" index="i" step="1">
    <cfset server.cache.put("key#i#", bigstring)>
    <cfset bigstring = bigstring &
    "xxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxlxxxxl">
</cfloop>

<cfoutput>Inserted items, cache size: #server.cache.getSize()#</cfoutput>

—checkCache.cfm—

<!--- getSize() reaps dead keys before counting, so this also tidies the cache --->
<cfset size = server.cache.getSize()>

<cfset runtime = createObject("java", "java.lang.Runtime").getRuntime()>
<cfoutput>JVM free memory: #runtime.freeMemory()#</cfoutput><br>
<cfoutput>JVM total memory: #runtime.totalMemory()#</cfoutput><br>
<cfoutput>Cache size: #size#</cfoutput>

If you push the loop iterations and/or the string size high enough, you should find that the cache size never hits the number of iterations in the loop. You should also see that as the loop rolls out and ever-larger entries are put into the cache, the JVM free memory shrinks and garbage collections become more frequent, leading to a bit of a bungee effect on the JVM free memory: it keeps dropping as the cache is loaded, and then jumps back up as the garbage collector clears the soft references.

In my tests, with a loop size of 20000 strings as above, I ended up with a cache size of 957, which stayed fairly stable once it got there. So there you go, a cache that will resize itself based on available memory - enjoy!