Improving NGINX LUA cache purges

A few months ago I wrote an article on how to clear single cache items from NGINX with LUA, with a simple O(1) complexity. While that approach works for simple, low-impact cache invalidation, sometimes you need to delete large parts of the cache with a wildcard pattern.

The NGINX cache store

NGINX creates individual cache files based on the definition of the proxy_cache_path setting. The cache definition specifies two important things - the path where the cached items will be stored, and the levels.

Cache data are stored in files. The file name in a cache is a result of applying the MD5 function to the cache key. The levels parameter defines hierarchy levels of a cache: from 1 to 3, each level accepts values 1 or 2.

Based on this information, we know that an NGINX cache item can have at most 3 levels, and each level may have at most 256 folders (from hex 00 - decimal zero, to hex FF - decimal 255). The reason for the levels parameter is speed or stability - generally accessing and storing a file is an O(1) operation, regardless of how many files you store in a folder. That being said, the way different filesystems store files in folders is a factor, which can be mitigated by this parameter. For example:

The Reiser File System is the default file system in SUSE Linux distributions. Reiser FS was designed to remove the scalability and performance limitations that exist in EXT2 and EXT3 file systems. It scales and performs extremely well on Linux, outscaling EXT3 with htrees. In addition, Reiser was designed to very efficiently use disk space. As a result, it is the best file system on Linux where there are a great number of small files in the file system. As collaboration (email) and many web serving applications have lots of small files, Reiser is best suited for these types of workloads.

Source: File System Primer, File System Comparison.

So if you’re using ReiserFS, the levels parameter could be completely redundant. The only problem is that the filesystem shipped by default by most cloud providers and Linux installations (Ubuntu, Debian) is usually ext4, which suffers from many of the same limitations. Setting your levels parameter according to the number of files you intend to store is usually preferable to deploying a custom filesystem just for the NGINX cache. That being said, there are also memory-backed options such as tmpfs, shm or a ramdisk - they are extraordinarily fast because they keep all their data in RAM. As such, they are ideal for high-traffic, small caches, i.e. something that gets a million hits but may be stored for just a minute.
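
To make the levels layout concrete, here is a minimal sketch of how a cache key maps to a file path. It assumes levels=1:2 and the md5sum tool from coreutils; the function names cache_key_hash and cache_file_path are made up for illustration and are not part of NGINX. With levels=1:2, the first directory is the last character of the MD5 hash and the second directory is the two characters before it.

```shell
# Hypothetical helpers; the names are illustrative and not part of NGINX.

# Print the MD5 hex digest of a cache key (assumes coreutils md5sum).
cache_key_hash() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Print the cache file path for a 32-character MD5 hex digest,
# assuming proxy_cache_path ... levels=1:2.
# Level 1 is the last hex character, level 2 the two characters before it.
cache_file_path() {
    local cache_path hash level1 level2
    cache_path=${1%/}
    hash=$2
    level1=$(printf '%s' "$hash" | cut -c32)
    level2=$(printf '%s' "$hash" | cut -c30-31)
    printf '%s/%s/%s/%s\n' "$cache_path" "$level1" "$level2" "$hash"
}

# Usage, with a key built from the default proxy_cache_key:
# cache_file_path /tmp/cache1/ "$(cache_key_hash 'http://dev/upload/upload.txt')"
```

For the hash b7f54b2df7773722d382f4809d65029c and a cache path of /data/nginx/cache, this prints /data/nginx/cache/c/29/b7f54b2df7773722d382f4809d65029c.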

Traversing the NGINX cache

Individual cache files contain a “KEY:” value on the first few lines, which needs to be read in order to perform matching against a wildcard pattern. Linux, in my case, provides several tools that help with finding and matching these files, so it doesn’t all need to be done in LUA code. The tools are:

  1. find - the file finder (does directory traversal and lists files),
  2. xargs - the tool to pass the output of one command as arguments to another command,
  3. grep - a very old tool for matching text and file contents.

With these tools we can build the complete list of files which must be deleted given a wildcard (regular expression) pattern.
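
To check what a single cache file holds, a small helper can print its key. This is only a sketch: cache_key_of is a hypothetical name, and it assumes GNU grep and that the key sits on its own line starting with “KEY: ”, as described above.

```shell
# Hypothetical helper: print the cache key stored in one NGINX cache file.
cache_key_of() {
    # -a treats the binary cache file as text, -m 1 stops at the first KEY line;
    # LC_ALL=C avoids locale-dependent handling of the binary header.
    LC_ALL=C grep -a -m 1 '^KEY: ' "$1" | sed 's/^KEY: //'
}

# Usage (the file path is only an example):
# cache_key_of /tmp/cache1/c/29/b7f54b2df7773722d382f4809d65029c
```

Running it against a handful of files is a quick way to confirm what your proxy_cache_key actually produces before writing purge patterns against it.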

Using find

Using find is the simplest of all. Given a cache_path, we can instruct it to find all files under this directory. As nginx doesn’t store any additional information in the cache path, it’s safe to assume that only cache files will be listed.

find [cache_path] -type f

Using grep

When you have a cache filename, you can find out if it matches your purge_pattern. There are, however, a few options worth adding to the grep command to make it better suited to our use case:

The -E option:

Interpret PATTERN as an extended regular expression

The -m 1 option:

Stop reading a file after NUM matching lines.

We use the -m option to avoid scanning the file further for additional occurrences of ‘^KEY…’. This may be redundant with the following option:

The -l option:

Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.

So in general, if we want to know which files in the cache match our PURGE pattern, we would issue grep like this:

grep -El -m 1 '^KEY: [purge_upstream][purge_pattern]' [list_of_files]

Putting it all together with xargs

So, with find we list all the files in the cache, and with grep we can see which of these files match our pattern. There are, however, certain limitations on Linux systems with regard to argument lists: there may be too many files to pass to grep directly, and you might end up with an “Argument list too long” error. Thankfully, it’s easily solved with an xargs switch that limits how many files are passed to grep in a single run.

The -n max-args, --max-args=max-args option

Use at most max-args arguments per command line.

And of course, we have to consider that no files may be found, and we can avoid the run of grep in that case:

find [cache_path] -type f | xargs --no-run-if-empty -n1000 grep -El -m 1 '^KEY: [purge_upstream][purge_pattern]'
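
The pipeline above can be wrapped in a small function, which makes it easier to try out against a test directory. This is a sketch under a few assumptions: list_purge_matches is a hypothetical name, GNU find, xargs and grep are available, and I pass file names NUL-separated (-print0 and -0), which is an extra precaution rather than something the original command does.

```shell
# Hypothetical wrapper around the find | xargs | grep pipeline.
# Arguments: cache path, upstream prefix, purge pattern (extended regex).
list_purge_matches() {
    local cache_path="$1" upstream="$2" pattern="$3"
    # -print0 / -0 pass file names NUL-separated; -r is the short form of
    # --no-run-if-empty. grep exits 1 for a batch without matches, which
    # xargs reports as 123, so the exit status is deliberately ignored.
    find "$cache_path" -type f -print0 |
        LC_ALL=C xargs -0 -r -n 1000 grep -El -m 1 -- "^KEY: ${upstream}${pattern}" || true
}

# Usage (values are examples only):
# list_purge_matches /tmp/cache1/ 'http://dev' '/upload/.*txt'
```

The function prints one matching cache file per line, and prints nothing when no file matches.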

And voila, the final solution to the problem of a wildcard purge. The only thing left to do is to pass the required configuration to our updated LUA code (see “Configuring NGINX” below).

Additional improvements

As we hand off the list of matching files to LUA to delete without additional system calls, we can output an additional HTTP header to count the number of files that have been deleted. For example, a small test script:

#!/bin/bash
curl -s -i http://peer.lan/upload/upload.txt > /dev/null
curl -s -i http://peer.lan/upload/upload2.json > /dev/null

OK="^(OK|HTTP|X-)"

# URLs are quoted so the shell does not try to expand the wildcards itself
curl -s -X PURGE -i 'http://peer.lan/upload/.*txt' | grep -E "$OK"
curl -s -X PURGE -i 'http://peer.lan/upload/upload2.*' | grep -E "$OK"
curl -s -X PURGE -i 'http://peer.lan/upload/.*' | grep -E "$OK"

curl -s -i http://peer.lan/upload/upload.txt > /dev/null
curl -s -i http://peer.lan/upload/upload2.json > /dev/null

curl -s -X PURGE -i 'http://peer.lan/upload/.*' | grep -E "$OK"

I request two different files, and then PURGE them individually, each with its own wildcard. Each purge returns the expected X-Purged-Count header value. Recreating the cached items after the purge and then purging them with a wildcard that matches them both also produces the expected output.

# purge ".*txt"
HTTP/1.1 200 OK
X-Purged-Count: 1
OK

# purge "upload2.*"
HTTP/1.1 200 OK
X-Purged-Count: 1
OK

# purge ".*" (all) when cache is empty
HTTP/1.1 200 OK
X-Purged-Count: 0
OK

# purge ".*" (all) after recreating cached items
HTTP/1.1 200 OK
X-Purged-Count: 2
OK
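
The number in X-Purged-Count is simply a tally of the files that were removed. In my setup that happens in LUA, but the same step can be sketched in shell for testing outside NGINX. The name purge_and_count is hypothetical, and counting only successful removals is my assumption about how the count should behave.

```shell
# Hypothetical helper: delete every file named on stdin (one path per line)
# and print how many were actually removed.
purge_and_count() {
    local count=0 file
    while IFS= read -r file; do
        # Count a file only when rm succeeds; missing files are skipped.
        if rm -- "$file" 2>/dev/null; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}

# Usage: feed it a list of matching cache files, one path per line, e.g.
# grep -El -m 1 '^KEY: http://dev/upload/.*' /tmp/cache1/*/*/* | purge_and_count
```

With an empty list on stdin it prints 0, which corresponds to the “cache is empty” case above.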

Configuring NGINX

All that’s left to do is to add our purge-multi.lua script into the location block where you would allow purges.

if ($request_method = PURGE) {
	set $lua_purge_path "/tmp/cache1/";
	set $lua_purge_upstream "http://dev";
	content_by_lua_file $site_root/lua/purge-multi.lua;
}

Depending on your proxy_cache_key setting in NGINX, you could set the $lua_purge_upstream value to an empty string (""). By default the proxy_cache_key is set to $scheme$proxy_host$request_uri, of which $scheme$proxy_host define the “upstream”.
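
To show how the upstream value and the request URI combine into the grep pattern, here is a small sketch. The function name build_purge_regex is hypothetical, and escaping the upstream prefix is my own assumption rather than something purge-multi.lua is known to do: without it, a "." in a host name would match any character.

```shell
# Hypothetical helper: build the grep pattern for a purge request.
# The upstream prefix is escaped so that "." in a host name matches literally;
# the purge pattern itself is left as an extended regular expression.
build_purge_regex() {
    local upstream="$1" pattern="$2" escaped
    escaped=$(printf '%s' "$upstream" | sed 's/[][\.*^$+?(){}|]/\\&/g')
    printf '^KEY: %s%s\n' "$escaped" "$pattern"
}

# Usage:
# build_purge_regex 'http://dev.lan' '/upload/.*txt'
```

For the upstream http://dev.lan and the pattern /upload/.*txt this prints ^KEY: http://dev\.lan/upload/.*txt, and with an empty upstream it prints just ^KEY: followed by the pattern.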

Thanks

The improvement to the LUA plugin has been sponsored by Carlos Mirando Molina, based on the discussion in the reddit thread of the original post. A big thanks to him for supporting open source (and developers/ops people like myself).
