Full page caching in Rails with Nginx and Redis


Update Oct 4, 2020: I wrote another post on full page caching with memcached and middleware as a simpler alternative that requires fewer dependencies. Check that out too if you prefer using a cluster for your cache instead of a standalone instance, for scalability.
The app I am currently working on is a CMS that allows users to create their own sites. These sites are publicly accessible and do not require authentication or user-customised content, so they are a great candidate for full page caching.
Out of the box, Rails comes with several great caching features that make implementing caching very easy in most cases. For full page caching, however, a gem is required: the functionality was extracted from Rails into a separate gem (actionpack-page_caching) in Rails 4, because cache invalidation with this technique can be tricky. Fragment caching is therefore the recommended caching method for most use cases, especially with the "Russian Doll" technique.
But like I said, pages that look and behave the same for everyone and do not require authentication would benefit more from full page caching, which offers the best performance possible since a webserver like Nginx can serve pages directly, completely bypassing the entire Rails application stack.
My first implementation of full page caching for this app used the default functionality of the gem, which caches pages to disk; in order to share the cache among multiple servers, I was also using NFS. It worked well but, although I tried to implement file operations carefully, I had a couple of issues that I think were caused by a race condition between cache invalidation and site preloading, which are two features in the toolbox of my CMS.
Besides that, there are some known "risks" with a network based file system like NFS. There are other options that might be more reliable, but that would just add more complexity to my setup.
For these reasons, I decided to switch to Redis as the cache store. This way my cache can still be shared by multiple servers and should also be slightly faster, since there isn't the overhead of a filesystem over the network.
Nginx can use Redis for caching with a module, but this module isn't included with the version of Nginx that comes with the default packages for Linux distributions, so you need to compile Nginx from source in order to add it. It requires a few more steps but it's not complicated. You could also use Memcached instead of Redis, since Nginx has built-in support for it, but Memcached doesn't let you delete all the keys matching a pattern in one shot. Redis does.
Installing Nginx from source with the Redis module
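I do all of this in Docker, so here I will just add the relevant commands. The configure-compile-install steps for Nginx with the Redis module are super easy, but you need to install some dependencies first. The list below assumes that a compiler toolchain, wget and the OpenSSL and zlib headers are already present in your image; if they aren't, add build-essential, wget, libssl-dev and zlib1g-dev. On Debian and Ubuntu you can install the dependencies with the following commands: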
apt-get update
apt-get install libpcre3 libpcre3-dev perl libperl-dev libgd3 \
libgd-dev libgeoip1 libgeoip-dev geoip-bin libxml2 libxml2-dev libxslt1.1 libxslt1-dev
Then, run the following simple commands to compile and install Nginx with the Redis module:
wget http://nginx.org/download/nginx-1.18.0.tar.gz
tar xfvz nginx-1.18.0.tar.gz
wget https://people.freebsd.org/~osa/ngx_http_redis-0.3.9.tar.gz
tar xvfz ngx_http_redis-0.3.9.tar.gz
cd nginx-1.18.0
./configure --with-ld-opt='-Wl,-z,relro -Wl,-z,now -fPIC' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf \
--http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock \
--pid-path=/run/nginx.pid --modules-path=/usr/lib/nginx/modules --http-client-body-temp-path=/var/lib/nginx/body \
--http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy \
--http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit \
--with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module \
--with-http_v2_module --with-http_dav_module --with-http_slice_module --with-threads --with-http_addition_module \
--with-http_geoip_module=dynamic --with-http_gunzip_module --with-http_gzip_static_module \
--with-http_image_filter_module=dynamic --with-http_sub_module --with-http_xslt_module=dynamic --with-stream=dynamic \
--with-stream_ssl_module --with-stream_ssl_preread_module --with-mail=dynamic --with-mail_ssl_module \
--add-dynamic-module=../ngx_http_redis-0.3.9
make
make install
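The configuration options should match those of a default installation with a distro's package. Note the bit that adds the Redis module at the end of the configure command: since it is built as a dynamic module, Nginx only loads it if nginx.conf has a load_module directive at the top pointing to the compiled .so file, which with the options above should end up in /usr/lib/nginx/modules. Compiling will take a minute or so.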
We also need a script to start/stop Nginx - I borrowed the following from somewhere:
#!/bin/sh
### BEGIN INIT INFO
# Provides: nginx
# Required-Start: $local_fs $remote_fs $network $syslog $named
# Required-Stop: $local_fs $remote_fs $network $syslog $named
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: starts the nginx web server
# Description: starts nginx using start-stop-daemon
### END INIT INFO
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/nginx
NAME=nginx
DESC=nginx
# Include nginx defaults if available
if [ -r /etc/default/nginx ]; then
. /etc/default/nginx
fi
STOP_SCHEDULE="${STOP_SCHEDULE:-QUIT/5/TERM/5/KILL/5}"
test -x $DAEMON || exit 0
. /lib/init/vars.sh
. /lib/lsb/init-functions
# Try to extract nginx pidfile
PID=$(cat /etc/nginx/nginx.conf | grep -Ev '^\s*#' | awk 'BEGIN { RS="[;{}]" } { if ($1 == "pid") print $2 }' | head -n1)
if [ -z "$PID" ]; then
PID=/run/nginx.pid
fi
if [ -n "$ULIMIT" ]; then
# Set ulimit if it is set in /etc/default/nginx
ulimit $ULIMIT
fi
start_nginx() {
# Start the daemon/service
#
# Returns:
# 0 if daemon has been started
# 1 if daemon was already running
# 2 if daemon could not be started
start-stop-daemon --start --quiet --pidfile $PID --exec $DAEMON --test > /dev/null \
|| return 1
start-stop-daemon --start --quiet --pidfile $PID --exec $DAEMON -- \
$DAEMON_OPTS 2>/dev/null \
|| return 2
}
test_config() {
# Test the nginx configuration
$DAEMON -t $DAEMON_OPTS >/dev/null 2>&1
}
stop_nginx() {
# Stops the daemon/service
#
# Return
# 0 if daemon has been stopped
# 1 if daemon was already stopped
# 2 if daemon could not be stopped
# other if a failure occurred
start-stop-daemon --stop --quiet --retry=$STOP_SCHEDULE --pidfile $PID --name $NAME
RETVAL="$?"
sleep 1
return "$RETVAL"
}
reload_nginx() {
# Function that sends a SIGHUP to the daemon/service
start-stop-daemon --stop --signal HUP --quiet --pidfile $PID --name $NAME
return 0
}
rotate_logs() {
# Rotate log files
start-stop-daemon --stop --signal USR1 --quiet --pidfile $PID --name $NAME
return 0
}
upgrade_nginx() {
# Online upgrade nginx executable
# http://nginx.org/en/docs/control.html
#
# Return
# 0 if nginx has been successfully upgraded
# 1 if nginx is not running
# 2 if the pid files were not created on time
# 3 if the old master could not be killed
if start-stop-daemon --stop --signal USR2 --quiet --pidfile $PID --name $NAME; then
# Wait for both old and new master to write their pid file
cnt=0
while [ ! -s "${PID}.oldbin" ] || [ ! -s "${PID}" ]; do
cnt=$((cnt + 1))
if [ $cnt -gt 10 ]; then
return 2
fi
sleep 1
done
# Everything is ready, gracefully stop the old master
if start-stop-daemon --stop --signal QUIT --quiet --pidfile "${PID}.oldbin" --name $NAME; then
return 0
else
return 3
fi
else
return 1
fi
}
case "$1" in
start)
log_daemon_msg "Starting $DESC" "$NAME"
start_nginx
case "$?" in
0|1) log_end_msg 0 ;;
2) log_end_msg 1 ;;
esac
;;
stop)
log_daemon_msg "Stopping $DESC" "$NAME"
stop_nginx
case "$?" in
0|1) log_end_msg 0 ;;
2) log_end_msg 1 ;;
esac
;;
restart)
log_daemon_msg "Restarting $DESC" "$NAME"
# Check configuration before stopping nginx
if ! test_config; then
log_end_msg 1 # Configuration error
exit $?
fi
stop_nginx
case "$?" in
0|1)
start_nginx
case "$?" in
0) log_end_msg 0 ;;
1) log_end_msg 1 ;; # Old process is still running
*) log_end_msg 1 ;; # Failed to start
esac
;;
*)
# Failed to stop
log_end_msg 1
;;
esac
;;
reload|force-reload)
log_daemon_msg "Reloading $DESC configuration" "$NAME"
# Check configuration before stopping nginx
#
# This is not entirely correct since the on-disk nginx binary
# may differ from the in-memory one, but that's not common.
# We prefer to check the configuration and return an error
# to the administrator.
if ! test_config; then
log_end_msg 1 # Configuration error
exit $?
fi
reload_nginx
log_end_msg $?
;;
configtest|testconfig)
log_daemon_msg "Testing $DESC configuration"
test_config
log_end_msg $?
;;
status)
status_of_proc -p $PID "$DAEMON" "$NAME" && exit 0 || exit $?
;;
upgrade)
log_daemon_msg "Upgrading binary" "$NAME"
upgrade_nginx
log_end_msg $?
;;
rotate)
log_daemon_msg "Re-opening $DESC log files" "$NAME"
rotate_logs
log_end_msg $?
;;
*)
echo "Usage: $NAME {start|stop|restart|reload|force-reload|status|configtest|rotate|upgrade}" >&2
exit 3
;;
esac
Copy the script to /etc/init.d/nginx. Then you need to either change the path of the nginx executable in the script, or just create a symlink with the command below, since installing from source installs the executable in a different location than the one specified in the script:
ln -s /usr/share/nginx/sbin/nginx /usr/sbin/nginx
Patching the page caching gem to use Redis instead of the file system
Nginx is now installed, so we can jump to the Rails part. Like I mentioned earlier, by default the actionpack-page_caching gem caches pages to disk. Rather than recreating the same functionality from scratch just because of Redis, I opted for a simple monkey patch that overrides the delete and write methods to use Redis (check the source code of the gem to see the original implementation). This could become a PR.
So after adding the gem, I added the following to an initializer:
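require "action_controller/caching/pages"
Rails.configuration.to_prepare do
  ActionController::Caching::Pages::PageCache.class_eval do
    private
    # Remove the cached page from Redis instead of deleting a file.
    def delete(path)
      return unless path
      Rails.cache.delete(path)
    end
    # Store the page in Redis. raw: true keeps the value a plain string,
    # so that Nginx can serve it as is. The gzip argument is ignored.
    def write(content, path, gzip)
      return unless path
      Rails.logger.info "REDIS CACHE: #{path}"
      Rails.cache.write(path, content, raw: true)
    end
    # Build the Redis key as "<domain><path>/". The app lives in /app in my
    # setup, so the expanded cache directory looks like "/app/<domain>".
    def cache_path(path, extension = nil)
      domain = normalized_cache_directory.gsub(/\/app\/(.*)/, "\\1")
      path = "#{domain}#{path}"
      path = path + "/" unless path.ends_with?("/")
      path
    end
  end
end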
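To make the key format concrete, here is a minimal standalone sketch of what the patched cache_path produces. The build_cache_key helper is hypothetical (it is not part of the gem), and it assumes, as in my setup, that the app lives in /app so that the expanded cache directory looks like /app/<domain>.

```ruby
# Hypothetical helper mirroring the patched cache_path above.
# Assumes the expanded cache directory has the form "/app/<domain>".
def build_cache_key(normalized_cache_directory, path)
  # Strip the leading "/app/" so that only the domain is left.
  domain = normalized_cache_directory.gsub(/\/app\/(.*)/, "\\1")
  key = "#{domain}#{path}"
  # Keys always end with a slash.
  key += "/" unless key.end_with?("/")
  key
end

puts build_cache_key("/app/example.com", "/about") # example.com/about/
puts build_cache_key("/app/example.com", "/")      # example.com/
```

Because every key starts with the domain, all the pages of one site share a common prefix in Redis.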
Pretty simple. In my case, because user sites can have custom domains, I add the domain to the path that is used as key in Redis. This way, if I want to invalidate the cache for just one domain without affecting the cache for the other domains, I can use a delete-by-pattern feature available with the Redis cache store, e.g.
Rails.cache.delete_matched "#{domain}*"
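Conceptually, delete_matched removes every key that matches a glob-style pattern. The sketch below models that with a plain Hash standing in for Redis and File.fnmatch standing in for the pattern matching; both are assumptions made for illustration, and the helper is hypothetical, since the real cache store talks to Redis instead.

```ruby
# In-memory stand-in for the Redis cache; keys follow the "<domain><path>/" layout.
cache = {
  "example.com/"       => "<html>home</html>",
  "example.com/about/" => "<html>about</html>",
  "other.org/"         => "<html>other</html>"
}

# Hypothetical helper: delete every key matching a glob pattern.
def delete_matched(cache, pattern)
  cache.delete_if { |key, _value| File.fnmatch(pattern, key) }
end

delete_matched(cache, "example.com*")
p cache.keys # ["other.org/"]
```

One thing to keep in mind with glob patterns: a pattern like example.com* would also match keys of a domain such as example.com.au, so including the slash after the domain in the pattern is safer if your domains can share a prefix.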
Configuring the cache store
Of course, we need to ensure that Redis is used as the cache store, since that's not the default in Rails. To do this, open development.rb or production.rb and replace any line that starts with config.cache_store with:
config.cache_store = :redis_cache_store, { url: ... }
You'll need to specify the correct URL for your Redis instance. Note that caching is disabled by default for the development environment, so to test in development you'll need to run
rails dev:cache
which will enable caching. You can find more details on using Redis as cache store here.
Enabling the caching in the controller
Enabling the full page caching is now just a matter of specifying which actions in a controller we want to cache, e.g.
caches_page :index
The above will enable the cache for the index action only, but you can specify multiple actions separated by commas.
In my app, because I am using custom domains, I want to prefix the path used as cache key with the domain, so that I can expire the cache for a single domain as shown earlier. To do this, I have added the following to the controller, before the caches_page call:
self.page_cache_directory = -> { request.hostname }
Making Nginx aware of the Redis cache
Nginx is not yet configured to use the Redis cache, so we need to do that next. In the server block for your app add the following:
set $redis_db "0";
set $redis_key $host$uri;
set $use_redis_cache 1;
if ($request_method != GET) {
  set $use_redis_cache 0;
}
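So we specify the Redis database to use (this must match the db number specified in the connection URL in Rails), and define the cache key so that it is made of the domain and the request path, just as configured in Rails. Note that the patched cache_path shown earlier always appends a trailing slash to the key, so $host$uri only matches for URLs that end with a slash. If you don't use multiple domains, the request path ($uri) is enough.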
Then we make sure we only query the cache if it's a GET request. This way POST and other kinds of requests that change data will bypass the cache and go straight to the app.
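Since $redis_db has to match the database Rails writes to, it can help to double check which database a connection URL points at. Here is a small sketch that only uses Ruby's standard library; the redis_db_from_url helper and the example URLs are hypothetical.

```ruby
require "uri"

# Hypothetical helper: extract the database number from a Redis URL
# such as redis://host:6379/2. Falls back to 0 when no database is given.
def redis_db_from_url(url)
  db = URI.parse(url).path.to_s.delete_prefix("/")
  db.empty? ? 0 : Integer(db)
end

puts redis_db_from_url("redis://localhost:6379/0")      # 0
puts redis_db_from_url("redis://cache.internal:6379/2") # 2
puts redis_db_from_url("redis://localhost:6379")        # 0
```

Whatever number this prints for the URL used in config.cache_store is the value to set for $redis_db.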
Next, we'll use these variables in a location block as follows:
location / {
  default_type "text/html";
  if ($use_redis_cache = 1) {
    redis_pass <redis-host-and-port>;
    error_page 404 = @app;
  }
  try_files index.html @app;
}
As you can see, if the cache is enabled we first check with Redis whether the given key is present, and if that's the case the content of the key is returned as the response. We set a default content type because otherwise Nginx wouldn't know which content type to use for the content fetched from Redis. If the key is not present, resulting in a 404, we pass the request to the app. Similarly, if the cache is disabled because it's not a GET request, we go straight to the app. The @app definition can be something like the following:
location @app {
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header Host $http_host;
  proxy_redirect off;
  proxy_pass <app host or upstream>$request_uri;
}
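Putting the two location blocks together, the flow of a request can be modelled roughly as follows. This is only an illustration in Ruby of the decision Nginx makes, with a Hash standing in for Redis and a lambda standing in for the Rails app; none of these names exist in Nginx or Rails.

```ruby
# Hypothetical model of the Nginx logic: redis is any Hash-like store,
# app is anything that responds to call(method, host, uri).
def handle(method, host, uri, redis:, app:)
  # Only GET requests consult the cache ($use_redis_cache).
  return app.call(method, host, uri) unless method == "GET"

  key = "#{host}#{uri}" # $redis_key is $host$uri
  # Hit: serve the cached page. Miss (the 404 case): fall back to @app.
  redis.fetch(key) { app.call(method, host, uri) }
end

redis = { "example.com/about/" => "<html>cached about</html>" }
app   = ->(method, host, uri) { "app response for #{method} #{host}#{uri}" }

puts handle("GET", "example.com", "/about/", redis: redis, app: app)   # served from the cache
puts handle("GET", "example.com", "/contact/", redis: redis, app: app) # miss, goes to the app
puts handle("POST", "example.com", "/about/", redis: redis, app: app)  # non-GET, goes to the app
```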
Finally, we just need to start the Nginx service:
service nginx start
That's it. GET requests should now be cached in Redis and go to the app only if the relevant keys are not present in Redis, resulting in much better performance when serving those pages.
Conclusions
I have simplified the configuration a lot for this post. The actual configuration for my app is quite a bit more complex because there are various things involved, but this example should get you started. I'm happy with this caching setup because I get very good performance with a cache that can be shared by multiple servers, and I avoid the complications that come with using a network file system instead of Redis. In my case I needed to be able to expire the cache for a specific domain without having to manually delete all the keys for all the pages of that domain; this is easy with the Redis cache store but not possible with Memcached. If you don't need this in your app and only need to delete specific keys each time, then perhaps I'd go with Memcached, since it's supported by Nginx without additional modules.
Like I said, I added only the basic information you need to get started, to keep this post short, but let me know in the comments if you need some more info or if you get stuck. Hope it helps.