Optimised Docker builds for Rails apps

Since I started using Docker and Kubernetes for my apps, I have been trying to optimise the way my images are built, so as to make the builds as fast as possible while keeping the images as small as possible at the same time. In this post I would like to share some tips on building Docker images, especially for Rails apps. We’ll see how to optimise image layer caching, as well as how to reduce the image size where possible. Speaking of caching, we’ll also see how to leverage an external storage (I use Wasabi, an affordable s3-compatible storage service) to cache the gem bundle. This way, if we make changes to the Gemfile we can’t benefit from the layer caching, but we can use a bundle cached on s3, so we only need to install the new gems instead of installing everything from scratch - which can slow down our builds considerably. In fact, downloading a compressed bundle from s3 and using it to install only the new gems is typically a lot faster than reinstalling the whole bundle. Using the same trick, we will also cache the precompiled assets, so as to speed up that step as well whenever we change static assets. An additional benefit of using an external storage for this caching is that the cache is shared between multiple environments; for example, in my case I build images both on my Mac and in CI/CD with Github Actions.

For the base OS image I have been switching between Alpine and Debian/Ubuntu every now and then, but I think I finally settled on Debian-slim. It is a fact that Alpine makes for faster builds and much smaller images out of the box, however at times I’ve had to fight with incompatibilities or other issues due to musl being the libc of choice. Debian-slim may not have as small a footprint as Alpine, but it’s still quite small as a base image and works with basically everything without hacks and workarounds, so for now I will stick with it, and I suggest you do too unless you are paranoid about image size.

For this post, I will assume some familiarity with Docker and Docker builds, so I am going to skip the basics.

The dev/CI Dockerfile

For my app, I am currently using two Dockerfiles: one for development and test/CI builds, and another for production, which is based on the first and makes use of multi-stage builds. Let’s build the first Dockerfile, which we’ll call Dockerfile.dev, one step at a time.
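
Just to give an idea of the moving parts, these are the files we’ll be referring to throughout the post:

Dockerfile.dev          # image for development and test/CI builds
Dockerfile.prod         # production image, built in multiple stages from the CI image
bin/install-gems        # installs the gem bundle, using s3 as a cache
bin/precompile-assets   # precompiles the assets, using s3 as a cache
bin/run                 # wrapper script to build and push the images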

The very first lines of this Dockerfile specify Debian-slim as the base image, as well as some build args:

FROM debian:buster-slim

ARG RUBY_VERSION=2.6.5-jemalloc
ARG S3_ACCESS_KEY_ID=""
ARG S3_SECRET_ACCESS_KEY=""
ARG S3_REGION=eu-central-1
ARG S3_ENDPOINT=s3.eu-central-1.wasabisys.com
ARG BUILD_PACKAGES="build-essential yarn git-core vim tmux unzip python3-pip python3-setuptools zlib1g-dev libssl-dev"
ARG REQUIRED_PACKAGES="fullstaq-ruby-$RUBY_VERSION fullstaq-rbenv nodejs tzdata imagemagick libpq-dev"

Here we are setting the version of Ruby we want to install, as well as the s3 settings we need to download/upload the cached bundle and assets from/to s3 (these are also build args; do NOT hardcode these secrets in the Dockerfile, or they will be persisted in the Docker image). We are also specifying which packages are required just for building stuff, and which packages are always required - the latter list will be identical in the prod Dockerfile. You can customise both depending on your app and how you would like your dev environment to look. One thing you may be wondering is why I am using Debian-slim instead of the official Ruby image. The reason is that I prefer using a version of Ruby that takes advantage of jemalloc to reduce memory consumption significantly. The easiest way I have found to achieve this is with Fullstaq Ruby, a distribution of Ruby that includes jemalloc and other optimisations that make it lighter and often faster. Fullstaq Ruby provides packages we can install in Debian to get up and running with Ruby, so that’s what we will be doing.
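
Just to make it clear how these secrets are meant to be provided, here is a minimal example of a build command (we’ll see the full build script later; ACCESS_KEY and SECRET_KEY are simply whichever environment variables hold your credentials):

docker build -f Dockerfile.dev \
  --build-arg S3_ACCESS_KEY_ID="$ACCESS_KEY" \
  --build-arg S3_SECRET_ACCESS_KEY="$SECRET_KEY" \
  -t myapp:dev .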

One common best practice when building Docker images is placing the layers that do not change often as close to the top of the Dockerfile as possible, so for me the next few instructions are the following:

EXPOSE 3000

CMD ["bundle", "exec", "puma", "-Cconfig/puma.rb"]

ENV GEM_HOME="/home/rails/bundle"
ENV RAILS_ENV=development \
  RAILS_ROOT=/app \
  SECRET_KEY_BASE=foo \
  RAILS_SERVE_STATIC_FILES=true \
  RAILS_LOG_TO_STDOUT=true \
  BUNDLE_PATH="$GEM_HOME" \
  BUNDLE_APP_CONFIG="$GEM_HOME" \
  BUNDLE_SILENCE_ROOT_WARNING=1 \
  LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH $GEM_HOME/bin:$BUNDLE_PATH/gems/bin:/usr/lib/fullstaq-ruby/versions/$RUBY_VERSION/bin:${HOME}/bin:$PATH

So we expose port 3000, which is the default for the Rails server, and set some environment variables that should be self-explanatory. The next step is optional: it replaces the default Debian repository with a mirror of our choice, since mirrors are less frequently used than the official repo and are therefore often faster:

RUN sed -i 's/http\:\/\/deb.debian.org/http\:\/\/ftp.de.debian.org/g' /etc/apt/sources.list

My servers are located in Germany, so I am using a German mirror. Again, this is optional. Next, we’ll add a couple of additional repositories to install Fullstaq Ruby, as well as yarn and nodejs, which are required for asset precompilation:

RUN apt-get update -qq && DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends wget apt-transport-https ca-certificates curl gnupg2

RUN echo "deb https://apt.fullstaqruby.org debian-10 main" > /etc/apt/sources.list.d/fullstaq-ruby.list \
  && curl -SLfO https://raw.githubusercontent.com/fullstaq-labs/fullstaq-ruby-server-edition/master/fullstaq-ruby.asc \
  && apt-key add fullstaq-ruby.asc

RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
  && echo 'deb http://dl.yarnpkg.com/debian/ stable main' > /etc/apt/sources.list.d/yarn.list

RUN curl -sL https://deb.nodesource.com/setup_13.x | bash -

We can then install all the packages in one go to optimise layer caching:

RUN apt-get update -qq \
  && DEBIAN_FRONTEND=noninteractive apt-get upgrade -yq \
  && DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends ${BUILD_PACKAGES} ${REQUIRED_PACKAGES} \
  && pip3 install awscli \
  && GOOGLE_LINUX_DL=https://dl.google.com/linux \
    && curl -sL "$GOOGLE_LINUX_DL/linux_signing_key.pub" | apt-key add - \
    && curl -sL "$GOOGLE_LINUX_DL/direct/google-chrome-stable_current_amd64.deb" \
      > /tmp/chrome.deb \
    && apt install --no-install-recommends --no-install-suggests -y \
      /tmp/chrome.deb \
    && CHROMIUM_FLAGS='--no-sandbox --disable-dev-shm-usage' \
    && sed -i '${s/$/'" $CHROMIUM_FLAGS"'/}' /opt/google/chrome/google-chrome \
    && BASE_URL=https://chromedriver.storage.googleapis.com \
    && VERSION=$(curl -sL "$BASE_URL/LATEST_RELEASE") \
    && curl -sL "$BASE_URL/$VERSION/chromedriver_linux64.zip" -o /tmp/driver.zip \
    && unzip /tmp/driver.zip \
    && chmod 755 chromedriver \
    && mv chromedriver /usr/local/bin/ \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
  && truncate -s 0 /var/log/*log

As you can see, I am installing the AWS CLI to interact with s3, as well as Google Chrome and chromedriver to support system tests with Capybara; you can skip these if that’s not the case for your app. Note the last three commands in this RUN instruction: we are cleaning up the apt cache and various temporary files in order to reduce the image size a bit.
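
If you keep the Chrome/chromedriver step, a quick sanity check after building the image can be something like the following (assuming the image is tagged user/repo:dev, as we’ll see later):

docker run --rm user/repo:dev google-chrome --version
docker run --rm user/repo:dev chromedriver --version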

In the next step, we set the version of Ruby we want to use with rbenv (which is what Fullstaq Ruby uses) - at the time of this writing, 2.6.5 is the latest version available for Ruby 2.6 - and update both Rubygems and bundler. Then we set the work directory to the app directory /app, and create a couple of directories without which some of the following steps would fail.

RUN rbenv global $RUBY_VERSION && gem update --system && gem install bundler

WORKDIR $RAILS_ROOT

RUN mkdir -p ./tmp/pids && mkdir -p ./log

The next steps are designed to optimise layer caching significantly.

COPY package.json yarn.lock $RAILS_ROOT/
RUN yarn install

COPY bin/install-gems $RAILS_ROOT/bin/
COPY Gemfile Gemfile.lock $RAILS_ROOT/
RUN ./bin/install-gems

We don’t copy the whole Rails app to the image yet, because otherwise any change to any file, however small, would completely invalidate the cache from this point on. Instead, we first copy the package.json and yarn.lock files, because these don’t change often; if they do change, we install the packages with yarn. Same thing with the gem bundle: we copy a script that installs the gems, together with the Gemfile and Gemfile.lock; if these change, then we install the gems. You can see that I am not executing bundle install directly; instead, I am using a script that uses s3 as a cache for the bundle, as explained earlier. Let’s see the contents of that script:

#!/bin/bash
set -ex

bundler_path="/home/rails/bundle"
archive="bundle.tar.gz"
s3_path="s3://mybucket/$archive"

# Download the cached bundle from s3, if one exists, and extract it in place
if AWS_ACCESS_KEY_ID=$S3_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=$S3_SECRET_ACCESS_KEY aws s3 --region=$S3_REGION --endpoint-url=https://$S3_ENDPOINT cp $s3_path . ; then
  tar -xzf $archive -C / && rm -f $archive
fi

# Install only the gems missing from the cached bundle, in parallel
bundle install -j "$(getconf _NPROCESSORS_ONLN)" --retry 3
bundle binstubs bundler --force

# Remove sources and object files left over from compiling native extensions
find $bundler_path/ -name "*.c" -delete
find $bundler_path/ -name "*.o" -delete

# Cache the downloaded .gem files in vendor/cache
bundle pack --quiet

# Archive the updated bundle and upload it to s3 for the next build
tar -zcf $archive $bundler_path

AWS_ACCESS_KEY_ID=$S3_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=$S3_SECRET_ACCESS_KEY aws s3 --region=$S3_REGION --endpoint-url=https://$S3_ENDPOINT cp $archive $s3_path

rm -f $archive

In this script, we first try to download the bundle from s3; if it exists, we extract it to the bundle directory and then run bundle install, which only installs new gems according to any changes made to the Gemfile. We make sure bundler installs gems with as many parallel jobs as there are cores in the system, to speed up the installation. After installing the gems, we do some cleanup, and re-upload the updated archive to s3 so it’s ready for use with the next build.

Note that we have installed the AWS CLI in a previous step, and the S3 settings are defined at the top of the Dockerfile as build args; we’ll see later how and where I set these variables.

Once the bundle is sorted out, it’s the turn of the assets precompilation. We copy the files and directories that are needed for the precompilation step to succeed, so that we precompile the assets only if any of those files have changed.

COPY ./postcss.config.js ./tailwind.config.js ./babel.config.js $RAILS_ROOT/
COPY ./Rakefile $RAILS_ROOT/Rakefile
COPY ./app/models/application_record.rb $RAILS_ROOT/app/models/application_record.rb
COPY ./app/assets $RAILS_ROOT/app/assets
COPY ./bin $RAILS_ROOT/bin
COPY ./lib $RAILS_ROOT/lib
COPY ./app/models/constraints $RAILS_ROOT/app/models/constraints
COPY ./app/models/user.rb $RAILS_ROOT/app/models/
COPY ./app/models/*_uploader.rb $RAILS_ROOT/app/models/
COPY ./app/models/liquid $RAILS_ROOT/app/models/liquid
COPY ./config $RAILS_ROOT/config
COPY ./app/javascript $RAILS_ROOT/app/javascript

RUN ./bin/precompile-assets \
  && yarn cache clean

Of course you may have to change the files and directories according to your app; the lines above are taken from the actual Dockerfile I use for my current app. Note the yarn cache clean command: this alone can clean up 100-200MB of stuff that would otherwise bloat the final image. As you can see, we use another script here, which caches the precompiled assets to s3. This script is pretty similar to the previous one:

#!/bin/bash
set -ex

assets_path="public/packs"
archive="assets.tar.gz"
s3_path="s3://mybucket/$archive"

# Download the previously compiled assets from s3, if available
if AWS_ACCESS_KEY_ID=$S3_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=$S3_SECRET_ACCESS_KEY aws s3 --region=$S3_REGION --endpoint-url=https://$S3_ENDPOINT cp $s3_path . ; then
  tar -xzf $archive && rm -f $archive
fi

bundle exec rake webpacker:compile RAILS_ENV=production NODE_ENV=production

# Archive the compiled assets together with the compilation digest, so that
# webpacker can skip recompilation when nothing has changed
tar -zcf $archive $assets_path tmp/cache/webpacker/last-compilation-digest-production

AWS_ACCESS_KEY_ID=$S3_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=$S3_SECRET_ACCESS_KEY aws s3 --region=$S3_REGION --endpoint-url=https://$S3_ENDPOINT cp $archive $s3_path

rm -f $archive

I am running the webpacker:compile rake task because I am not using the asset pipeline. If you are, you can run assets:precompile instead.
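
For reference, with the asset pipeline that line would become something like the following, and in the rest of the script public/packs would become public/assets (or wherever your compiled assets end up):

bundle exec rake assets:precompile RAILS_ENV=production NODE_ENV=production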

Finally, we ensure that the rest of the app is copied over:

COPY . $RAILS_ROOT
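
Since this last instruction copies everything from the build context, it’s a good idea to also have a .dockerignore file, so that things like logs, temporary files and the Git history don’t end up in the image or invalidate the cache needlessly. A minimal example, to be adapted to your app:

.git
log/*
tmp/*
node_modules
public/packs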

The dev Dockerfile is now ready. To use it, I have another script at bin/run that does a bunch of things, including the following:

IMAGE="user/repo"

POSITIONAL=()
while [[ $# -gt 0 ]]
do
key="$1"

case $key in
  build)
    shift

    source ~/.s3/wasabi

    VERSION=${VERSION:-"v`git rev-parse --short HEAD`"}

    export DOCKER_BUILDKIT=1

    case "$1" in
      ci)
        echo ${VERSION} > current_version.ci
        docker build -t ${IMAGE}:ci --cache-from ${IMAGE}:ci --build-arg BUILDKIT_INLINE_CACHE=1 --network=host -f Dockerfile.dev --build-arg S3_ACCESS_KEY_ID="$ACCESS_KEY" --build-arg S3_SECRET_ACCESS_KEY=$SECRET_KEY .
        shift
      ;;
      prod)
        echo ${VERSION} > current_version.prod
        docker build -t ${IMAGE}:prod --cache-from ${IMAGE}:ci --cache-from ${IMAGE}:prod --build-arg BUILDKIT_INLINE_CACHE=1 --network=host -f Dockerfile.prod --build-arg S3_ACCESS_KEY_ID="$ACCESS_KEY" --build-arg S3_SECRET_ACCESS_KEY=$SECRET_KEY .
        docker tag ${IMAGE}:prod ${IMAGE}:${VERSION}
        shift
      ;;
      *)
        echo ${VERSION} > current_version.dev
        docker build -t ${IMAGE}:dev --build-arg BUILDKIT_INLINE_CACHE=1 --network=host -f Dockerfile.dev --build-arg S3_ACCESS_KEY_ID="$ACCESS_KEY" --build-arg S3_SECRET_ACCESS_KEY=$SECRET_KEY .
        docker tag ${IMAGE}:dev ${IMAGE}:${VERSION}
      ;;
    esac

    echo $VERSION
  ;;

  push)
    ENV=dev

    shift

    if [ -n "$1" ]; then
      ENV="$1"
      shift
    fi

    VERSION=${VERSION:-`cat current_version.${ENV}`}
    docker push ${IMAGE}:${ENV}
    docker push ${IMAGE}:${VERSION}
    echo $VERSION
  ;;
  *)
    POSITIONAL+=("$1")
    shift
  ;;
esac
done
set -- "${POSITIONAL[@]}"

So we have a build command as well as a push command. With build, we first source a file, kept outside of the Git repository, which contains the secrets for s3:

ACCESS_KEY=...
SECRET_KEY=...
REGION=eu-central-1
ENDPOINT=s3.eu-central-1.wasabisys.com

Then we set the VERSION variable to the current Git SHA, if VERSION has not been set already (in CI with Github Actions, it is set by the pipeline), and we build the expected image depending on the parameter passed for the environment: either test/CI, production, or dev. One important thing to note here is that we are using BuildKit, since it’s faster than building images the default way. Also, we are using --cache-from to tell Docker which existing images it can use to speed up the build; this doesn’t change anything when building locally, but in environments where Docker is completely reset between builds (like with Github Actions), we cannot leverage previous builds directly because there are none, so we need to download existing images and use those as cache.
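
As a rough sketch, a CI job can then just log in to the registry and reuse the same script, since BuildKit can pull the images referenced with --cache-from directly from the registry (DOCKER_USERNAME and DOCKER_PASSWORD here are hypothetical secrets configured in the pipeline):

echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
bin/run build ci
bin/run push ci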

With this in place, we can build and push a dev, CI, or prod image with:

bin/run build <environment>
bin/run push <environment>

For dev images, we can just omit the environment. BTW in the run script I am using --network=host; this is because in some environments (like when a Github Actions runner runs in Kubernetes with Docker-in-Docker), there can be occasional networking issues during builds. Long story short, using the host network for builds fixes these issues.

The prod Dockerfile

The production Dockerfile is very simple: it does a multi-stage build starting from the image built for CI. For prod we don’t need any of the dependencies that are only used to build software or run tests, so the final image is much smaller. I’ll just paste the content of my Dockerfile.prod, since it should be self-explanatory by now:

FROM user/repo:ci AS dev

FROM debian:buster-slim

ARG RUBY_VERSION=2.6.5-jemalloc
ARG REQUIRED_PACKAGES="git-core fullstaq-ruby-$RUBY_VERSION fullstaq-rbenv nodejs tzdata imagemagick libpq-dev"

EXPOSE 3000

CMD ["bundle", "exec", "puma", "-Cconfig/puma.rb"]

ENV GEM_HOME="/home/rails/bundle"
ENV RAILS_ENV=production \
  RAILS_ROOT=/app \
  SECRET_KEY_BASE=foo \
  RAILS_SERVE_STATIC_FILES=true \
  RAILS_LOG_TO_STDOUT=true \
  BUNDLE_PATH="$GEM_HOME" \
  BUNDLE_APP_CONFIG="$GEM_HOME" \
  BUNDLE_SILENCE_ROOT_WARNING=1 \
  LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH $GEM_HOME/bin:$BUNDLE_PATH/gems/bin:/usr/lib/fullstaq-ruby/versions/${RUBY_VERSION}/bin:$PATH

RUN apt-get update -qq && DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends wget apt-transport-https ca-certificates curl gnupg2

RUN echo "deb https://apt.fullstaqruby.org debian-10 main" > /etc/apt/sources.list.d/fullstaq-ruby.list \
  && curl -SLfO https://raw.githubusercontent.com/fullstaq-labs/fullstaq-ruby-server-edition/master/fullstaq-ruby.asc \
  && apt-key add fullstaq-ruby.asc

RUN curl -sL https://deb.nodesource.com/setup_13.x | bash -

RUN apt-get update -qq \
  && DEBIAN_FRONTEND=noninteractive apt-get upgrade -yq \
  && DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends ${REQUIRED_PACKAGES} \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
  && truncate -s 0 /var/log/*log

RUN mkdir -p $RAILS_ROOT/bin

RUN rbenv global $RUBY_VERSION && gem update --system && gem install bundler

WORKDIR $RAILS_ROOT

COPY --from=dev $GEM_HOME $GEM_HOME

COPY --from=dev /app/config.ru /app/Rakefile ./
COPY --from=dev /app/public ./public
COPY --from=dev /app/lib ./lib
COPY --from=dev /app/bin ./bin
COPY --from=dev /app/db ./db
COPY --from=dev /app/config ./config
COPY --from=dev /app/Gemfile* ./
COPY --from=dev /app/app ./app

RUN mkdir -p ./tmp/pids && mkdir -p ./log

As you can see, we copy only what’s needed for the app to work, starting from what changes least often.

I think that’s it. This post was requested a few times so here we are. I hope it’s useful and if you have any tips on how to further optimise this, please do let me know!
