
Highly available, external load balancer for Kubernetes in Hetzner Cloud using haproxy and keepalived


Update: Hetzner Cloud now offers load balancers, so this is no longer required. Check their website for more information.


I am working on a Rails app that allows users to add custom domains, and at the same time the app has some realtime features implemented with WebSockets. I’m using the Nginx ingress controller in Kubernetes, as it’s the default ingress controller and it’s well supported and documented. Unfortunately, Nginx drops WebSocket connections whenever it has to reload its configuration. When a user of my app adds a custom domain, a new ingress resource is created, triggering a config reload that disrupts the WebSocket connections. There are other ingress controllers, like haproxy and Traefik, that seem to handle reconfiguration more dynamically than Nginx, but I prefer using Nginx.

One way I figured I could prevent Nginx’s reconfiguration from affecting WebSocket connections is to have separate deployments of the ingress controller: one for the normal web traffic and one for the WebSocket connections. This way, when the Nginx controller for the normal http traffic has to reload its configuration, WebSocket connections are not interrupted. To have multiple deployments of the Nginx controller in the same Kubernetes cluster, the controller has to be installed with either a NodePort service or a LoadBalancer service. Unfortunately my provider, Hetzner Cloud (referral link, we both receive credits), while a great service overall at competitive prices, doesn’t offer a load balancer service yet, so I cannot provision load balancers from within Kubernetes like I would be able to do with the bigger cloud providers.

Because of this, I decided to set up a highly available load balancer external to Kubernetes that proxies all the traffic to the two ingress controllers. I did this by installing the two ingress controllers with services of type NodePort, and setting up two nodes with haproxy as the proxy and keepalived with floating IPs, configured in such a way that there is always one active load balancer. This way, if one load balancer node goes down, the other becomes active within 1-2 seconds, with minimal to no downtime for the app. In this post, I am going to show how I set this up for other customers of Hetzner Cloud who also use Kubernetes. Please note that if you only need one ingress controller, this is not really needed: you could just use a single ingress controller configured to use the host ports directly.

Provisioning

The first thing you need to do is create two servers in Hetzner Cloud that will serve as the two load balancers. It’s important that you name these servers lb1 and lb2 if you are following along with my configuration, to make the scripts etc. easier. You can use the cheapest servers, since the load will be pretty light most of the time unless you have a lot of traffic; I suggest servers with Ceph storage instead of NVMe because over the span of several months I found that the performance, while lower, is somewhat more stable - but that’s up to you of course.

You will also need to create one or more floating IPs, depending on how many ingress controllers you want to load balance with this setup. In my case I have two floating IPs: one for the ingress that handles normal http traffic, and one for the ingress that handles WebSocket connections. The names of the floating IPs are important and must match those specified in a script we’ll see later - in my case I have named them http and ws. keepalived will ensure that these floating IPs are always assigned to one load balancer at any time. You’ll need to configure the DNS settings for your apps to use these floating IPs instead of the IPs of the cluster nodes.
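
For reference, once the hcloud CLI (installed later in this post) is configured with an API token for the project, the servers and floating IPs can also be created from the command line. This is just a sketch: the server type, image and location are placeholders, and the --name flag for floating IPs is an assumption about your CLI version - if it’s not available, simply name the IPs http and ws from the console instead.

hcloud server create --name lb1 --type cx11 --image ubuntu-18.04 --location nbg1
hcloud server create --name lb2 --type cx11 --image ubuntu-18.04 --location nbg1

# the names must match those used by the keepalived notify script later on;
# the --name flag here is an assumption about the CLI version
hcloud floating-ip create --type ipv4 --home-location nbg1 --name http
hcloud floating-ip create --type ipv4 --home-location nbg1 --name ws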

In order for the floating IPs to work, both load balancers need to have the main network interface eth0 configured with those IPs. On a Debian-based system, you need to create a config file as follows (all the steps from now on must be executed on each load balancer):

cat > /etc/network/interfaces.d/60-my-floating-ip.cfg <<EOF
auto eth0:1
iface eth0:1 inet static
    address <floating IP 1>
    netmask 32

auto eth0:2
iface eth0:2 inet static
    address <floating IP 2>
    netmask 32
EOF

Then you need to restart the networking service to apply this configuration:

sudo service networking restart
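
You can quickly verify that the aliases are up by listing the addresses on eth0 - both floating IPs should appear in the output:

ip addr show eth0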

If you use a CentOS/RedHat system, take a look at this page.

Installing keepalived

We’ll install keepalived from source because the version bundled with Ubuntu is old. First you need to install some dependencies so that you can compile the software:

apt update
apt-get install build-essential libssl-dev

Then you can compile and install:

cd ~
wget http://www.keepalived.org/software/keepalived-2.0.20.tar.gz
tar xzvf keepalived*
cd keepalived-2.0.20

./configure

make
sudo make install
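
With the default prefix, this installs the binary as /usr/local/sbin/keepalived, which is the path used by the systemd unit below. You can confirm the build worked with:

/usr/local/sbin/keepalived --version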

Next, we need to create a systemd service for keepalived:

cat > /etc/systemd/system/keepalived.service <<EOF
#
# keepalived control files for systemd
#
# Incorporates fixes from RedHat bug #769726.
[Unit]
Description=LVS and VRRP High Availability monitor
After=network.target
ConditionFileNotEmpty=/etc/keepalived/keepalived.conf
[Service]
Type=simple
# Ubuntu/Debian convention:
EnvironmentFile=-/etc/default/keepalived
ExecStart=/usr/local/sbin/keepalived --dont-fork
ExecReload=/bin/kill -s HUP $MAINPID
# keepalived needs to be in charge of killing its own children.
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
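
Since this is a brand new unit file, tell systemd to pick it up:

sudo systemctl daemon-reload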

and enable it:

sudo systemctl enable keepalived

Finally, we need a configuration file that will differ slightly between the primary load balancer (MASTER) and the secondary one (BACKUP). On the primary LB:

cat > /etc/keepalived/keepalived.conf <<EOF
global_defs {
  script_user root
  enable_script_security
}
vrrp_script chk_haproxy {
  script "/usr/bin/pgrep haproxy"
  interval 2
}
vrrp_instance VI_1 {
  interface eth0
  state MASTER
  priority 200
  virtual_router_id 33
  unicast_src_ip <IP of the primary load balancer>
  unicast_peer {
    <IP of the secondary load balancer>
  }
  authentication {
    auth_type PASS
    auth_pass <a password - max 8 characters - that will be used by the keepalived instances to communicate with each other>
  }
  track_script {
    chk_haproxy
  }
  notify_master /etc/keepalived/master.sh
}
EOF

On the secondary LB:

cat > /etc/keepalived/keepalived.conf <<EOF
global_defs {
  script_user root
  enable_script_security
}
vrrp_script chk_haproxy {
  script "/usr/bin/pgrep haproxy"
  interval 2
}
vrrp_instance VI_1 {
  interface eth0
  state BACKUP
  priority 100
  virtual_router_id 33
  unicast_src_ip <IP of the secondary load balancer>
  unicast_peer {
    <IP of the primary load balancer>
  }
  authentication {
    auth_type PASS
    auth_pass <same password as before>
  }
  track_script {
    chk_haproxy
  }
  notify_master /etc/keepalived/master.sh
}
EOF

Note that we are going to use the script /etc/keepalived/master.sh to automatically assign the floating IPs to the active node. By “active”, I mean a node with haproxy running - either the primary, or if the primary is down, the secondary.

Before the master.sh script can work, we need to install the Hetzner Cloud CLI. This is a handy (official) command line utility that we can use to manage any resource in a Hetzner Cloud project, such as floating IPs.

To install the CLI, you just need to download it and make it executable:

cd ~
wget https://github.com/hetznercloud/cli/releases/download/v1.16.1/hcloud-linux-amd64.tar.gz
tar xvfz hcloud-linux-amd64.tar.gz
chmod +x hcloud
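
Note that master.sh (shown in a moment) calls hcloud by name, so the binary must be somewhere in root’s PATH. Moving it to /usr/local/bin - my choice here, any directory in the PATH works - and doing a quick smoke test is enough:

mv hcloud /usr/local/bin/
hcloud version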

Then we can create the script:

cat > /etc/keepalived/master.sh << 'EOF'
#!/bin/bash
export HCLOUD_TOKEN='<a token you need to create in the Hetzner Cloud project that has the load balancer servers and the floating IPs>'

# numeric ID of this server, looked up from its hostname
ME=`hcloud server describe $(hostname) | head -n 1 | sed 's/[^0-9]*//g'`

# IDs of the servers the floating IPs are currently assigned to
HTTP_IP_CURRENT_SERVER_ID=`hcloud floating-ip describe http | grep 'Server:' -A 1 | tail -n 1 | sed 's/[^0-9]*//g'`
WS_IP_CURRENT_SERVER_ID=`hcloud floating-ip describe ws | grep 'Server:' -A 1 | tail -n 1 | sed 's/[^0-9]*//g'`

if [ "$HTTP_IP_CURRENT_SERVER_ID" != "$ME" ] ; then
  n=0
  while [ $n -lt 10 ]
  do
    hcloud floating-ip assign http $ME && break
    n=$((n+1))
    sleep 3
  done
fi

if [ "$WS_IP_CURRENT_SERVER_ID" != "$ME" ] ; then
  n=0
  while [ $n -lt 10 ]
  do
    hcloud floating-ip assign ws $ME && break
    n=$((n+1))
    sleep 3
  done
fi
EOF

The script is pretty simple. All it does is check if the floating IPs are currently assigned to the other load balancer, and if that’s the case assign the IPs to the current load balancer. Specifically, this script will be executed on the primary load balancer if haproxy is running on that node but the floating IPs are assigned to the secondary load balancer; or on the secondary load balancer, if the primary is down.

Don’t forget to make the script executable:

chmod +x /etc/keepalived/master.sh
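
Before relying on keepalived to trigger it, you can run the script by hand on the primary load balancer and then check that the floating IPs point at that server:

/etc/keepalived/master.sh

# the script exports HCLOUD_TOKEN only for itself, so set it here as well
export HCLOUD_TOKEN='<the same token used in master.sh>'
hcloud floating-ip describe http | grep 'Server:' -A 1
hcloud floating-ip describe ws | grep 'Server:' -A 1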

Then restart keepalived:

service keepalived restart
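
keepalived logs VRRP state transitions, so you can check which role each node has taken - the primary should report entering the MASTER state and the secondary the BACKUP state:

journalctl -u keepalived --no-pager | tail -n 20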

haproxy

haproxy is what takes care of actually proxying all the traffic to the backend servers, that is, the nodes of the Kubernetes cluster. Each Nginx ingress controller needs to be installed with a service of type NodePort, using different ports for each controller. For example, for the ingress controller for normal http traffic I use node port 30080 for port 80 and 30443 for port 443; for the ingress controller for WebSockets, I use 31080 => 80 and 31443 => 443.
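
For reference, this is roughly how the http controller could be installed with those node ports using Helm. It’s only a sketch, assuming Helm 3 syntax and the stable/nginx-ingress chart that was current at the time (the exact value names may differ for newer ingress-nginx charts); the namespace, release name and ingress class are made-up examples.

kubectl create namespace ingress-http

helm install nginx-http stable/nginx-ingress \
  --namespace ingress-http \
  --set controller.ingressClass=nginx-http \
  --set controller.service.type=NodePort \
  --set controller.service.nodePorts.http=30080 \
  --set controller.service.nodePorts.https=30443

The WebSockets controller would be a second release (say, nginx-ws) with node ports 31080/31443 and its own ingress class, and each app’s ingress resources then pick a controller with the kubernetes.io/ingress.class annotation.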

First, we need to install haproxy:

apt install haproxy

Then we need to configure it with frontends and backends for each ingress controller. To create/update the config, run:

cat > /etc/haproxy/haproxy.cfg << 'EOF'
global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 10s
        user haproxy
        group haproxy
        daemon
        maxconn 10000

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        # An alternative list with additional directives can be obtained from
        #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

defaults
        log     global
        mode    tcp
        option  tcplog
        option  dontlognull
        timeout connect 5000
        timeout client  10000
        timeout server  10000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

frontend http
   mode tcp
   bind <floating IP for http traffic>:80
   option tcplog
   default_backend http

frontend https
   mode tcp
   bind <floating IP for http traffic>:443
   option tcplog
   default_backend https

backend http
   balance roundrobin
   mode tcp
   server http1 <IP of Kubernetes cluster node 1>:30080 check send-proxy-v2
   server http2 <IP of Kubernetes cluster node 2>:30080 check send-proxy-v2
   ...
   server httpN <IP of Kubernetes cluster node N>:30080 check send-proxy-v2

backend https
   balance roundrobin
   mode tcp
   option ssl-hello-chk
   server http1 <IP of Kubernetes cluster node 1>:30443 check send-proxy-v2
   server http2 <IP of Kubernetes cluster node 2>:30443 check send-proxy-v2
   ...
   server httpN <IP of Kubernetes cluster node N>:30443 check send-proxy-v2

frontend ws
   mode tcp
   bind <floating IP for web sockets>:80
   option tcplog
   default_backend ws

frontend wss
   mode tcp
   bind <floating IP for web sockets>:443
   option tcplog
   default_backend wss

backend ws
   balance roundrobin
   mode tcp
   server ws1 <IP of Kubernetes cluster node 1>:31080 check send-proxy-v2
   server ws2 <IP of Kubernetes cluster node 2>:31080 check send-proxy-v2
   ...
   server wsN <IP of Kubernetes cluster node N>:31080 check send-proxy-v2

backend wss
   balance roundrobin
   mode tcp
   option ssl-hello-chk
   server ws1 <IP of Kubernetes cluster node 1>:31443 check send-proxy-v2
   server ws2 <IP of Kubernetes cluster node 2>:31443 check send-proxy-v2
   ...
   server wsN <IP of Kubernetes cluster node N>:31443 check send-proxy-v2
EOF

A few important things to note in this configuration:

  • mode is set to tcp. This is required to proxy “raw” traffic to Nginx, so that SSL/TLS termination can be handled by Nginx;
  • send-proxy-v2 is also important: it ensures that information about the client, including the source IP address, is sent to Nginx, so that Nginx can “see” the actual IP address of the user and not the IP address of the load balancer. Remember to set use-proxy-protocol to true in the ingress configmap (see the sketch right after this list).
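
Continuing with the hypothetical Helm release from the earlier sketch, the proxy protocol option can be enabled through the controller’s config; with other installation methods, you would set the same use-proxy-protocol key directly in the Nginx ingress controller’s ConfigMap.

helm upgrade nginx-http stable/nginx-ingress \
  --namespace ingress-http \
  --reuse-values \
  --set-string controller.config.use-proxy-protocol=true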

Finally, you need to restart haproxy to apply these changes:

service haproxy restart
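
If you want to catch mistakes before (re)loading, haproxy can also validate the configuration file without starting:

haproxy -c -f /etc/haproxy/haproxy.cfg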

If all went well, you will see that the floating IPs are assigned to the primary load balancer automatically - you can see this from the Hetzner Cloud console. To ensure everything is working properly, shut down the primary load balancer: the floating IPs should be assigned to the secondary load balancer. When the primary is back up and running, the floating IPs will be assigned to the primary once again. The switch takes only a couple of seconds at most, so it’s pretty quick and it should cause almost no downtime at all.

I wish I could solve my issue directly within Kubernetes while using Nginx as the ingress controller, or better yet that Hetzner Cloud offered load balancers, but this will do for now. Perhaps I should mention that there is another option, the Inlets Operator, which takes care of provisioning an external load balancer with DigitalOcean (referral link, we both receive credits) or other providers, when your provider doesn’t offer load balancers or when your cluster is on premises or just on your laptop, not exposed to the Internet. It’s an interesting option, but Hetzner Cloud is not supported yet, so I’d have to use something like DigitalOcean or Scaleway with added latency; plus, I couldn’t find some information I needed in the documentation and I didn’t have much luck asking for it. Load balancers provisioned with Inlets are also a single point of failure, because only one load balancer is provisioned in a non-HA configuration.

For now, this setup with haproxy and keepalived works well and I’m happy with it. It’s cheap and easy to set up and automate with something like Ansible - which is what I did.

© Vito Botta