Kubernetes over Wireguard VPN with RancherOS

Published Wednesday, Jul 17 2019

In a previous post, I explained how I set up RancherOS for Rancher and Kubernetes; in this post, I’ll show how to secure the inter-host communication between Kubernetes nodes running on RancherOS, by encrypting all the traffic with a Wireguard VPN.

I am still learning, but my understanding is that Kubernetes does not encrypt traffic between nodes by default. I’ve read that something like Istio can secure connections between services and that the Weave network plugin can be configured to do some encryption, but I am not sure how these work as I haven’t tried them yet. In any case, I was looking for a solution that would encrypt everything between the nodes at a lower level than Kubernetes, so that it could be used with anything. Rancher by default deploys Kubernetes with Canal (read this page for more details on the various network plugins available), so the instructions in this post assume this configuration. The changes specific to the network plugin are minimal though, and can be adapted if you want to use another one.

As for the VPN, I chose Wireguard because it’s lightweight, a lot faster, and even easier to set up than others (looking at you, OpenVPN). It requires only a tiny bit of configuration to set up a peer-to-peer VPN (as opposed to a server/client VPN like OpenVPN), but it also requires a kernel module, which, besides the small and optimised codebase, is what makes it so fast. There are plans to include the module in the Linux kernel directly, but for now you need to build and install it in the OS.

Here I am assuming RancherOS as the operating system, so we’ll see how to build the module specifically for this OS. This is easy to do with most operating systems though; take a look at the installation instructions on Wireguard’s website.

The solution I’ll show here is based on a script to deploy the Wireguard configuration to the hosts, and a Docker image that builds the Wireguard module and sets up the VPN on each host.

Creating a test environment

You will likely want to test this configuration with a local cluster first. I recommend docker-machine for this because it makes it very easy and quick to create VMs with Virtualbox.

So for starters make sure you have both Docker and Virtualbox installed on your work machine, then download the latest release of RancherOS (currently v1.5.3):

wget -O ~/Downloads/rancheros.iso \
  https://github.com/rancher/os/releases/download/v1.5.3/rancheros.iso

Next, create a few RancherOS VMs with the following command:

docker-machine create  \
  --driver virtualbox \
  --virtualbox-cpu-count "2" \
  --virtualbox-memory "4096" \
  --virtualbox-no-share \
  --virtualbox-boot2docker-url file://$HOME/Downloads/rancheros.iso <name>

I would recommend at least two vCPUs and a few GB of RAM for each VM, but adjust these settings according to the resources available on your computer. I also recommend an odd number of VMs greater than one (perhaps 3), because for simplicity I am assuming we’ll set up each node with all the roles (controlplane, etcd, worker) when we deploy Kubernetes with Rancher. We downloaded the RancherOS ISO and are passing the file’s location on disk as boot2docker-url so that the ISO is downloaded only once for all the VMs. Choose whatever you want as the name for each VM; I would choose something like node1, node2, node3.
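To avoid retyping the command for each VM, the creation step can be wrapped in a small loop. This is only a sketch: the function name create_nodes and the DOCKER_MACHINE and ISO_URL variables are my own names for illustration, while the flags and the ISO path are the ones used above.

```shell
# Sketch: create one RancherOS VM per name passed as an argument.
# DOCKER_MACHINE is a hypothetical override, handy for a dry run
# (DOCKER_MACHINE=echo prints the commands instead of running them).
DOCKER_MACHINE="${DOCKER_MACHINE:-docker-machine}"
ISO_URL="file://$HOME/Downloads/rancheros.iso"

create_nodes() {
  for name in "$@"; do
    "$DOCKER_MACHINE" create \
      --driver virtualbox \
      --virtualbox-cpu-count "2" \
      --virtualbox-memory "4096" \
      --virtualbox-no-share \
      --virtualbox-boot2docker-url "$ISO_URL" \
      "$name" || return 1   # stop at the first VM that fails to create
  done
}
```

With the function defined, running create_nodes node1 node2 node3 would create the three VMs one after the other.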

To SSH into the VMs, you can use the command docker-machine ssh <node name>, but I prefer to copy my default SSH key and SSH into the VMs directly. To do this, run the following for each VM:

docker-machine \
  ssh node1 'sudo ros config set ssh_authorized_keys ["ssh-rsa ..."] && sudo reboot'

docker-machine ip <node name>

Make sure you use your own public key. The commands above will add your SSH key to the rancher user and then print the IP of the VM.

We are using ros config because it lets us change RancherOS’ configuration quickly without editing configuration files. You can then SSH into the VMs with:

ssh rancher@<node ip>

You now have a few VMs to play with.

Deploying the Wireguard configuration

Configuring Wireguard is easy (see this article for example), but to make it even easier I wrote a script that generates a keypair and the config for each host, and then deploys the configuration to the hosts automatically. This script and the Dockerfile for the image we will be using shortly can be found in the repo on Github.

The script requires a few tools installed on your work computer in order to work. To install them e.g. on a Mac, you can run

brew install nmap awk ipcalc wireguard-tools

I am using nmap and ipcalc to easily validate and generate IPs from the chosen subnet. To generate and deploy the configuration to the hosts, you can run:

curl https://raw.githubusercontent.com/vitobotta/docker-wireguard/master/deploy-config.sh \
  | bash /dev/stdin --hosts 192.168.99.107,192.168.99.108,192.168.99.109

Of course review the script first if you wish and replace the IPs of the hosts with the IPs of your VMs. You can optionally specify the user for the SSH connection with --ssh-user (default: “rancher”), the subnet to use for the VPN with --subnet (default: 192.168.37.0/24), the name of the Wireguard network interface with --wg-interface (default: “wg0”), and the port Wireguard should listen on with --listen-port (default: 51820). If you are happy with the defaults then the command above will suffice.

The script will generate a key pair for each host and deploy the Wireguard configuration to $HOME/wireguard/<interface>.conf on that host, so by default it will be /home/rancher/wireguard/wg0.conf.

For example, the configuration for node1, assuming the defaults, would look something like this:

[Interface]
Address = 192.168.37.1
PrivateKey = <private key of node1>
ListenPort = 51820
SaveConfig = true

[Peer]
PublicKey = <public key node2>
Endpoint = 192.168.99.108:51820
AllowedIPs = 192.168.37.2/32

[Peer]
PublicKey = <public key node3>
Endpoint = 192.168.99.109:51820
AllowedIPs = 192.168.37.3/32
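To make the structure of these files more concrete, here is a rough sketch of how a per-host configuration could be rendered. It is not the actual deploy script: render_wg_config is a hypothetical helper, and it assumes that each host gets a VPN address from the subnet in the order the hosts are listed (consistent with the AllowedIPs values above) and that a host's own Address is its VPN IP. Keys are left as placeholders; real ones would come from wg genkey and wg pubkey.

```shell
# Sketch: print the Wireguard config for one host.
# Arguments: <1-based position of this host> <subnet prefix> <port> <host IPs...>
# Assumption: host N in the list gets the VPN address <prefix>.N
render_wg_config() {
  self="$1"; prefix="$2"; port="$3"; shift 3
  printf '[Interface]\nAddress = %s.%s\n' "$prefix" "$self"
  printf 'PrivateKey = <private key of node%s>\n' "$self"
  printf 'ListenPort = %s\nSaveConfig = true\n' "$port"
  i=0
  for host in "$@"; do
    i=$((i + 1))
    [ "$i" -eq "$self" ] && continue   # a host is not its own peer
    printf '\n[Peer]\nPublicKey = <public key of node%s>\n' "$i"
    printf 'Endpoint = %s:%s\n' "$host" "$port"
    printf 'AllowedIPs = %s.%s/32\n' "$prefix" "$i"
  done
}
```

For instance, render_wg_config 1 192.168.37 51820 192.168.99.107 192.168.99.108 192.168.99.109 prints an [Interface] section for node1 followed by one [Peer] section each for node2 and node3.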

Preparing RancherOS for the kernel module

As I mentioned earlier, Wireguard requires a kernel module to work, so we need to build and install it. To do this, I created a Docker image. When the container starts, it checks whether the wireguard kernel module is already loaded; if it’s not, it builds the module and installs it into the kernel. It then starts the VPN, creating the new network interface (wg0 by default).

In order to build custom modules on RancherOS, we need to enable the kernel headers; of course this is also done with containers in RancherOS. Run the following:

sudo ros service enable kernel-headers
sudo ros service up kernel-headers

Next, we need to make sure the wireguard kernel module is loaded at boot, automatically:

sudo ros config set rancher.modules "['wireguard']"

Setting up the VPN as a service

To make sure the VPN is started automatically at boot, we need to configure a service. First, create a config file:

sudo tee /var/lib/rancher/conf/wireguard.yml > /dev/null <<EOD
wireguard:
  image: vitobotta/docker-wireguard:0.15.0
  net: host
  privileged: true
  restart: always
  volumes:
  - /home/rancher/wireguard:/etc/wireguard
  - /usr/src:/usr/src
  - /lib/modules:/lib/modules
  environment:
    INTERFACE: "wg0"
    LISTEN_PORT: "51820"
EOD

Remember to change the interface and the port if you are not using the default settings. Finally, enable the service:

sudo ros service enable /var/lib/rancher/conf/wireguard.yml
sudo ros service up wireguard

RancherOS will start the container right away; the container will see that the wireguard module is not loaded and will proceed with the build. Soon after, the VPN is started and the Wireguard network interface is created.
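If you want to confirm this from the host before moving on, a quick check could look like the sketch below. The function names are hypothetical; the check simply looks for the module in /proc/modules (the same list lsmod prints) and asks ip whether the interface exists.

```shell
# Sketch: succeed if the named kernel module is currently loaded.
# The optional second argument overrides the file to read, which is
# only useful for trying the function out against a sample file.
module_loaded() {
  grep -q "^$1 " "${2:-/proc/modules}"
}

# Sketch: succeed only if the wireguard module is loaded and the
# interface (wg0 unless another name is given) exists.
vpn_is_up() {
  module_loaded wireguard && ip link show "${1:-wg0}" > /dev/null 2>&1
}
```

For example, vpn_is_up wg0 && echo "VPN is up" prints the message only once the build has finished and the interface has been created.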

Testing the connections

Testing is easy: just ping the other hosts from each host, e.g. from node1:

root@node1:~# ping -c 3 192.168.37.2
PING 192.168.37.2 (192.168.37.2) 56(84) bytes of data.
64 bytes from 192.168.37.2: icmp_seq=1 ttl=64 time=0.703 ms
64 bytes from 192.168.37.2: icmp_seq=2 ttl=64 time=0.688 ms
64 bytes from 192.168.37.2: icmp_seq=3 ttl=64 time=0.589 ms

--- 192.168.37.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2026ms
rtt min/avg/max/mdev = 0.589/0.660/0.703/0.050 ms

If all is working, you should be able to ping between hosts. Yay! Now reboot and try again to make sure the kernel module is loaded at boot time.
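With more than a couple of nodes, pinging each peer by hand gets tedious, so the check can be scripted. A minimal sketch, where check_peers and the PING override are names I made up for illustration:

```shell
# Sketch: ping every VPN address passed as an argument and print one
# result line per peer. Returns non-zero if any peer is unreachable.
# PING is a hypothetical override so the loop can be tried without a network.
PING="${PING:-ping}"

check_peers() {
  failed=0
  for ip in "$@"; do
    if "$PING" -c 3 "$ip" > /dev/null 2>&1; then
      echo "$ip reachable"
    else
      echo "$ip UNREACHABLE"
      failed=1
    fi
  done
  return "$failed"
}
```

From node1 with the default subnet, you would run check_peers 192.168.37.2 192.168.37.3, both before and after the reboot.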

In order for Kubernetes to work properly over the VPN, we need to do two things:

ensure the network plugin (Canal by default) uses the Wireguard network interface
specify the public and internal IPs when setting up the nodes of the cluster.

In Rancher, create a new cluster with ‘custom’ nodes, give it a name and edit the YAML configuration for the cluster by clicking on Edit as YAML. In the code editor, change the network section as follows:

network: 
  canal_network_provider: 
    iface: "wg0"
  options: 
    flannel_backend_type: "vxlan"
  plugin: "canal"

The important bit here is that you set the correct Wireguard interface for Canal. Click Next, check all the roles under Node Role and set:

Node Address to the public IP for node1
Internal Address to the IP assigned by the VPN to node1 (the IP of the wg0 interface basically)
Node Name to ‘node1’.
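To look up the value for Internal Address on each node, you can read the IP straight from the Wireguard interface. The sketch below assumes the ip and awk tools available on the host support the options used here; parse_ipv4 and vpn_ip are hypothetical helper names.

```shell
# Sketch: extract the IPv4 address from `ip -o -4 addr show <iface>` output.
# In the one-line format the fourth field is "<address>/<prefix length>".
parse_ipv4() {
  awk '{ split($4, a, "/"); print a[1]; exit }'
}

# Sketch: print the IPv4 address of the given interface (wg0 by default).
vpn_ip() {
  ip -o -4 addr show "${1:-wg0}" | parse_ipv4
}
```

Running vpn_ip on a node prints the address to paste into the Internal Address field.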

Copy the generated command to the clipboard, and run it in the terminal for node1. Do the same for node2 and node3, making sure the IPs are correct. For example, for node1 in my test cluster, I had the settings in the picture below:

You now have a Kubernetes cluster running on top of a Wireguard VPN.

Testing with a deployment

To make sure everything is working fine, deploy an instance of e.g. Nginx, and create an ingress for it. You should be able to see Nginx’s default page when you open the hostname in your browser.
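If you prefer the command line to the Rancher UI, something like the following sketch should do the same with kubectl. It assumes kubectl is already configured for the new cluster and is recent enough to have the create ingress subcommand (1.19 or later); on older versions you would apply an Ingress manifest instead. The function name, the KUBECTL override and the hostname are placeholders of mine.

```shell
# Sketch: deploy Nginx, expose it inside the cluster, and route a
# hostname to it through an ingress.
# KUBECTL is a hypothetical override, handy for a dry run
# (KUBECTL=echo prints the commands instead of running them).
KUBECTL="${KUBECTL:-kubectl}"

deploy_nginx_test() {
  host="$1"
  "$KUBECTL" create deployment nginx --image=nginx &&
  "$KUBECTL" expose deployment nginx --port=80 &&
  "$KUBECTL" create ingress nginx --rule="$host/*=nginx:80"
}
```

Call it with a hostname that resolves to one of your nodes, e.g. deploy_nginx_test nginx.example.com, then open that hostname in the browser.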

Wrapping up

It took me a day and a half to figure this out, but I am happy with the outcome. Writing this blog post took longer than actually setting things up. Unfortunately the Docker image is a little heavy, since it’s based on Ubuntu and includes everything needed to build the kernel module. I tried with Alpine but didn’t manage to get it working due to problems with dependencies, so it was just easier with Ubuntu. Please let me know in the comments if you try this and run into any problems, but it’s quite straightforward and I’ve tested it several times, so hopefully it will work out of the box.
