Setting up RancherOS for Rancher and Kubernetes


So while looking for alternatives I’ve also tried CentOS - which is known to be more stable - but to be honest I never liked it too much. Then I came across the concept of “container optimised” operating systems, and learnt about RancherOS - surprisingly late considering that I had already been using Rancher for a little while (in case it’s the first time you hear about Rancher, it’s an awesome management interface for Kubernetes clusters). RancherOS is a special kind of operating system, in that everything - really - runs as containers, including system services. It mainly consists of two separate Docker instances, a System Docker for OS/system related stuff, and a User Docker for user managed containers. RancherOS is an OS made of containers for use with containers, so it sounds like the perfect choice for Kubernetes but also for Rancher, which I use (and absolutely love) to manage it.
RancherOS is a super lightweight operating system with just the minimum components required to run Docker. Besides being lightweight, a minimal OS also means a smaller attack surface out of the box. I’ve only been running RancherOS for a few days, but I already love it now that I’ve sorted out a few things and understand better how it works.
Here I am going to show how to install it first, then give a couple of tips for things to do once the OS is installed. For my servers I use Hetzner Cloud (referral link, we both receive credits) because of the amazing price/performance, but the instructions below can be easily adapted for other providers.
Installation
Once the server has been created, go to Rescue in the server’s control panel and enable the rescue system by clicking on Enable Rescue & Power Cycle. Within a minute you should be able to SSH into the server’s rescue system with
ssh root@<server ip>
Once in the rescue system, you need to install the kexec-tools package, which is required to boot into a kernel other than the one currently running. Here I am assuming the original OS is Ubuntu.
DEBIAN_FRONTEND=noninteractive apt-get install --assume-yes --show-progress kexec-tools
Next, download the RancherOS ISO - you can check the latest release available here. At the moment it is 1.5.2.
wget https://github.com/rancher/os/releases/download/v1.5.2/rancheros.iso
Wipe the disk
echo -e "w\nq" | fdisk /dev/sda
mount the ISO
mount -t iso9660 rancheros.iso /mnt
and boot into it
kexec --initrd /mnt/boot/initrd-v1.5.2 --command-line="rancher.password=some-password" /mnt/boot/vmlinuz-4.14.122-rancher
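The kernel and initrd file names in the command above are tied to the 1.5.2 release, so they will differ for other versions. As a rough sketch, the names can be looked up in the mounted ISO instead of being typed by hand. The find_single helper is a name I made up for illustration, and it assumes there is exactly one vmlinuz-* and one initrd-* file under /mnt/boot, which I have only seen on the ISO used here.

```shell
# Sketch: find the kernel and initrd inside a mounted RancherOS ISO instead of
# hardcoding version numbers. find_single is a hypothetical helper; it assumes
# exactly one file with the given prefix exists in the directory.
find_single() {
  # $1 = directory, $2 = filename prefix (e.g. "vmlinuz-")
  count=0
  match=""
  for f in "$1"/"$2"*; do
    [ -e "$f" ] || continue      # skip the unexpanded pattern when nothing matches
    count=$((count + 1))
    match=$f
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one $2* in $1, found $count" >&2
    return 1
  fi
  printf '%s\n' "$match"
}

# Usage on the rescue system (commented out so the snippet is safe to paste):
# KERNEL=$(find_single /mnt/boot vmlinuz-) && INITRD=$(find_single /mnt/boot initrd-) &&
#   kexec --initrd "$INITRD" --command-line="rancher.password=some-password" "$KERNEL"
```

If either lookup fails, the kexec command is not run, which is safer than booting with a wrong path.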
Of course set a proper password. The SSH connection will drop as the new kernel boots, so you need to SSH in again, forcing password authentication (I am not sure/can’t remember if forcing it is actually required):
ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password rancher@<server ip>
You will be logged in to RancherOS now. Next you need to prepare the configuration file that will be used by the installer. First set the hostname
HOSTNAME=...
and the IP address of eth0
IP=`ifconfig eth0 | grep 'inet addr:' | cut -d: -f2 | awk '{ print $1}'`
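The one-liner above relies on the older ifconfig output, where the address appears as "inet addr:1.2.3.4". Newer versions of ifconfig print "inet 1.2.3.4" instead, in which case the grep matches nothing and IP ends up empty. A small sketch that copes with both layouts follows; extract_ipv4 is just a name I picked for illustration.

```shell
# Sketch: extract the first IPv4 address from ifconfig output on stdin.
# Handles the older "inet addr:1.2.3.4" layout as well as the newer
# "inet 1.2.3.4" layout. extract_ipv4 is a hypothetical helper name.
extract_ipv4() {
  awk '
    $1 == "inet" {
      addr = $2
      sub(/^addr:/, "", addr)   # strip the old-style prefix if present
      print addr
      exit
    }
  '
}

# Usage: IP=$(ifconfig eth0 | extract_ipv4)
```

Lines starting with inet6 are ignored because only a first field of exactly "inet" is matched.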
If you have added a volume to the server, set the DISK variable too so it can be used for mounting
DISK=`ls /dev/disk/by-id/scsi-0HC*`
The above will find the correct disk/volume device. Finally, create the config file:
cat <<EOF > cloud-config.yml
#cloud-config
hostname: $HOSTNAME
ssh_authorized_keys:
  - ...
mounts:
  - ["$DISK", "/mnt/my-disk", "xfs", ""]
rancher:
  console: ubuntu
  resize_device: /dev/sda
  docker:
    tls: true
  network:
    post_cmds:
      - dhcpcd -e force_hostname=true eth0
    dns:
      nameservers:
        - 8.8.8.8
        - 1.1.1.1
    interfaces:
      eth0:
        address: $IP/32
        netmask: 255.255.255.255
        gateway: 172.31.1.1
        pointopoint: 172.31.1.1
        mtu: 1400
        dhcp: false
      lo:
        address: 127.0.0.1/8
EOF
Of course set your SSH key(s). You can remove the mounts section if you haven’t added a volume to your server. You can see that I am specifying a console here: RancherOS by default uses an Alpine-based console, but you can choose to use something else like Ubuntu/Fedora/CentOS. Please note that if you want persistence, you need to switch from the default console to another one. Also, the configuration makes the chosen console available, but we’ll need to switch to it manually as we’ll see in a moment. resize_device is required to ensure that the filesystem created by RancherOS takes up the whole capacity of the main disk when installing the OS. The network settings for eth0 here are specific to Hetzner Cloud, so you will have to change them if you are using another provider.
Once you have created the config file, it’s a good idea to validate it just in case there are mistakes:
sudo ros config validate -i cloud-config.yml
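As far as I can tell, the validation only looks at the structure of the file, so it would not notice a value that is empty because HOSTNAME, IP or DISK was unset when the file was generated. Here is a rough sketch of an extra check for that. The check_expansions name and the grep patterns are my own assumptions, and the patterns only fit the template used in this post.

```shell
# Sketch: flag values in a generated cloud-config that look like the result of
# an empty shell variable. check_expansions is a hypothetical helper and the
# patterns only match the template used in this post.
check_expansions() {
  # $1 = path to the generated config file
  status=0
  if grep -Eq '^hostname:[[:space:]]*$' "$1"; then
    echo "hostname is empty (HOSTNAME was not set)" >&2
    status=1
  fi
  if grep -Eq 'address:[[:space:]]*/32' "$1"; then
    echo "eth0 address is empty (IP was not set)" >&2
    status=1
  fi
  # -e is needed because the pattern itself starts with a dash
  if grep -Fq -e '- ["",' "$1"; then
    echo "mount device is empty (DISK was not set)" >&2
    status=1
  fi
  return "$status"
}

# Usage: check_expansions cloud-config.yml && sudo ros config validate -i cloud-config.yml
```

If you removed the mounts section, the last check simply never matches.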
Now we are ready to install RancherOS on disk:
sudo ros install -i rancher/os:v1.5.2 -t gptsyslinux -c cloud-config.yml -d /dev/sda --append "rancher.password=some-password"
Again, set a proper password. The installer will reboot the system once you confirm; once you are logged in again, set up Docker TLS support by running:
IP=`ifconfig eth0 | grep 'inet addr:' | cut -d: -f2 | awk '{ print $1}'`
sudo ros config set rancher.docker.tls true
sudo ros tls gen --server -H localhost -H rancher -H $IP
sudo system-docker restart docker
sudo ros tls gen
Next, unless you have removed the console setting, switch to the chosen console
sudo ros console switch ubuntu
This will kick you out so you’ll have to log in again, then you will be able to install packages with apt if you chose the Ubuntu console, or the equivalent for another console. Congrats, RancherOS is now installed on disk.
Post-installation
SSH configuration
Firewall
docker run --name firewall --env OPEN_PORTS="22,80,443" --env ACCEPT_ALL_FROM="ip1,ip2" --env CHAIN="DOCKER-FIREWALL" -itd --restart=always --cap-add=NET_ADMIN --net=host vitobotta/docker-firewall:0.1.0
Of course customise the ports you want to open and the IP addresses, if any, that should be allowed full communication with the server - I use this for example to allow communication between the nodes of a Kubernetes cluster. You can see the Dockerfile and the script here.
fail2ban
mkdir -p fail2ban/jail.d
cat <<EOF > fail2ban/jail.d/sshd.conf
[sshd]
enabled = true
port = ssh
filter = sshd[mode=aggressive]
logpath = /var/log/syslog
bantime = 86400
findtime = 14400
maxretry = 3
EOF
Customise the settings if needed. I am annoyed by the many attempts to log in to my servers, so here I chose to ban for one whole day any IP that fails a login 3 times within 4 hours.
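Both bantime and findtime in the jail config are given in seconds, which is where 86400 (one day) and 14400 (four hours) come from. If you want different durations, a tiny helper like the one below can do the conversion. to_seconds is a name I made up for illustration and has nothing to do with fail2ban itself.

```shell
# Sketch: convert a duration such as "1d" or "4h" into seconds for the jail
# config. to_seconds is a hypothetical helper, not part of fail2ban.
to_seconds() {
  # $1 = a number followed by a unit: s, m, h or d (e.g. "4h")
  value=${1%?}            # everything except the last character
  unit=${1#"$value"}      # the last character
  case $unit in
    s) echo "$value" ;;
    m) echo $((value * 60)) ;;
    h) echo $((value * 3600)) ;;
    d) echo $((value * 86400)) ;;
    *) echo "unknown unit in '$1'" >&2; return 1 ;;
  esac
}

to_seconds 1d   # value used for bantime
to_seconds 4h   # value used for findtime
```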
docker run -it -d --name fail2ban --restart always \
  --network host \
  --cap-add NET_ADMIN \
  --cap-add NET_RAW \
  -v $(pwd)/fail2ban:/data \
  -v /var/log:/var/log:ro \
  -e F2B_LOG_LEVEL=DEBUG \
  -e F2B_IPTABLES_CHAIN=INPUT \
  -e F2B_ACTION="%(action_mwl)s" \
  -e TZ=EEST \
  -e F2B_DEST_EMAIL=... \
  -e F2B_SENDER=... \
  -e SSMTP_HOST=... \
  -e SSMTP_PORT=... \
  -e SSMTP_USER=... \
  -e SSMTP_PASSWORD=... \
  -e SSMTP_TLS=YES \
  crazymax/fail2ban:latest
I have chosen action_mwl as the action, so whenever an IP is banned I receive a notification that includes whois details on the IP.