
Linstor storage for Kubernetes, the Kubernetes way


Back in August I published a simple comparison between a few storage solutions for Kubernetes, out of the frustration I was having with Kubernetes storage and because of my limited budget. A reader then suggested that I also try Linstor, which wasn’t included in the original comparison because I hadn’t heard of it before. I really liked Linstor and it seemed a solid option, so I wrote a follow-up post on how to set up Linstor for Kubernetes. However, I ended up using Hetzner Cloud’s (referral link, we both receive credits) “volumes” directly with their CSI driver, since at the time that turned out to be the easier option.

I recently switched to NetCup, another cheap German provider, because lately getting a fast cloud server from Hetzner Cloud seemed to have become a lottery, at least in the Nuremberg DC. Sometimes the servers were really fast, sometimes ridiculously slow, and at times I had to delete new servers and recreate them hoping to be luckier. NetCup offers so-called “root servers”, which are virtual servers with dedicated cores/threads at a very affordable price, so I decided to try them out. So far I like the service and the servers are fast, but unfortunately it is not possible to dynamically add disks to these servers like with a cloud provider, so I am stuck with the storage on the root disk, which is plenty.

So once again I found myself trying a few options to use the storage on the root disk. I like Longhorn, but even with the latest release I had problems where volumes would randomly get into a faulted state. OpenEBS is still too slow for me, and Rook with Ceph is deprecating the use of directories on mounted file systems for storage, and it doesn’t support partitions as OSDs either. I didn’t try Linstor at first because I thought it would require additional disks as well, but then I found out that LVM (which Linstor can use under the hood) works just fine with loop devices, that is, files on a mounted file system that can be used as block devices. So I decided to give this a try, and it works pretty well!

Besides using loop devices, I made another change compared to when I tried Linstor the first time: instead of installing Linstor manually outside Kubernetes first (and then installing the CSI driver), I am using kube-linstor, an awesome project that allows you to deploy Linstor as containers directly in Kubernetes. From reading Linstor’s user guide, the only other option to install Linstor as containers is through paid support from Linbit, the company that develops the open source Linstor. I was very happy to find this project: it was very easy to set up in Kubernetes and works just as well as a manual installation of Linstor outside Kubernetes. So let’s see how easy it is to set up.

Prerequisite - loop devices

This step is optional: if you can add disks to your server, you can use those directly. As mentioned earlier, since I can’t add disks I have to use the storage on the root disk, and loop devices are handy for this. I am using Ubuntu, but the following instructions shouldn’t differ much on other distros. The following commands are to be run on each node.

First, we need to allocate some disk space for the file we are going to use as loop device for LVM:

fallocate -l 300G /linstor.img

We could run losetup /dev/loop0 /linstor.img to create the loop device from this file right away, but we need this to happen automatically at startup, in such a way that LVM can “see” the device. I use a systemd service for this:

cat <<EOF > /etc/systemd/system/linstor-loop-device.service
[Unit]
Description=Activate Linstor loop device
DefaultDependencies=no
After=systemd-udev-settle.service
Before=lvm2-activation-early.service
Wants=systemd-udev-settle.service

[Service]
ExecStart=/sbin/losetup /dev/loop0 /linstor.img
Type=oneshot

[Install]
WantedBy=local-fs.target
EOF

This service ensures that the loop device is created before LVM starts. To enable and start the new service:

systemctl daemon-reload
systemctl enable --now linstor-loop-device.service

We can then check that the loop device has actually been created with

losetup -l

You should see something like this:

NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE    DIO LOG-SEC
/dev/loop0         0      0         0  0 /linstor.img   0     512

Prerequisite - LVM

Once the loop device is available, we need to prepare the device for use with LVM by creating a logical volume. This is pretty easy by running the following commands:

pvcreate /dev/loop0
vgcreate vg /dev/loop0
lvcreate -l 100%FREE --thinpool vg/lvmthinpool

The --thinpool parameter is quite important because it enables thin provisioning (the ability to provision volumes that are larger than the actual available storage, which can be handy if you add storage as needed).
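If you want to double-check the result, the standard LVM tools should show the physical volume, the volume group and the thin pool (vg and lvmthinpool are the names used above):

pvs /dev/loop0
vgs vg
lvs vg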

Installation

Database

Linstor consists of one or more controllers, which are responsible for storage operations such as provisioning, plus satellites that handle the actual data on each node. It is possible to use an external database so that we can set up the controller in an HA fashion within Kubernetes. Here we are going to use Postgres for that. This is pretty easy with Helm:

helm upgrade --install \
  --namespace linstor \
  --set superuserPassword=hackme \
  --set replicationPassword=hackme \
  --set persistence.enabled=true,persistence.size=10G \
  --set keeper.replicaCount=3 \
  --set proxy.replicaCount=3 \
  --set sentinel.replicaCount=3 \
    linstor-db \
  stable/stolon

I am not too familiar with stolon, but this is what the README of the kube-linstor project suggests; my understanding is that it enables HA Postgres. The database is installed but not ready yet, because the persistent volumes it uses are still missing. We are going to use hostPath volumes for the database, so we need to create these volumes next.

Clone the repo of the kube-linstor project:

git clone git@github.com:kvaps/kube-linstor.git

And run the following to create the persistent volumes:

helm template pv-hostpath \
  --name data-linstor-db-stolon-keeper-0 \
  --namespace linstor \
  --set node=node1,path=/var/lib/linstor-db,size=10Gi \
  > pv1.yaml

helm template pv-hostpath \
  --name data-linstor-db-stolon-keeper-1 \
  --namespace linstor \
  --set node=node2,path=/var/lib/linstor-db,size=10Gi \
  > pv2.yaml

helm template pv-hostpath \
  --name data-linstor-db-stolon-keeper-2 \
  --namespace linstor \
  --set node=node3,path=/var/lib/linstor-db,size=10Gi \
  > pv3.yaml

kubectl create -f pv1.yaml -f pv2.yaml -f pv3.yaml
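A quick check that the volumes were created and that the claims of the statefulset are now bound:

kubectl get pv
kubectl get pvc -n linstor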

The database pods will now be correctly initialised once the volumes are ready. We can check that the database is up by connecting to it from one of the pods of the statefulset:

kubectl exec -ti -n linstor linstor-db-stolon-keeper-0 bash
PGPASSWORD=$(cat $STKEEPER_PG_SU_PASSWORDFILE) psql -h linstor-db-stolon-proxy -U stolon postgres

While we are in the Postgres console, we need to create a database and a database user that will be used by the Linstor controller:

CREATE DATABASE linstor;
CREATE USER linstor WITH PASSWORD 'hackme';
GRANT ALL PRIVILEGES ON DATABASE linstor TO linstor;
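Still in the psql console, the usual meta-commands can confirm that the database and the user exist, before exiting:

\l
\du
\q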

Linstor

Now that the database is ready, we can install Linstor itself. For my original post about Linstor, I had to set it up manually outside Kubernetes first, and then install just the CSI driver in Kubernetes. Thanks to the kube-linstor project, we can easily install Linstor as containers instead. However, Linstor still requires a kernel module for DRBD, which is used for replication. On Ubuntu, we can install this kernel module from Linbit’s PPA repository:

add-apt-repository ppa:linbit/linbit-drbd9-stack
apt-get update

apt install -y drbd-dkms drbd-utils
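It’s worth making sure that the DRBD 9 module from the PPA is the one actually available, since the module shipped in-tree with the kernel is version 8.x:

modprobe drbd
cat /proc/drbd
modinfo drbd | grep -i version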

One last step before installing Linstor, which may no longer be required by the time you read this: I had to change the name of a cluster role in one of the manifests, because it is hardcoded and therefore not found when naming the Helm release “linstor” as I did. Edit helm/kube-linstor/templates/linstor-csi.yaml and change it as follows:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-linstor-csi-snapshotter-role

Again, this may be fixed by the time you read this. We can now install Linstor with Helm:

helm upgrade --install \
  --namespace linstor \
  --set controller.db.user=linstor \
  --set controller.db.password=hackme \
  --set controller.db.connectionUrl=jdbc:postgresql://linstor-db-stolon-proxy/linstor \
  --set controller.expose=true \
  --set controller.restApiBindAddress="0.0.0.0" \
  --set stunnel.enabled=false \
  --set csi.image.linstorCsiPlugin.repository=docker.io/kvaps/linstor-csi \
  --set csi.image.linstorCsiPlugin.tag=fix-34 \
  linstor \
  ./kube-linstor

Because I am using a private network between the nodes, I disabled stunnel - which is used for encryption - for better performance. Because of that, I had to expose a service for the controller and change the bind address to 0.0.0.0, as explained in values.yaml. As you can see, I am also changing the CSI plugin’s image to docker.io/kvaps/linstor-csi:fix-34 - also from the author of the project. This fixes a current issue where two pods scheduled on the same node cannot share a volume mounted on that node. This will likely be fixed upstream soon.
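Once the Helm release is deployed, a quick check that everything came up - besides the database pods, you should see the controller, the satellites and the CSI pods eventually reach the Running state:

kubectl get pods -n linstor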

Linstor is now installed, but we still need to tell it about the nodes and create the storage pools. To register the nodes, exec into the controller from your laptop/desktop:

kubectl exec -ti -n linstor linstor-linstor-controller-0 linstor

Then run:

linstor node create node1 10.0.0.1
linstor node create node2 10.0.0.2
linstor node create node3 10.0.0.3

Of course change the names and IPs as needed. To create the storage pool on each node, run:

linstor storage-pool create lvmthin node1 linstor-pool vg/lvmthinpool
linstor storage-pool create lvmthin node2 linstor-pool vg/lvmthinpool
linstor storage-pool create lvmthin node3 linstor-pool vg/lvmthinpool
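To verify that the satellites are online and the pools were registered with the expected capacity:

linstor node list
linstor storage-pool list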

The lvmthin driver ensures that thin provisioning with LVM is used, backed by the thin pool we created earlier. One last step before Linstor can be used with Kubernetes is to create the storage classes. I usually create one storage class with three replicas for most things, and a storage class with a single replica for things like MySQL that already handle replication on their own:

kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "3"
  storagePool: "linstor-pool"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-single-replica
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "1"
  storagePool: "linstor-pool"
EOF

You should now be ready to create persistent volume claims. There is still a little more setup than with other storage options, but it’s nothing complex, and it takes longer to write about it than to actually do it. One last note about performance: I have a 1 Gbps connection between the nodes and unfortunately I cannot upgrade it, so while replication speed is generally OK, I did notice occasional lag when making changes in my app that would trigger writes to the storage. Because of that, I have switched to the asynchronous protocol in Linstor (it supports synchronous/semi-synchronous/asynchronous replication) by running the following command in the controller’s shell:

controller drbd-options --protocol A --after-sb-0pri=discard-zero-changes  --after-sb-1pri=discard-secondary  --after-sb-2pri=disconnect

The additional parameters were suggested to me by the author of the kube-linstor project to fix an issue with provisioning with this protocol. If you also have a slow network and are thinking of using the async protocol, keep in mind that in the case of the primary node going down, there may be a small data loss on the replicas. In my case this is fine because I don’t have anything so “critical”, YMMV.
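To confirm that provisioning works end to end, here’s a minimal test claim using the linstor storage class (the name is just an example); with the default immediate binding it should become Bound within a few seconds, and linstor resource list should show the replicas:

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # name is just an example
  name: test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: linstor
EOF

kubectl get pvc test-pvc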

That’s it. Like I said, there are still several steps, but it’s definitely easier than the installation I showed in the previous post about Linstor, and it only needs to be done once. The upside is that you get a solid storage option for Kubernetes, powered by battle-tested building blocks like LVM and DRBD. So far it’s looking good for me, so I hope I don’t encounter any significant issues, because otherwise I will have to change provider again :p
