Self hosted Nextcloud in Kubernetes with S3 as primary storage


Introduction
Various services exist providing cloud storage, file syncing/sharing and collaboration, with the most popular ones being Dropbox, Google Drive and Microsoft OneDrive/Office 365. Dropbox has some additional features but the core focus is file syncing and sharing. The other two offer many more features, including realtime collaborative document editing and videoconferencing. These services are inexpensive but there are some implications concerning privacy when using them.
Nextcloud is one of the most popular (if not the most popular) self hosted/on-premises platforms for file sharing, collaboration and more. Unlike with the aforementioned centralized cloud storage services, your data remains under your control and you can access it and sync it with a variety of devices through a web interface as well as desktop and mobile clients. It's open source and totally free to use (with optional paid support and an enterprise edition if you need it), and very rich in features, especially with the many add-ons (or "apps") available that extend the platform with custom functionality.
I particularly love the integration with OnlyOffice and Collabora Online, which makes Nextcloud a good self hosted alternative to Google Drive and Office 365 for realtime, collaborative document editing. I use OnlyOffice because it's lighter, with a better interface and much better compatibility with Microsoft's file formats. Other commonly used features are calendars, contacts, and the Talk app that adds instant messaging and video conferencing to the platform. But really there's a ton of features you can get by installing the many apps available. One feature that many people would like and that is missing is email hosting; Nextcloud offers a Mail app but that's just a client, so you will need to use something else to host your email.
Nextcloud is written in PHP and started as a fork of another open source project, ownCloud, but it has been evolving on its own for a while now and has become more popular than ownCloud itself. It focuses on privacy of your data (which is especially important in GDPR-land) and ease of use, as well as enterprise grade security and control.
Several managed hosting services for Nextcloud exist - I was previously using Hetzner's "Storage Share" service - but most people will likely prefer hosting it themselves, and that's what I am doing now too.
There are various ways to install Nextcloud on your servers, but in this brief post we'll see how to deploy it in Kubernetes using the official Helm chart. In most cases, Nextcloud is configured to use local or network based block storage to store the data, but I prefer using an S3 compatible object storage service such as Wasabi. As one might expect, there is a performance hit compared to block storage if you have a lot of data to sync, but it is usually noticeable only when you sync all the data the first time. Subsequent incremental sync is usually reasonably fast. The advantage with an object storage service is that your storage scales with your data seamlessly, so you don't have to increase storage capacity (like resizing persistent volumes in Kubernetes) as the amount of data grows. Object storage is also typically cheaper than block storage.
Deployment in Kubernetes
The reason I am writing this post is that the Helm chart is confusing, and it took me some trial and error to get a fully working deployment. Some settings are missing from the reference values.yaml file, which leads to errors and to features that work poorly or not at all.
So in this post I'm basically just sharing what I changed from the default values.yaml in the repo, to save you some time.
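If you want to see exactly where your own file diverges from the chart defaults, one approach is to dump the defaults and diff them. This is only a sketch: it assumes the nextcloud Helm repo has already been added (as in the Installation section), and diff_values and the file name default-values.yaml are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: compare your values file against the chart defaults.
# Assumes "helm repo add nextcloud https://nextcloud.github.io/helm/" was run.
diff_values() {
  mine="${1:-your-values.yaml}"
  # "helm show values" prints the chart's default values.yaml.
  helm show values nextcloud/nextcloud > default-values.yaml
  # diff exits non-zero when the files differ, which is expected here.
  diff -u default-values.yaml "$mine"
}
```

Calling diff_values with no argument compares against your-values.yaml in the current directory; pass another path as the first argument if your file is named differently.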
cronjob
Nextcloud needs to perform various maintenance tasks regularly (see this page for details), and that's done with a task scheduled with cron. The Helm chart creates a Kubernetes cronjob for this, so you need to make sure you enable it with a schedule so that the jobs run frequently, say every 5 minutes:
cronjob:
  annotations: {}
  curlInsecure: false
  enabled: true
  failedJobsHistoryLimit: 5
  image: {}
  schedule: '*/5 * * * *'
  successfulJobsHistoryLimit: 2
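Once the chart is installed, something like the following can confirm that the cron job exists and is actually firing. It is a sketch that assumes the nextcloud namespace used in the Installation section; show_cron_status is a hypothetical helper name, not something the chart provides.

```shell
# Hypothetical helper: list the CronJob and the Jobs it has spawned.
# Assumes the release lives in the "nextcloud" namespace (override with $1).
show_cron_status() {
  ns="${1:-nextcloud}"
  kubectl get cronjobs --namespace "$ns"
  # Sorted by creation time, so the most recent run is at the bottom.
  kubectl get jobs --namespace "$ns" --sort-by=.metadata.creationTimestamp
}
```

With the schedule above you would expect a new job roughly every 5 minutes, with older ones pruned according to the two history limits.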
horizontal pod autoscaler (HPA)
By default, the Helm chart configures an HPA to scale the pods of the main Nextcloud deployment as the load increases. That's a problem if you use a ReadWriteOnce persistent volume for Nextcloud, since that volume cannot be shared by multiple pods. So if you want to be able to scale Nextcloud, you will have to use storage that supports the ReadWriteMany access mode. To keep things simple, and because I only need my Nextcloud instance for a few users, I prefer disabling the HPA and keeping just one pod running. Your mileage may vary. To disable the HPA, change the settings as follows:
hpa:
  cputhreshold: 60
  enabled: false
  maxPods: 10
  minPods: 1
image
At the time of this writing, the latest version of the Helm chart still installs Nextcloud 19, while the latest release is 20.0.7. To install the latest release you can override the image tag:
image:
  pullPolicy: IfNotPresent
  repository: nextcloud
  tag: 20.0.7-apache
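To check which Nextcloud version a given chart release installs by default, you can ask Helm directly. A minimal sketch, assuming the repo was added under the name nextcloud as in the Installation section; list_chart_versions is a hypothetical helper name.

```shell
# Hypothetical helper: show which Nextcloud version each chart release ships.
list_chart_versions() {
  helm repo update
  # The APP VERSION column is the Nextcloud version the chart installs
  # when image.tag is not overridden.
  helm search repo nextcloud/nextcloud --versions
}
```

If the APP VERSION of the newest chart already matches the release you want, the tag override is not needed.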
ingress
In most cases, you will want to access your Nextcloud instance with an ingress resource, with an SSL/TLS certificate issued with Let's Encrypt using cert-manager (which is the most common configuration). You'll need to add a couple of annotations and the TLS settings for that:
ingress:
  annotations:
    cert-manager.io/cluster-issuer: "issuer or cluster issuer"
    kubernetes.io/tls-acme: "true"
  enabled: true
  labels: {}
  tls:
    - hosts:
        - nextcloud.domain.com
      secretName: nextcloud-tls
Make sure you specify the correct name of your cert-manager issuer as well as the hostname you want to use to access Nextcloud. Also make sure you configure a DNS record for that hostname pointing to your cluster before deploying the Helm chart, so that the certificate can be provisioned more quickly.
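A quick way to check both the DNS record and the certificate is sketched below. It assumes cert-manager is installed, that dig is available on your machine, and that the release lives in the nextcloud namespace; check_tls is a hypothetical helper name.

```shell
# Hypothetical helper: check DNS and certificate state for the ingress host.
# Assumes cert-manager is installed and the "nextcloud" namespace is used.
check_tls() {
  host="${1:-nextcloud.domain.com}"
  # Should print the public IP of your ingress controller / load balancer.
  dig +short "$host"
  # READY turns to True once the certificate has been issued.
  kubectl get certificate --namespace nextcloud
}
```

Run the DNS part before deploying and the certificate part afterwards; if READY stays False for long, kubectl describe on the certificate usually shows why.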
database
You have three options for the database used by Nextcloud: an "internal" database powered by SQLite, MariaDB, or PostgreSQL. I prefer MariaDB, so that's what I enable and configure here:
internalDatabase:
  enabled: false
mariadb:
  db:
    name: nextcloud
    password: db-password
    user: nextcloud
  enabled: true
  master:
    persistence:
      accessMode: ReadWriteOnce
      enabled: true
      size: 8Gi
  replication:
    enabled: false
  rootUser:
    password: root-db-password
    forcePassword: true
postgresql:
  enabled: false
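The values db-password and root-db-password above are placeholders. One way to generate replacements is sketched here; gen_password is a hypothetical helper, and the approach (random bytes printed as hex) is just one reasonable choice, not something the chart requires.

```shell
# Hypothetical helper: print a random hex string of 2*N characters
# (N random bytes, default 16). Hex avoids characters that would need
# quoting in YAML or PHP.
gen_password() {
  head -c "${1:-16}" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

Generate each password once and paste it into your values file; generating a fresh one on every upgrade would no longer match what the database was initialized with.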
metrics
This is optional, and only needed if you want to monitor metrics with Prometheus/Grafana:
metrics:
  enabled: true
  https: false
  image:
    pullPolicy: IfNotPresent
    repository: xperimental/nextcloud-exporter
    tag: v0.3.0
  replicaCount: 1
  service:
    annotations:
      prometheus.io/port: '9205'
      prometheus.io/scrape: 'true'
    labels: {}
    type: ClusterIP
  timeout: 5s
Nextcloud config files
Nextcloud uses a main config.php file for its configuration, but with the Helm chart you can use some custom config files to organize the configuration more easily. The first file is custom.config.php and is configured with this yaml:
nextcloud:
  configs:
    custom.config.php: |-
      <?php
      $CONFIG = array (
        'overwriteprotocol' => 'https',
        'overwrite.cli.url' => 'https://nextcloud.domain.com',
        // boolean and integer values rather than quoted strings
        'filelocking.enabled' => true,
        'loglevel' => 2,
        'enable_previews' => false
      );
The first two settings, overwriteprotocol and overwrite.cli.url, fix some issues you may have when accessing Nextcloud over HTTPS. filelocking.enabled enables file locking to prevent problems when multiple clients access or update the same file; I've read some reports saying that this is not needed when an object storage service is the primary storage, so you may want to try setting it to false too. I have it enabled and have had no issues with it. loglevel is set to 2 (warnings and above), which prevents things like deprecation notices from filling the main log, and there can be many of those when apps get updated.
The final setting, enable_previews, is set to false for performance reasons. By default, when you open a folder with many photos in the Nextcloud web interface or in a mobile client, Nextcloud makes a lot of HTTP requests to the server to load previews/thumbnails. This can cause very high load on the server (or in the container) if the folder contains a lot of files, so after running into this I prefer disabling previews altogether. Another option is to install the preview generator app, which lets you generate the previews in advance, so the HTTP requests are a lot faster than when Nextcloud has to generate previews on the fly. This is much better, but it can still cause high load with many files.
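If you go with the preview generator app instead of disabling previews, it is driven through Nextcloud's occ command line tool inside the pod. The sketch below assumes the Deployment is called nextcloud in the nextcloud namespace and that the official image is used, where occ has to run as the www-data user; the occ shell function itself is a hypothetical wrapper.

```shell
# Hypothetical helper: run an occ command inside the Nextcloud pod.
# Assumes Deployment "nextcloud" in namespace "nextcloud" and the official
# image, where occ must be run as www-data rather than root.
occ() {
  kubectl exec --namespace nextcloud deploy/nextcloud -- \
    su -s /bin/sh www-data -c "php occ $*"
}

# Example usage (not executed here):
#   occ app:install previewgenerator
#   occ preview:generate-all
```

The same wrapper is handy for inspecting settings, for example occ config:system:get loglevel.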
redis configuration
The next custom config file is for caching with Redis, which can improve performance a lot. It goes under the same nextcloud.configs key as the previous file:
    redis.config.php: |-
      <?php
      $CONFIG = array (
        'memcache.local' => '\\OC\\Memcache\\Redis',
        'memcache.distributed' => '\\OC\\Memcache\\Redis',
        'memcache.locking' => '\\OC\\Memcache\\Redis',
        'redis' => array(
          'host' => getenv('REDIS_HOST'),
          'port' => getenv('REDIS_HOST_PORT') ?: 6379,
          'password' => getenv('REDIS_HOST_PASSWORD')
        )
      );
This overrides the default config so that it takes the Redis password into account; by default the Helm chart installs Redis without a password, but I've had authentication issues because of that. The Redis settings are available as environment variables set by the chart when Redis is enabled. As a side note, Redis will be installed in a master-slave configuration with two slaves.
S3 configuration
The last custom config file is for the S3 configuration, which in my case is for Wasabi. It also goes under nextcloud.configs:
    s3.config.php: |-
      <?php
      $CONFIG = array (
        'objectstore' => array(
          'class' => '\\OC\\Files\\ObjectStore\\S3',
          'arguments' => array(
            'bucket' => 'bucket-name',
            'autocreate' => true,
            'key' => 's3-access-key',
            'secret' => 's3-secret-key',
            'region' => 's3-region',
            'hostname' => 's3-endpoint',
            'use_ssl' => true
          )
        )
      );
Pretty straightforward: you just need to specify the name of the bucket, the credentials, and the region and endpoint hostname of your S3 provider.
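To confirm that Nextcloud is really writing to the bucket, you can list it with any S3 client. The sketch below uses the AWS CLI and assumes it is installed, with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY exported using the same credentials as in s3.config.php; list_bucket is a hypothetical helper name.

```shell
# Hypothetical helper: list objects in the bucket through the AWS CLI.
# Assumes the AWS CLI is installed and the credentials are exported as
# AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
list_bucket() {
  bucket="$1"
  endpoint="$2"   # e.g. https://s3.wasabisys.com for Wasabi's us-east-1 region
  aws s3 ls "s3://$bucket" --endpoint-url "$endpoint"
}

# Example usage (not executed here):
#   list_bucket bucket-name https://s3.wasabisys.com
```

Don't expect to see your file names there: with object storage as primary storage, Nextcloud stores each file as an object named urn:oid: followed by a numeric ID and keeps the names and folder structure in its database.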
default configs
Because we are overriding the Redis config due to the issue with authentication, we need to disable the default config file that is otherwise created for Redis:
  defaultConfigs:
    .htaccess: true
    apache-pretty-urls.config.php: true
    apcu.config.php: true
    apps.config.php: true
    autoconfig.php: false
    redis.config.php: false
    smtp.config.php: true
hostname and admin user
Not sure why, but the hostname used to access Nextcloud is specified separately from the ingress settings, under the nextcloud key. You also need to configure the username and password for the first admin user there:
  host: nextcloud.domain.com
  password: admin-password
  username: admin-user
email settings
I am currently not using this, but you can optionally configure an SMTP service so that Nextcloud can send email notifications:
  mail:
    domain: domain.com
    enabled: false
    fromAddress: user
    smtp:
      authtype: LOGIN
      host: domain.com
      name: user
      password: pass
      port: 465
      secure: ssl
persistence
This config is for the persistent volume used by Nextcloud itself. The user data will be stored in S3 but Nextcloud still needs to store some other data, and for that a persistent volume is created:
persistence:
  accessMode: ReadWriteOnce
  annotations: {}
  enabled: true
  size: 8Gi
redis deployment
As mentioned earlier, we need to ensure that the Redis deployment is configured with a password; otherwise you will likely hit issues where Nextcloud cannot communicate with Redis due to failed authentication:
redis:
  enabled: true
  password: 'redis-password'
  usePassword: true
replica count
If you, like me, want to keep things simple and disable HPA so that you can use a standard ReadWriteOnce volume for Nextcloud, make sure that the deployment is configured with a single replica:
replicaCount: 1
Installation
Finally, to install Nextcloud together with MariaDB and Redis, add the Nextcloud repo to Helm and install the chart:
helm repo add nextcloud https://nextcloud.github.io/helm/
helm repo update
kubectl create ns nextcloud
helm upgrade --install --namespace nextcloud -f your-values.yaml nextcloud nextcloud/nextcloud
As soon as the certificate for the domain is ready, you should be able to access Nextcloud with your admin credentials.
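While you wait, you can follow the rollout from the command line. This is a sketch that assumes the release name nextcloud used above results in a Deployment also called nextcloud; wait_for_nextcloud is a hypothetical helper name and the 10 minute timeout is an arbitrary choice.

```shell
# Hypothetical helper: wait for the Nextcloud Deployment and show its state.
# Assumes the Deployment is named "nextcloud" (release name used above).
wait_for_nextcloud() {
  ns="${1:-nextcloud}"
  # Blocks until the pod is ready or the timeout expires.
  kubectl rollout status deployment/nextcloud --namespace "$ns" --timeout=10m
  kubectl get pods,ingress --namespace "$ns"
}
```

The first start tends to take longer than later ones, since Nextcloud runs its initial installation against the database before the pod becomes ready.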
Conclusions
It's pretty easy to install Nextcloud in your Kubernetes cluster, but like I said there are some issues with the default configuration provided by the Helm chart. Hopefully the configuration that I shared will save you some time.