has_many :codes

Self hosted Nextcloud in Kubernetes with S3 as primary storage


Introduction

Various services exist providing cloud storage, file syncing/sharing and collaboration, with the most popular ones being Dropbox, Google Drive and Microsoft OneDrive/Office 365. Dropbox has some additional features but the core focus is file syncing and sharing. The other two offer many more features, including realtime collaborative document editing and videoconferencing. These services are inexpensive but there are some implications concerning privacy when using them.

Nextcloud is one of the most popular (if not the most popular) self hosted/on-premises platforms for file sharing, collaboration and more. Unlike with the aforementioned centralized cloud storage services, your data remains under your control and you can access it and sync it with a variety of devices through a web interface as well as desktop and mobile clients. It's open source and totally free to use (with optional paid support and an enterprise edition if you need it), and very rich in features, especially with the many add-ons (or "apps") available that extend the platform with custom functionality.

I particularly love the integration with OnlyOffice and Collabora Online, which makes Nextcloud a good self hosted alternative to Google Drive and Office 365 for realtime, collaborative document editing. I use OnlyOffice because it's lighter, with a better interface and much better compatibility with Microsoft's file formats. Other commonly used features are calendars, contacts, and the Talk app that adds instant messaging and video conferencing to the platform. But really there's a ton of features you can get by installing the many apps available. One feature that many people would like and that is missing is email hosting; Nextcloud offers a Mail app but that's just a client, so you will need to use something else to host your email.

Nextcloud is written in PHP and started as a fork of another open source project, ownCloud, but it has been evolving on its own for a while now and has become more popular than ownCloud itself. It focuses on privacy of your data (which is especially important in GDPR-land) and ease of use, as well as enterprise grade security and control.

Several managed hosting services for Nextcloud exist - I was previously using Hetzner's "Storage Share" service - but most people will likely prefer hosting it themselves, and that's what I am doing now too.

There are various ways to install Nextcloud on your servers, but in this brief post we'll see how to deploy it in Kubernetes using the official Helm chart. In most cases, Nextcloud is configured to use local or network based block storage to store the data, but I prefer using an S3 compatible object storage service such as Wasabi. As one might expect, there is a performance hit compared to block storage if you have a lot of data to sync, but it is usually noticeable only when you sync all the data the first time. Subsequent incremental sync is usually reasonably fast. The advantage with an object storage service is that your storage scales with your data seamlessly, so you don't have to increase storage capacity (like resizing persistent volumes in Kubernetes) as the amount of data grows. Object storage is also typically cheaper than block storage.

Deployment in Kubernetes

The reason I am writing this post is that the Helm chart is confusing, and it took me some trial and error to get a fully working deployment. Some settings are missing from the reference values.yaml file, which leads to errors and to things that work poorly or not at all.

So in this post I'm basically just sharing what I changed from the default values.yaml in the repo, to save you some time.
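
To see exactly what you are overriding, it can help to dump the chart's default values and diff them against your own file. This is only a minimal sketch: it assumes the chart repo has already been added under the name nextcloud (as in the Installation section below) and that your overrides live in your-values.yaml. The function names and the default-values.yaml file name are mine, not part of Helm.

```shell
# Dump the chart's default values to a file for reference.
# Assumes `helm repo add nextcloud https://nextcloud.github.io/helm/` was already run.
dump_default_values() {
  helm show values nextcloud/nextcloud > "${1:-default-values.yaml}"
}

# Show how your overrides differ from the defaults.
# Note: diff exits with status 1 when the files differ.
compare_values() {
  diff -u "${1:-default-values.yaml}" "${2:-your-values.yaml}"
}

# Usage:
#   dump_default_values default-values.yaml
#   compare_values default-values.yaml your-values.yaml
```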

cronjob

Nextcloud needs to perform various maintenance tasks regularly (see this page for details), and that's done with a task scheduled with cron. The Helm chart creates a Kubernetes CronJob for this, so make sure you enable it with a schedule that runs the jobs frequently, say every 5 minutes:

cronjob:
  annotations: {}
  curlInsecure: false
  enabled: true
  failedJobsHistoryLimit: 5
  image: {}
  schedule: '*/5 * * * *'
  successfulJobsHistoryLimit: 2
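
Once the chart is deployed, you can check that the CronJob exists and is actually spawning jobs. A small sketch, assuming the nextcloud namespace used later in this post; show_cron_status is just a helper name I made up.

```shell
# NAMESPACE is an assumption: it matches the namespace created in the Installation section.
NAMESPACE="${NAMESPACE:-nextcloud}"

show_cron_status() {
  # The CronJob created by the chart, with its schedule and last run time.
  kubectl get cronjob -n "$NAMESPACE"
  # Jobs spawned so far, oldest first, so the most recent run is at the bottom.
  kubectl get jobs -n "$NAMESPACE" --sort-by=.metadata.creationTimestamp
}

# Usage:
#   show_cron_status
```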

horizontal pod autoscaler (HPA)

By default, the Helm chart configures an HPA to scale the pods of the main Nextcloud deployment as the load increases. That's a problem if you use a ReadWriteOnce persistent volume for Nextcloud, since that volume can only be mounted by a single node at a time and so cannot be shared by pods spread across the cluster. So if you want to be able to scale Nextcloud you will have to use storage that supports the ReadWriteMany access mode. To keep things simple, and because I only need my Nextcloud instance for a few users, I prefer disabling the HPA and keeping just one pod running. Your mileage may vary. To disable the HPA change the settings as follows:

hpa:
  cputhreshold: 60
  enabled: false
  maxPods: 10
  minPods: 1

image

At the time of this writing, the latest version of the Helm chart still installs Nextcloud 19, while the latest release is 20.0.7. To install the latest release you can override the image tag:

image:
  pullPolicy: IfNotPresent
  repository: nextcloud
  tag: 20.0.7-apache

ingress

In most cases, you will want to access your Nextcloud instance with an ingress resource, with an SSL/TLS certificate issued with Let's Encrypt using cert-manager (which is the most common configuration). You'll need to add a couple of annotations and the TLS settings for that:

ingress:
  annotations:
    cert-manager.io/cluster-issuer: "issuer or cluster issuer"
    kubernetes.io/tls-acme: "true"
  enabled: true
  labels: {}
  tls:
    - hosts:
        - nextcloud.domain.com
      secretName: nextcloud-tls

Make sure you specify the correct name of your cert-manager issuer as well as the hostname you want to use to access Nextcloud. If you use a namespaced Issuer rather than a ClusterIssuer, the annotation key is cert-manager.io/issuer instead. Also make sure you configure a DNS record for that hostname pointing to your cluster before deploying the Helm chart, so that the certificate can be issued without delay.
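
If you want to be sure the DNS record is visible before you deploy, a quick check along these lines may help. It is a sketch only: it uses getent, so it tells you whether the name resolves from your machine, not whether it points at the right cluster. The function names and the DNS_POLL_SECONDS variable are hypothetical.

```shell
# Succeed if the given hostname resolves to at least one address.
# Swap in `dig +short` or `nslookup` if getent is not available on your system.
dns_ready() {
  getent hosts "$1" > /dev/null 2>&1
}

# Poll until the record is visible, giving up after a number of attempts.
wait_for_dns() {
  host="$1"
  attempts="${2:-30}"
  while [ "$attempts" -gt 0 ]; do
    if dns_ready "$host"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${DNS_POLL_SECONDS:-10}"
  done
  return 1
}

# Usage:
#   wait_for_dns nextcloud.domain.com && echo "DNS record is visible"
```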

database

You have three options for the database used by Nextcloud: an "internal" SQLite database, MariaDB, or PostgreSQL. I prefer MariaDB, so that's what I enable and configure here:

internalDatabase:
  enabled: false
mariadb:
  db:
    name: nextcloud
    password: db-password
    user: nextcloud
  enabled: true
  master:
    persistence:
      accessMode: ReadWriteOnce
      enabled: true
      size: 8Gi
  replication:
    enabled: false
  rootUser:
    password: root-db-password
    forcePassword: true
postgresql:
  enabled: false
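
The db-password and root-db-password values above are placeholders, and the same goes for the Redis and admin passwords further down. One way to generate random values for them is sketched below; gen_password is a helper name of my own, and hex output is just a convenient choice that avoids quoting problems in YAML.

```shell
# Print a random hex string built from N random bytes
# (default 16 bytes, which gives 32 hex characters).
gen_password() {
  od -An -N"${1:-16}" -tx1 /dev/urandom | tr -d ' \n'
}

# Usage: generate one value per placeholder and paste it into your values file.
#   gen_password      # e.g. for db-password
#   gen_password 24   # a longer one, e.g. for root-db-password
```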

metrics

This is optional; enable it if you want to monitor Nextcloud metrics with Prometheus/Grafana:

metrics:
  enabled: true
  https: false
  image:
    pullPolicy: IfNotPresent
    repository: xperimental/nextcloud-exporter
    tag: v0.3.0
  replicaCount: 1
  service:
    annotations:
      prometheus.io/port: '9205'
      prometheus.io/scrape: 'true'
    labels: {}
    type: ClusterIP
  timeout: 5s

Nextcloud config files

Nextcloud uses a main config.php file for its configuration, but with the Helm chart you can use some custom config files to organize the configuration more easily. The first file is custom.config.php and is configured with this yaml:

nextcloud:
  configs:
    custom.config.php: |-
      <?php
      $CONFIG = array (
        'overwriteprotocol' => 'https',
        'overwrite.cli.url' => 'https://nextcloud.domain.com',
        'filelocking.enabled' => true,
        'loglevel' => 2,
        'enable_previews' => false
      );

The first two settings, overwriteprotocol and overwrite.cli.url, fix some issues you may have when accessing Nextcloud over HTTPS.

filelocking.enabled enables file locking to prevent issues when multiple clients are accessing or updating the same file. I've read some reports saying that this is not needed if you use an object storage service as primary storage, so you may want to try disabling it too (use the boolean false, since PHP treats the quoted string 'false' as true). I have it enabled and have had no issues with it, though.

loglevel is set to 2 (warnings and above), which prevents things like deprecation notices from filling the main log; there can be several of those when apps get updated.

The final setting, enable_previews, is set to false for performance reasons. By default, when you open a folder with many photos either in the web interface of Nextcloud or with a mobile client, Nextcloud will make a lot of HTTP requests to the server to load previews/thumbnails. This can cause a very high load on the server (or in the container) if the folder contains a lot of files, so I prefer disabling previews altogether after having had some issues with that. Another option is to install the Preview Generator app, which allows you to generate the previews in advance, so the HTTP requests are a lot faster than when Nextcloud has to generate the previews on the fly. This is much better, but it can still cause high load with many files.

redis configuration

The next custom config file is for caching with Redis, which can improve performance a lot. 

    redis.config.php: |-
      <?php
      $CONFIG = array (
        'memcache.local' => '\\OC\\Memcache\\Redis',
        'memcache.distributed' => '\\OC\\Memcache\\Redis',
        'memcache.locking' => '\\OC\\Memcache\\Redis',
        'redis' => array(
          'host' => getenv('REDIS_HOST'),
          'port' => getenv('REDIS_HOST_PORT') ?: 6379,
          'password' => getenv('REDIS_HOST_PASSWORD')
        )
      );

This overrides the default config so that it takes the Redis password into account; by default the Helm chart installs Redis without a password, but I've had authentication issues because of that. The Redis settings are available as environment variables set by the Redis deployment. As a side note, Redis will be installed in a master-slave configuration with two slaves.

S3 configuration

The last custom config file is for the S3 configuration, which in my case is for Wasabi.

    s3.config.php: |-
      <?php
      $CONFIG = array (
        'objectstore' => array(
          'class' => '\\OC\\Files\\ObjectStore\\S3',
          'arguments' => array(
            'bucket'     => 'bucket-name',
            'autocreate' => true,
            'key'        => 's3-access-key',
            'secret'     => 's3-secret-key',
            'region'     => 's3-region',
            'hostname'   => 's3-endpoint',
            'use_ssl'    => true
          )
        )
      );

Pretty straightforward: you just need to specify the name of the bucket, the credentials, and the region and endpoint of your S3 provider.
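
Before deploying, it may be worth confirming that the credentials and endpoint work. The sketch below assumes you have the AWS CLI installed and the credentials exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; check_bucket is a made-up helper, and bucket-name and s3-endpoint are the same placeholders as in the config above. Since autocreate is set to true, the bucket may not exist yet, in which case the listing fails even though the credentials are fine.

```shell
# List the contents of a bucket on an S3-compatible endpoint.
# Credentials are read from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
check_bucket() {
  bucket="$1"
  endpoint="$2"
  aws s3 ls "s3://$bucket" --endpoint-url "https://$endpoint"
}

# Usage:
#   check_bucket bucket-name s3-endpoint
```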

default configs

Because we are overriding the Redis config due to the issue with authentication, we need to disable the default config file that is otherwise created for Redis:

nextcloud:
  defaultConfigs:
    .htaccess: true
    apache-pretty-urls.config.php: true
    apcu.config.php: true
    apps.config.php: true
    autoconfig.php: false
    redis.config.php: false
    smtp.config.php: true

hostname and admin user

Not sure why, but the hostname used to access Nextcloud is specified separately from the ingress settings, under the nextcloud key. You also need to configure a username and password for the first admin user:

nextcloud:
  host: nextcloud.domain.com
  password: admin-password
  username: admin-user

email settings

I am currently not using this, but you can optionally configure an SMTP service so that Nextcloud can send email notifications:

  mail:
    domain: domain.com
    enabled: false
    fromAddress: user
    smtp:
      authtype: LOGIN
      host: domain.com
      name: user
      password: pass
      port: 465
      secure: ssl

persistence

This config is for the persistent volume used by Nextcloud itself. The user data will be stored in S3 but Nextcloud still needs to store some other data, and for that a persistent volume is created:

persistence:
  accessMode: ReadWriteOnce
  annotations: {}
  enabled: true
  size: 8Gi

redis deployment

As mentioned earlier, we need to ensure that the Redis deployment is configured with a password, otherwise you will likely hit issues where Nextcloud cannot communicate with Redis due to failed auth:

redis:
  enabled: true
  password: 'redis-password'
  usePassword: true

replica count

If you, like me, want to keep things simple and disable HPA so that you can use a standard ReadWriteOnce volume for Nextcloud, make sure that the deployment is configured with a single replica:

replicaCount: 1

Installation

Finally, to install Nextcloud together with MariaDB and Redis, add the Nextcloud repo to Helm and install the chart:

helm repo add nextcloud https://nextcloud.github.io/helm/
helm repo update

kubectl create ns nextcloud

helm upgrade --install --namespace nextcloud -f your-values.yaml nextcloud nextcloud/nextcloud

As soon as the certificate for the domain is ready, you should be able to access Nextcloud with your admin credentials.
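
While you wait, you can keep an eye on the rollout and the certificate. This is a rough sketch: it assumes the release and namespace are both called nextcloud as in the commands above, that the main deployment therefore ends up named nextcloud, and that cert-manager is installed (otherwise the certificate resource type does not exist). check_install is a hypothetical helper name.

```shell
# NAMESPACE matches the namespace created with `kubectl create ns nextcloud`.
NAMESPACE="${NAMESPACE:-nextcloud}"

check_install() {
  # Wait for the main deployment to finish rolling out.
  # The deployment name "nextcloud" is an assumption based on the release name.
  kubectl rollout status deployment/nextcloud -n "$NAMESPACE" --timeout=10m
  # Pods for Nextcloud, MariaDB and Redis should all end up Running.
  kubectl get pods -n "$NAMESPACE"
  # READY turns True once cert-manager has issued the certificate.
  kubectl get certificate -n "$NAMESPACE"
}

# Usage:
#   check_install
```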

Conclusions

It's pretty easy to install Nextcloud in your Kubernetes cluster, but like I said there are some issues with the default configuration provided by the Helm chart. Hopefully the configuration that I shared will save you some time.

© Vito Botta