Postgres on Kubernetes with the Zalando operator


So after quite a bit of testing I think I'm going to use the Zalando operator, which as far as I know has been used in production at Zalando for a couple of years already, with hundreds of clusters and databases, so in theory it should be reliable. While testing I did run into some issues, because I use Scaleway object storage for backups instead of AWS S3 (Scaleway is S3 compatible) and had to figure out some workarounds by reading the code, since the documentation is lacking when it comes to some configuration details. A Reddit user helped with some of this.
Installation
These are the relevant settings in the operator's values file (charts/postgres-operator/values-crd.yaml, used with Helm below):

configKubernetes:
  enable_pod_antiaffinity: true
  pod_environment_configmap: "postgres-pod-config"
  watched_namespace: "*"
configAwsOrGcp:
  aws_region: "<s3 region>"
  aws_endpoint: "<s3 endpoint url>"
  wal_s3_bucket: "<s3 bucket>"
configLogicalBackup:
  logical_backup_s3_access_key_id: "<s3 access key id>"
  logical_backup_s3_bucket: "<s3 bucket>"
  logical_backup_s3_endpoint: "<s3 endpoint url>"
  logical_backup_s3_secret_access_key: "<s3 secret access key>"
  logical_backup_s3_sse: ""  # server side encryption method, e.g. AES256; I leave this blank since it's not supported by Scaleway
  logical_backup_schedule: "<backup schedule in cron format>"
pod_environment_configmap specifies the name of a config map in which we define some settings for the cluster to enable WAL archiving to object storage and cloning/point in time recovery. watched_namespace is set to an empty string by default, which means the operator only watches its own namespace for the custom resources required to create clusters. Setting it to * makes it possible to create clusters in other namespaces, which is likely what you want. I am not 100% sure whether the configAwsOrGcp section is actually needed when using s3 compatible storage. The logical backup settings should be self explanatory. Note that there is currently no setting for the region, so I have submitted a PR to add logical_backup_s3_region, since with some s3 compatible services like Scaleway the upload of logical backups fails without it. Hopefully it will be merged soon. In the meantime I have built a custom image for the logical backups from the relevant Dockerfile in the repo, adding the --region parameter to the aws CLI command in dump.sh.
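To give an idea of the workaround (this is only a sketch: the exact upload command and variable names in dump.sh may differ between operator versions, and the image/registry names below are placeholders), the change amounts to passing --region to the aws CLI upload and then pointing the operator at the custom image via the logical_backup_docker_image setting:

# In dump.sh (illustrative; the variable names here are hypothetical): add --region to the upload
aws s3 cp - "$PATH_TO_BACKUP" --region "$LOGICAL_BACKUP_S3_REGION"

# Build and push the customised logical backup image
docker build -t <registry>/postgres-logical-backup:<tag> .
docker push <registry>/postgres-logical-backup:<tag>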
Once the configuration is ready, we can install the operator with Helm:
kubectl create ns postgres-operator

helm upgrade --install \
  --namespace postgres-operator \
  -f ./charts/postgres-operator/values-crd.yaml \
  postgres-operator \
  ./charts/postgres-operator
I am assuming Helm 3 here, which does not create the namespace for you, so you need to create it first (I use postgres-operator).
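Before moving on, it's worth checking that the operator actually started; something like the following should show a running pod and no errors in the logs (the deployment name matches the Helm release name used above):

kubectl -n postgres-operator get pods
kubectl -n postgres-operator logs deployment/postgres-operator | tail -n 20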
PgAdmin

To manage the databases I also install PgAdmin, using the chart from the stable Helm repo:
helm upgrade --install \
  --namespace postgres-operator \
  --set env.email=<your email used as username for login> \
  --set env.password=<password> \
  --set persistence.enabled=true \
  --set persistence.size=1Gi \
  --set service.type=ClusterIP \
  --set ingress.enabled=false \
  pgadmin \
  stable/pgadmin
You can optionally enable an ingress, but I prefer not to expose PgAdmin to the web and open it with a port-forward instead:
kubectl -n postgres-operator port-forward svc/pgadmin <some local port>:80
Creating a cluster

First, create a namespace for the cluster, plus the config map referenced earlier by pod_environment_configmap; this is where WAL archiving to object storage is configured:
kubectl create ns postgres-cluster

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-pod-config  # The name must be the same as configured in the values file earlier
  namespace: postgres-cluster
data:
  BACKUP_SCHEDULE: "0 */12 * * *"  # Schedule a base backup every 12 hours; customise as you wish
  USE_WALG_BACKUP: "true"          # Use the golang backup tool (wal-g), which is faster
  BACKUP_NUM_TO_RETAIN: "14"       # With 2 backups per day, keep 7 days of base backups
  AWS_ACCESS_KEY_ID: <s3 access key>
  AWS_SECRET_ACCESS_KEY: <s3 secret key>
  AWS_ENDPOINT: <s3 endpoint url>
  AWS_REGION: <s3 region>
  WALG_DISABLE_S3_SSE: "true"      # I disable SSE because it's not supported by Scaleway
EOF
Next, edit manifests/complete-postgres-manifest.yaml as follows:
metadata:
  name: postgres-cluster
  namespace: postgres-cluster
spec:
  teamId: "postgres"
  volume:
    size: <volume size>
    storageClass: <storage class>
  numberOfInstances: 2
  enableShmVolume: true
  postgresql:
    parameters:
      max_connections: "250"
  enableLogicalBackup: true
  logicalBackupSchedule: "00 05 * * *"
The teamId must be set to the first word of the name of the cluster, if I remember/understood correctly. I specify at least 2 instances so as to have a replica for HA. By default the cluster allows 10 connections, so I bump this since I usually need more than that. You could probably get away with fewer connections by using a connection pooler such as PgBouncer, but I'm not familiar with it and I am not sure I would actually need it, since I can just increase the number of connections in the cluster directly. Of course I enable the logical backups as well; I would likely use point in time recovery if needed, but it's nice to have daily dumps too. In the manifest, comment out the section about init containers because you will likely not need it. I also comment out the sections about users and databases, since I create them once I have access to the cluster, but you can also let the operator create these if you prefer.
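If you do let the operator manage them, the relevant sections of the manifest look roughly like this (the role and database names here are just examples):

spec:
  users:
    myapp:            # role to create; the list holds optional role flags
      - createdb
  databases:
    myappdb: myapp    # database name: owner role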
Finally, to create the cluster apply the manifest:
kubectl apply -f manifests/complete-postgres-manifest.yaml
This will usually take a few minutes. Once the cluster is ready, you can retrieve the password for the postgres user, needed to access the cluster with a client or from an app, with the following command:
kubectl -n postgres-cluster get secret postgres.postgres-cluster.credentials.postgresql.acid.zalan.do -o 'jsonpath={.data.password}' | base64 -d
Of course change the namespace if needed. That’s it. You now have a working Postgres cluster. You can create as many as you wish and the operator will take care of managing them.
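For a quick test from your own machine, you can port-forward the cluster's master service (which is named after the cluster) and log in with that password; apps running inside Kubernetes should of course just use the service directly:

kubectl -n postgres-cluster port-forward svc/postgres-cluster 5432:5432 &

PGPASSWORD=$(kubectl -n postgres-cluster get secret \
  postgres.postgres-cluster.credentials.postgresql.acid.zalan.do \
  -o 'jsonpath={.data.password}' | base64 -d) psql -h localhost -U postgres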
Logical backups
The logical backups run on the schedule configured in the manifest, but you can also trigger one manually by creating a job from the cron job that the operator sets up:

kubectl -n postgres-cluster create job backup-name --from=cronjob/logical-backup-postgres-cluster
You should see that a path like /spilo/<cluster name>/<cluster id>/logical-backups has been created in the s3 bucket, containing a compressed dump.
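You can also check from outside the cluster by listing that path with s3cmd, using the same placeholders as elsewhere in this post:

s3cmd \
  --access_key=<s3 access key> \
  --secret_key=<s3 secret key> \
  --host=<s3 endpoint url> \
  --host-bucket="<s3 host bucket>" \
  ls s3://<s3 bucket>/spilo/<cluster name>/<cluster id>/logical-backups/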
Restoring a logical backup
The following assumes you are running from a throwaway pod (e.g. a Debian/Ubuntu based image) in the target cluster's namespace, so that the required tools can be installed and the cluster service is reachable by name:

apt update
apt install -y python3-pip postgresql-client
pip3 install s3cmd

cat > ~/.pgpass <<EOF
postgres-cluster:5432:postgres:postgres:<password for the postgres user in the target cluster>
EOF
chmod 0600 ~/.pgpass

DUMP_URL=s3://.....   # URL of the .sql.gz dump

/usr/local/bin/s3cmd \
  --no-check-certificate \
  --access_key=<s3 access key> \
  --secret_key=<s3 secret key> \
  --host=<s3 endpoint url> \
  --host-bucket="<s3 host bucket - see the s3cmd reference for your object storage service>" \
  get --no-progress ${DUMP_URL} - | gunzip | psql -U postgres -h postgres-cluster
The hostname for the import with psql must be the name of the target cluster. This process will take some time depending on the size of the backup of course.
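Once the import has finished, a quick sanity check from the same pod is to list the databases and tables that came back, for example:

psql -U postgres -h postgres-cluster -c '\l'
psql -U postgres -h postgres-cluster -d <restored database> -c '\dt'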
Point in time recovery with WAL
For point in time recovery you create a cluster in much the same way as before. The only difference is that the config map gains a section specifying where the WAL archive is located in s3 and the date/time we want to restore to. So, as for a brand new cluster, create the namespace first and then the config map as follows:
kubectl create ns postgres-cluster

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-pod-config  # The name must be the same as configured in the values file earlier
  namespace: postgres-cluster
data:
  BACKUP_SCHEDULE: "0 */12 * * *"  # Schedule a base backup every 12 hours; customise as you wish
  USE_WALG_BACKUP: "true"          # Use the golang backup tool (wal-g), which is faster
  BACKUP_NUM_TO_RETAIN: "14"       # With 2 backups per day, keep 7 days of base backups
  AWS_ACCESS_KEY_ID: <s3 access key>
  AWS_SECRET_ACCESS_KEY: <s3 secret key>
  AWS_ENDPOINT: <s3 endpoint url>
  AWS_REGION: <s3 region>
  WALG_DISABLE_S3_SSE: "true"      # I disable SSE because it's not supported by Scaleway
  # Settings for point in time recovery/restore
  USE_WALG_RESTORE: "true"
  CLONE_METHOD: CLONE_WITH_WALE
  CLONE_AWS_ACCESS_KEY_ID: <s3 access key>
  CLONE_AWS_SECRET_ACCESS_KEY: <s3 secret key>
  CLONE_AWS_ENDPOINT: <s3 endpoint url>
  CLONE_AWS_REGION: <s3 region>
  CLONE_WAL_S3_BUCKET: <s3 bucket>
  CLONE_WAL_BUCKET_SCOPE_SUFFIX: "/<id of the original cluster>"
  CLONE_TARGET_TIME: "2020-02-04T12:49:03+00:00"  # point in the WAL archive that we want to restore to
  CLONE_SCOPE: postgres-cluster                   # name of the original cluster
EOF
Pay attention to the CLONE_WAL_BUCKET_SCOPE_SUFFIX setting, which should be the ID of the original cluster prefixed with a “/”. Finally, to restore the cluster, just edit and apply the manifest like we did for a brand new cluster.
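After the clone comes up, it's worth checking that Spilo actually restored from the WAL archive rather than initialising an empty cluster; grepping the logs of the first pod (pods are named <cluster name>-<ordinal>) for clone/wal-g activity is usually enough:

kubectl -n postgres-cluster logs postgres-cluster-0 | grep -i -E 'clone|wal-g'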