Streamline OpenShift Etcd Backup with Automation

Automatically store etcd backups on network storage (NFS)

This procedure automates the backup of OpenShift’s etcd database.

It schedules a CronJob and stores the backup files on an NFS share.

You can easily adapt the job to store backups on other network storage.

Initially, I prepared a playbook for creating etcd backups that used the etcdctl CLI. After a while, I realized that it was much more efficient to use OpenShift's native tooling, namely the cluster-backup.sh script available on the control plane nodes.

This setup was validated on OpenShift 4.14 and should also work on versions 4.10 and later.

Requirements:

1] Admin access to an OpenShift 4 cluster

2] An NFS server


Project (AKA Namespace)

Create a namespace to hold the backup resources.

$ oc new-project ocp-etcd-backup --description "OpenShift Backup Automation Tool" --display-name "Backup ETCD Automation"

Service Account

The job will be executed by a service account.

kind: ServiceAccount
apiVersion: v1
metadata:
  name: openshift-backup
  namespace: ocp-etcd-backup
  labels:
    app: openshift-backup

Cluster Role

We need a custom role so that the backup process can list nodes, manage the debug pod, and read its logs.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-etcd-backup
rules:
- apiGroups: [""]
  resources:
     - "nodes"
  verbs: ["get", "list"]
- apiGroups: [""]
  resources:
     - "pods"
     - "pods/log"
     - "pods/attach"
  verbs: ["get", "list", "create", "delete", "watch"]
- apiGroups: [""]
  resources:
     - "namespaces"
  verbs: ["get", "list", "create"]

Cluster Role Binding

Bind the custom role to the service account created for the backup job.

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openshift-backup
  labels:
    app: openshift-backup
subjects:
  - kind: ServiceAccount
    name: openshift-backup
    namespace: ocp-etcd-backup
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-etcd-backup
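
Once the binding is applied, one way to confirm that it took effect is to impersonate the service account with `oc auth can-i`. The sketch below is not part of the original procedure: the function name `check_backup_rbac` is made up for illustration, and it assumes you are logged in as a user who is allowed to impersonate service accounts (cluster admins are).

```shell
#!/usr/bin/env bash
# Identity of the backup service account as the API server sees it.
SA="system:serviceaccount:ocp-etcd-backup:openshift-backup"

# Hypothetical helper: asks the API server whether the service account may
# perform a few of the actions granted by the cluster-etcd-backup role.
# "oc auth can-i" prints "yes" or "no" and exits non-zero on "no",
# so "|| true" keeps the loop going through every check.
check_backup_rbac() {
  local verb resource
  while read -r verb resource; do
    printf '%s %s: ' "$verb" "$resource"
    oc auth can-i "$verb" "$resource" --as="$SA" -n ocp-etcd-backup </dev/null || true
  done <<'EOF'
list nodes
create pods
delete pods
EOF
}
```

Calling `check_backup_rbac` should print `yes` on every line if the role and the binding are in place.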

Security Context Constraints

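The service account needs the privileged security context constraint (SCC), because the job starts a debug pod on a control plane node.

$ oc adm policy add-scc-to-user privileged -z openshift-backup -n ocp-etcd-backup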

Backup CronJob

This schedule runs the backup every day at 01:00 in the cluster's time zone (usually UTC).

* Replace <NFS-SERVER-IP>:<SHARED-PATH> with the details of your NFS server.

kind: CronJob
apiVersion: batch/v1
metadata:
  name: openshift-backup
  namespace: ocp-etcd-backup
  labels:
    app: openshift-backup
spec:
  schedule: "0 1 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  jobTemplate:
    metadata:
      labels:
        app: openshift-backup
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            app: openshift-backup
        spec:
          containers:
            - name: backup
              image: "registry.redhat.io/openshift4/ose-cli"
              command:
                - "/bin/bash"
                - "-c"
                - oc get no -l node-role.kubernetes.io/master --no-headers -o name | head -n 1 | xargs -I {} -- oc debug {} --to-namespace=ocp-etcd-backup -- bash -c 'chroot /host rm -rf /home/core/backup && chroot /host mkdir -p /home/core/backup && chroot /host sudo -E mount -t nfs <NFS-SERVER-IP>:<SHARED-PATH> /home/core/backup && chroot /host sudo -E /usr/local/bin/cluster-backup.sh /home/core/backup && chroot /host sudo -E find /home/core/backup/ -type f -mmin +"1" -delete'
          restartPolicy: "Never"
          terminationGracePeriodSeconds: 30
          activeDeadlineSeconds: 500
          dnsPolicy: "ClusterFirst"
          serviceAccountName: "openshift-backup"
          serviceAccount: "openshift-backup"
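
Note that the final `find ... -mmin +"1" -delete` step in the command above removes files on the share that were last modified more than about a minute ago, so with a daily schedule only the newest backup is likely to survive each run. If you want to keep several days of backups, a longer window is needed. The sketch below shows the idea on a scratch directory rather than the real mount. `RETENTION_DAYS` and `prune_backups` are illustrative names, not part of the original job, and `touch -d` assumes GNU coreutils.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical retention window in days; not part of the original job.
RETENTION_DAYS=7

# Delete files in the given directory that are older than RETENTION_DAYS.
prune_backups() {
  local dir="$1"
  find "$dir" -type f -mmin +"$((RETENTION_DAYS * 24 * 60))" -delete
}

# Demonstration on a scratch directory instead of the real NFS mount.
demo_dir="$(mktemp -d)"
touch -d '10 days ago' "$demo_dir/snapshot_old.db"
touch -d '2 days ago'  "$demo_dir/snapshot_recent.db"
touch "$demo_dir/snapshot_new.db"

prune_backups "$demo_dir"
ls "$demo_dir"   # only the two files newer than seven days remain
```

In the CronJob itself, the equivalent change would be to replace the `+"1"` value in the `find` step with the number of minutes you want to retain.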

Manually running a backup

Run the backup job manually.

$ oc create job backup --from=cronjob/openshift-backup

If it runs correctly, the result will be similar to the one below.

$ oc get jobs.batch
NAME                        COMPLETIONS   DURATION   AGE
backup                      1/1           24s        18m

Scheduled runs created by the CronJob appear in the same list.

$ oc get jobs.batch
NAME                        COMPLETIONS   DURATION   AGE
backup                      1/1           24s        18m
openshift-backup-28514510   1/1           24s        10m
openshift-backup-28514520   0/1           17s        17s <-- RUNNING BACKUP

To see the result of a specific job, use the following command.

$ oc describe jobs.batch openshift-backup-28514520

To see technical details of what is happening in the cluster when you invoke the job, use the command below.

$ oc get events -n ocp-etcd-backup
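
The inspection commands above can be combined into a small helper that waits for a job and then shows its log. This is only a sketch: the function name `watch_backup_job` and the default timeout are assumptions for illustration, while `oc wait`, `oc logs`, and `oc describe` are standard subcommands.

```shell
#!/usr/bin/env bash
NAMESPACE="ocp-etcd-backup"

# Hypothetical helper: blocks until the named job completes (or the timeout
# expires), then prints the log of the job's pod. On timeout it prints the
# job description instead and returns a non-zero status.
# Usage: watch_backup_job <job-name> [timeout]
watch_backup_job() {
  local job="$1" timeout="${2:-600s}"
  if oc wait --for=condition=complete "job/${job}" -n "$NAMESPACE" --timeout="$timeout"; then
    oc logs "job/${job}" -n "$NAMESPACE"
  else
    echo "job ${job} did not complete within ${timeout}" >&2
    oc describe "job/${job}" -n "$NAMESPACE"
    return 1
  fi
}
```

For example, `watch_backup_job backup` follows the manual job created earlier. A job that fails outright never reaches the `complete` condition, so the helper only reports it once the timeout expires.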

Backup pods:

$ oc get pods -n ocp-etcd-backup
NAME                               READY   STATUS    RESTARTS   AGE
pod/backup-jcn87                   1/1     Running   0          67s
pod/ocp-j8gcn-master-0-debug       0/1     Running   0          6s

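Backup pods are ephemeral: the debug pod is deleted as soon as the backup finishes, and the pods of scheduled jobs are removed once the job falls outside the CronJob's history limits.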
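
Since the pods go away, the files on the NFS share are the lasting evidence that a run succeeded. Below is a rough sketch of a freshness check you could run on any host that mounts the share. It assumes the snapshot files follow the `snapshot_<timestamp>.db` naming that `cluster-backup.sh` produces. The function name, the 26-hour threshold, and the scratch directory are illustrative only, and `-quit` assumes GNU find.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeeds if DIR holds at least one non-empty
# snapshot_*.db file modified within the last MAX_AGE_HOURS hours.
has_fresh_snapshot() {
  local dir="$1" max_age_hours="${2:-26}"
  local recent
  recent="$(find "$dir" -type f -name 'snapshot_*.db' -size +0 \
              -mmin -"$((max_age_hours * 60))" -print -quit)"
  [ -n "$recent" ]
}

# Demonstration on a scratch directory standing in for the NFS mount.
demo_dir="$(mktemp -d)"
echo "fake etcd data" > "$demo_dir/snapshot_demo.db"

if has_fresh_snapshot "$demo_dir"; then
  echo "fresh snapshot found"
else
  echo "no fresh snapshot"
fi
```

A check like this could be wired into whatever monitoring you already use, so that a silently failing CronJob does not go unnoticed.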


Closing Notes

This process provides a basis for automating the backup of OpenShift etcd. Adapt it to your specific needs and configure the options according to your infrastructure.

I hope this post has helped you understand how to automate backups of an etcd database.

