Velero as a Backup Solution for Kubernetes

As I am running more and more stateful applications on my Kubernetes clusters, I saw the need to prioritize securing their data. I am not concerned about the Kubernetes objects as I have defined everything with infrastructure-as-code and can sync/restore everything almost effortlessly if something happens. My main concern is the data stored on PersistentVolumes.

So which backup tool am I going to use? I looked at Veeam Kasten, Velero, and k8up. I ruled out Veeam Kasten because it is a commercial product with a subscription-based licensing model, and spending money on that is not an option for my home lab environment. Plus, I generally prefer open-source software.

k8up only backs up PersistentVolumes, not the Kubernetes resources of a namespace or a whole cluster. That would be acceptable for me since everything is defined with infrastructure-as-code, so I could restore Kubernetes resources that way. But in my experience it is a nice option to have everything restored in one go. k8up also only offers file-system-based backups using Restic, and it has a rather small community with just around 800 stars on GitHub.

Velero can back up single Kubernetes namespaces or whole clusters, which is an option I would like to have if disaster strikes. Plus, it offers multiple ways to back up PersistentVolumes:

  1. file-system-based backup with Kopia (Restic is being phased out)
  2. VolumeSnapshots or VolumeGroupSnapshots

The snapshot option is particularly interesting to me because I run my own Ceph cluster for Kubernetes storage, and the Ceph CSI driver supports VolumeSnapshots and VolumeGroupSnapshots. Velero is also widely used and has a much larger community than k8up, with 9.4k stars on GitHub. It supports the major cloud providers out of the box and also has a number of community providers.

So eventually I chose Velero. For now, I decided to go with the file-system-based backup because doing snapshots with Ceph on my own infrastructure has one disadvantage: if my Ceph cluster breaks down completely, the data in the snapshots is lost as well. That is very unlikely to happen, but there is Murphy's Law. 😀 I will start playing with snapshots later, after my data is secured.

(Image: Velero as a Kubernetes backup solution)

Prerequisites

AWS Resources

As we want to use AWS S3 as the storage provider, we need to create an S3 bucket plus an IAM user with access credentials and permissions for that bucket.

Creating the S3 bucket and user credentials is rather straightforward: you can do it via the AWS web console, the AWS CLI, or with OpenTofu (infrastructure-as-code), which is what I prefer. The process is also well documented in the GitHub repo of the AWS plugin for Velero, so I will not go into it further.
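
If you prefer doing it by hand, a minimal sketch with the AWS CLI could look like the following (bucket name, region, and user name are placeholders; the IAM policy JSON is the one from the plugin's documentation):

$ aws s3api create-bucket --bucket your-storage-bucket-name --region your-aws-region \
    --create-bucket-configuration LocationConstraint=your-aws-region
# note: omit --create-bucket-configuration when the region is us-east-1
$ aws iam create-user --user-name velero
# attach the policy documented in the velero-plugin-for-aws repo
$ aws iam put-user-policy --user-name velero --policy-name velero \
    --policy-document file://velero-policy.json
$ aws iam create-access-key --user-name velero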

Velero CLI

You will also want the Velero CLI tool installed on your machine. It comes in handy for testing your installation and later for doing backups and restores. It is quickly installed; just have a look at the official documentation.
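
For example, on Linux you can grab the binary from the GitHub releases page (the version below is only an example, check for the latest release); on macOS there is also a Homebrew formula:

$ VELERO_VERSION=v1.16.0   # example, check the releases page for the latest version
$ curl -fsSL -o velero.tar.gz \
    https://github.com/vmware-tanzu/velero/releases/download/${VELERO_VERSION}/velero-${VELERO_VERSION}-linux-amd64.tar.gz
$ tar -xzf velero.tar.gz
$ sudo mv velero-${VELERO_VERSION}-linux-amd64/velero /usr/local/bin/
$ velero version --client-only
$ brew install velero   # alternative on macOS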

Velero Installation and Configuration via CLI

Please be aware that the Velero CLI tool expects Velero to be installed in the velero namespace. You can use another namespace, but then you need to pass the --namespace option with every CLI command or set the VELERO_NAMESPACE environment variable to persist that setting. So for the sake of convenience we just create the velero namespace, with pod-security.kubernetes.io/enforce set to privileged:

$ kubectl create namespace velero
$ kubectl label namespace velero pod-security.kubernetes.io/enforce=privileged

The next step is to create the Secret containing the credentials for the AWS S3 storage bucket. Prepare a file aws-cred containing your credentials in that format:

[default]
aws_access_key_id=<YOUR_ACCESS_KEY_ID>
aws_secret_access_key=<YOUR_SECRET_ACCESS_KEY>

Then create a secret from that file:

$ kubectl -n velero create secret generic aws-s3 --from-file=cloud=./aws-cred
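
A quick check that the Secret exists and contains the cloud key (without printing the credentials themselves):

$ kubectl -n velero describe secret aws-s3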

Prepare the values.yaml file for the installation with Helm:

configuration:
  backupStorageLocation:
  - name: "aws-s3"
    provider: "aws"
    bucket: "your-storage-bucket-name"
    default: true
    accessMode: ReadWrite
    config:
      region: "your-aws-region"
  defaultVolumesToFsBackup: true
credentials:
  useSecret: true
  existingSecret: aws-s3
initContainers:
- name: velero-plugin-for-aws
  image: velero/velero-plugin-for-aws:v1.12.2
  imagePullPolicy: IfNotPresent
  volumeMounts:
  - mountPath: /target
    name: plugins
snapshotsEnabled: false
deployNodeAgent: true

You will need to adjust the values to your setup: at least the bucket name, the AWS region, and the name of the credentials Secret.

We need to deploy the Velero node agent (deployNodeAgent: true) because we want to use the file-system backup. I also disabled volume snapshots (snapshotsEnabled: false) for now, as they are not the current focus.

We use Helm to install the application:

$ helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
$ helm install velero vmware-tanzu/velero --namespace velero --values values.yaml

There is an issue with the current Velero Helm chart (v10.1.2): to apply the CRDs, an init container using the docker.io/bitnamilegacy/kubectl image is spun up. The problem is that the latest available tag of that image is 1.33.4 and Bitnami no longer maintains it (for free), so newer tags simply do not exist. If you see the init container hanging with an ImagePullBackOff error, patch the init container image to docker.io/bitnamilegacy/kubectl:1.33.4 so the CRDs get installed.
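
Alternatively, you can pin that image via the chart values instead of patching the resource afterwards. As far as I can tell the chart exposes a kubectl.image section for this; check the chart's values.yaml to confirm the exact keys:

kubectl:
  image:
    repository: docker.io/bitnamilegacy/kubectl
    tag: 1.33.4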

After the Helm installation has completed, check your installation using the CLI:

$ velero version
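
Besides the version, you can check that Velero can actually reach the S3 bucket; the backup storage location should report itself as Available, and the node agent pods should be running:

$ velero backup-location get
$ kubectl -n velero get pods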

Velero installation with ArgoCD (GitOps)

To pursue the GitOps approach with ArgoCD, I am also including the manifest files. Using these is then a copy-and-paste job for your Git repository. I am leaving open how you create the Secret for the AWS S3 storage bucket, as there are many options for handling secrets in GitOps.

Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: velero
  labels:
    pod-security.kubernetes.io/enforce: privileged
spec: {}

Velero Application

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: velero
  namespace: argocd
  finalizers:
  - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  syncPolicy:
    automated: {}
  destination:
    namespace: velero
    server: https://kubernetes.default.svc
  source:
    chart: velero
    repoURL: https://vmware-tanzu.github.io/helm-charts
    targetRevision: 10.1.2
    helm:
      valuesObject:
        configuration:
          backupStorageLocation:
          - name: "aws-s3"
            provider: "aws"
            bucket: "your-bucket-name"
            default: true
            accessMode: ReadWrite
            config:
              region: "your-aws-region"
          defaultVolumesToFsBackup: true
        credentials:
          useSecret: true
          existingSecret: aws-s3
        initContainers:
        - name: velero-plugin-for-aws
          image: velero/velero-plugin-for-aws:v1.12.2
          imagePullPolicy: IfNotPresent
          volumeMounts:
          - mountPath: /target
            name: plugins
        snapshotsEnabled: false
        deployNodeAgent: true
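
With ArgoCD the usual flow is to commit both manifests to the Git repository that ArgoCD watches. For a one-off bootstrap you can also apply them directly (the file names are just examples):

$ kubectl apply -f namespace-velero.yaml
$ kubectl apply -f application-velero.yaml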

Using Velero

Backing up and restoring data with the Velero CLI is rather straightforward and well documented, so I will not go into detail here. With the CLI you can quickly back up and restore a namespace to further test your setup:

$ velero backup create test --include-namespaces applications
$ velero restore create --from-backup test
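
To see how the backup and restore went, you can inspect them directly with the CLI:

$ velero backup describe test --details
$ velero backup logs test
$ velero restore get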

I would like to point out the -o option of the create commands, which comes in handy for generating .yaml files for GitOps. For example, a backup Schedule for the applications namespace:

$ velero schedule create test --schedule="* 3 * * *" --include-namespaces applications -o yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  creationTimestamp: null
  name: test
  namespace: velero
spec:
  schedule: '* 3 * * *'
  template:
    csiSnapshotTimeout: 0s
    hooks: {}
    includedNamespaces:
    - applications
    itemOperationTimeout: 0s
    metadata: {}
    ttl: 0s
  useOwnerReferencesInBackup: false
status: {}

Instead of sending the object to the server, the .yaml is printed to stdout.
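
For the Git repository I would strip the runtime fields Velero fills in (creationTimestamp, the zero-value timeouts, status), so that only the parts I actually set remain and the omitted fields fall back to their defaults. A trimmed version of the Schedule above could look like this:

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: test
  namespace: velero
spec:
  schedule: '* 3 * * *'
  template:
    includedNamespaces:
    - applications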