Velero - Cluster backups and disaster recovery
Velero is an open source tool to safely backup and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes.
Velero consists of:
- A server that runs on your cluster
- A command-line client that runs locally
Velero CLI Install:
brew install velero
Velero is installed on all clusters as part of the cloud-platform-components install. The module is installed by this section of the code.
Velero is installed in the velero namespace using Helm. The chart used can be found here
You can verify the Velero server pod is installed and running using the kubectl get pods command:
kubectl get pods -n velero
All backups are stored in an S3 bucket for 30 days. To view the backup location, run the following command:
velero get backup-locations
The output will show the bucket name and corresponding prefix.
Backups
Why?
There are many reasons to back up part or all of your cluster. A few high-level ones are:
- The cluster cannot be recovered to a fully working state after a failed upgrade
- A user accidentally deletes a resource or namespace
- Persistent storage (EBS volume) is lost or corrupted
What?
Velero can be used to back up as little as a single resource type to the whole cluster (all namespaces and resources).
StatefulSets are backed up together with their corresponding persistent volumes. Unlike a stateless application, an application with persistent data cannot easily be restored without that data.
Velero does not back up the Kubernetes state stored in etcd. It is highly recommended that a specific backup solution is used for etcd and its relevant certificates in addition to Velero.
How?
It is possible to create manual backups as well as scheduled ones.
Manual backups
velero backup create my-app-backup01 --include-namespaces home
The above command creates a backup called my-app-backup01 that contains all resources within the home namespace.
If you accidentally delete the home namespace, you can recover the namespace and all of its resources by running the following command:
velero restore create --from-backup my-app-backup01
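A manual backup is only useful for recovery once it has completed, so it can help to wait for it and inspect the result before relying on it. The following is a minimal sketch, assuming the velero CLI is on your PATH and configured for the target cluster. The function name backup_namespace is hypothetical; --wait and backup describe are existing Velero CLI features.

```shell
# Hypothetical helper: back up one namespace and show the outcome.
# Usage: backup_namespace <backup-name> <namespace>
backup_namespace() {
  backup_name="${1:?backup name required}"
  namespace="${2:?namespace required}"

  # --wait blocks until the backup finishes instead of returning immediately.
  velero backup create "$backup_name" \
    --include-namespaces "$namespace" \
    --wait || return 1

  # Print the phase, errors and warnings so a failed or partial backup is noticed.
  velero backup describe "$backup_name"
}

# Example: backup_namespace my-app-backup01 home
```

Checking the described phase before deleting or changing anything is the cautious option, since a backup can finish with warnings or partial failures.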
Scheduled backups
Currently there is one scheduled backup created as part of the default install.
This schedule is called velero-allnamespacebackup. This backup includes all namespaces, resources and persistent volumes (EBS) associated with StatefulSets.
The schedule is set to run every 3 hours.
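For reference, a schedule with the same cadence and retention could be created by hand roughly as follows. This is a sketch only: the real velero-allnamespacebackup schedule is created by the install, and the function name create_3h_schedule, the example schedule name, the cron expression and the 720h TTL (30 days) are assumptions chosen to match the description above.

```shell
# Hypothetical helper: create a schedule that backs up every namespace
# every 3 hours and keeps each backup for 30 days (720 hours).
# Usage: create_3h_schedule <schedule-name>
create_3h_schedule() {
  schedule_name="${1:?schedule name required}"

  # Cron fields: minute hour day-of-month month day-of-week.
  # "0 */3 * * *" fires at minute 0 of every third hour (00:00, 03:00, ...).
  velero schedule create "$schedule_name" \
    --schedule="0 */3 * * *" \
    --ttl 720h0m0s
}

# Example: create_3h_schedule example-3h-backup
# Inspect existing schedules with: velero get schedules
```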
To view all stored backups within the cluster, run the following command:
velero get backups
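It is also possible to restore (and exclude) specific namespaces and resources from a velero-allnamespacebackup backup. Backups created by a schedule are named after the schedule with a timestamp appended, so use the full backup name shown by velero get backups in place of velero-allnamespacebackup. Below are examples of restoring a specific namespace and a specific resource type: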
velero restore create --from-backup velero-allnamespacebackup --include-namespaces monitoring
velero restore create --from-backup velero-allnamespacebackup --include-resources prometheusrules
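Alternatively, a restore can name the schedule rather than an individual backup, in which case Velero restores from the latest successful backup of that schedule. A minimal sketch, assuming the --from-schedule option of velero restore create; the function name restore_namespace_from_schedule is hypothetical.

```shell
# Hypothetical helper: restore a single namespace from the latest successful
# backup created by a schedule.
# Usage: restore_namespace_from_schedule <schedule-name> <namespace>
restore_namespace_from_schedule() {
  schedule_name="${1:?schedule name required}"
  namespace="${2:?namespace required}"

  velero restore create \
    --from-schedule "$schedule_name" \
    --include-namespaces "$namespace" \
    --wait || return 1

  # List restores so the new one and its status are visible.
  velero get restores
}

# Example: restore_namespace_from_schedule velero-allnamespacebackup monitoring
```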
In a disaster recovery scenario, scheduled backups keep running, which may create backups with incorrect config. Instead of stopping individual schedules, it is recommended to change the storage location access mode to ReadOnly. To do this, you need the storage location name. The following command shows the storage location name and current access mode:
velero get backup-locations
Then run the following command:
kubectl patch backupstoragelocation <STORAGE LOCATION NAME> \
--namespace velero \
--type merge \
--patch '{"spec":{"accessMode":"ReadOnly"}}'
Once the disaster recovery scenario is resolved, you can set the storage location access mode back to ReadWrite using the following command:
kubectl patch backupstoragelocation <STORAGE LOCATION NAME> \
--namespace velero \
--type merge \
--patch '{"spec":{"accessMode":"ReadWrite"}}'
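The two patch commands above differ only in the access mode, so they can be wrapped in one small function that also guards against typos in the mode. This is a sketch under the same assumptions as the commands above (kubectl on PATH, Velero in the velero namespace); the function name set_bsl_access_mode is hypothetical.

```shell
# Hypothetical helper: switch a backup storage location between ReadOnly and
# ReadWrite. Usage: set_bsl_access_mode <storage-location-name> <mode>
set_bsl_access_mode() {
  location="${1:?storage location name required}"
  mode="${2:?access mode required}"

  # Reject anything other than the two access modes used in this guide.
  case "$mode" in
    ReadOnly|ReadWrite) ;;
    *) echo "access mode must be ReadOnly or ReadWrite, got: $mode" >&2
       return 2 ;;
  esac

  kubectl patch backupstoragelocation "$location" \
    --namespace velero \
    --type merge \
    --patch "{\"spec\":{\"accessMode\":\"$mode\"}}"
}

# Example, with "default" standing in for your storage location name:
#   set_bsl_access_mode default ReadOnly    # at the start of the incident
#   set_bsl_access_mode default ReadWrite   # once recovery is complete
```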
Point Cluster B to the backup folder of Cluster A
It is possible to restore resources from another cluster's backups. To do so, add a new Backup Storage Location and Volume Snapshot Location to the cluster you are restoring into (Cluster B), pointing them at the backup folder of the cluster you want to restore from (Cluster A).
velero backup-location create <location-name> --provider aws --bucket <bucket-name> --prefix <clusterprefix> --access-mode=ReadOnly
velero snapshot-location create <location-name> --provider aws
At this point, when you run velero get backups you will see two different locations under the storage location column.
If you want to create normal backups with two different storage locations configured, you either have to set one of the locations as the default or add --volume-snapshot-locations to your backup command:
velero backup create <backup-name> --volume-snapshot-locations <location-name>
Click here for the official Velero documentation.