Comprehensive Guide to Setting Up Velero for Backup and Restore on EKS
Velero is a powerful tool for managing disaster recovery in Kubernetes environments. It allows you to back up, restore, and migrate Kubernetes resources and persistent volumes. This blog walks you through the process of setting up Velero, creating backups, and restoring applications step-by-step.
Step 1: Prerequisites
Before starting, ensure you have:
- An active AWS account.
- AWS CLI installed and configured (aws configure).
- kubectl installed and configured to access your EKS cluster.
- helm installed for managing Helm charts.
- eksctl installed (used in Step 7 to enable OIDC).
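A quick way to confirm the tooling is in place is a small preflight check. The sketch below is only an illustration: the function names missing_tools and preflight are made up for this example, and the tool list is taken from the commands used later in this guide.

```shell
#!/bin/sh
# Print the name of every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the CLIs this guide relies on and report anything that is missing.
preflight() {
  missing=$(missing_tools aws kubectl helm eksctl)
  if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on a single line.
    echo "Missing tools:" $missing
    return 1
  fi
  echo "All required tools found."
}
```

Running preflight before Step 2 prints either the missing tool names or a confirmation message.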
Step 2: Create an S3 Bucket
Velero uses an S3 bucket to store backups.
Run the following command:
aws s3 mb s3://<your-unique-bucket-name> --region <your-region>
- Replace <your-unique-bucket-name> with a globally unique bucket name (e.g., velero-backups-niranjan1234).
- Replace <your-region> with your desired AWS region (e.g., ap-southeast-1).
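Because bucket creation fails on an invalid name, it can help to check the name first. The helper below is a minimal sketch that covers only the basic S3 naming rules (length, allowed characters, first and last character); the function name is_valid_bucket_name is hypothetical, and it does not check global uniqueness.

```shell
# Return 0 when the name passes the basic S3 naming rules checked here:
# 3-63 characters, only lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
is_valid_bucket_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}

# Usage (requires AWS credentials, so it is left commented out here):
#   is_valid_bucket_name "$BUCKET" && aws s3 mb "s3://$BUCKET" --region "$REGION"
```

BUCKET and REGION are placeholder variables standing in for <your-unique-bucket-name> and <your-region>.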
To confirm the bucket in the AWS Management Console:
- Navigate to S3: click on the S3 service from the dashboard or search for it in the search bar.
- Locate your bucket: look for the bucket name you provided (e.g., velero-backups-niranjan1234) in the list of buckets, and ensure it's in the specified region (e.g., ap-southeast-1).
Step 3: Create the IAM Policy for Velero
The IAM policy grants Velero the permissions it needs to interact with EC2 (for snapshots) and S3 (for backup storage).
Steps:
- Create a file named velero-policy.json:
vi velero-policy.json
- Paste the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::<your-unique-bucket-name>/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<your-unique-bucket-name>"
      ]
    }
  ]
}
Replace <your-unique-bucket-name> in both Resource ARNs with your bucket name. Save the file and create the policy:
aws iam create-policy \
--policy-name VeleroAccessPolicy \
--policy-document file://velero-policy.json
When you create the IAM policy from the velero-policy.json file, it is created in AWS, and you can view it in the IAM Policies section of the AWS Management Console.
Under the Permissions tab, verify that the policy contains the correct EC2 and S3 actions.
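To avoid editing the bucket name by hand, the policy file can be generated from a variable. This is a sketch under the assumption that the policy is exactly the one in this step; the function name render_velero_policy is made up for the example.

```shell
# Print the Velero IAM policy from this step for the bucket named in $1.
render_velero_policy() {
  bucket=$1
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": ["arn:aws:s3:::${bucket}/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::${bucket}"]
    }
  ]
}
EOF
}

# Usage: write the file that the create-policy command reads.
#   render_velero_policy <your-unique-bucket-name> > velero-policy.json
```

Generating the file this way keeps the two bucket ARNs in sync with the bucket created in Step 2.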
Step 4: Create the Trust Policy
The trust policy allows Velero's Kubernetes service account (velero-server) to assume the IAM role.
Without it, Velero would have to rely on broader credentials, such as the role attached to the worker nodes, which is less secure and less granular.
The trust policy provides secure integration between Kubernetes and AWS IAM, following the principle of least privilege.
Create a file named trust-policy.json:
vi trust-policy.json
Paste the following content:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::779846791968:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/8F9F1AD9A9847337AB6367606E047C63"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.ap-southeast-1.amazonaws.com/id/8F9F1AD9A9847337AB6367606E047C63:sub": "system:serviceaccount:velero:velero-server"
        }
      }
    }
  ]
}
Replace:
- The OIDC provider ARN (arn:aws:iam::779846791968:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/8F9F1AD9A9847337AB6367606E047C63) with your cluster's specific OIDC provider ARN. It contains your AWS account ID, your region, and your EKS cluster's OIDC ID.
- The condition key (the part before :sub) with the same provider path, without the arn:aws:iam::<your-aws-account-id>:oidc-provider/ prefix.
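The replacements above can be sketched as a small generator that builds the trust policy from your account ID and the cluster's OIDC issuer URL. The function name render_trust_policy is hypothetical; the two aws commands in the trailing comments are the usual way to look up these values, but treat the variable names as assumptions.

```shell
# Print the trust policy for the given AWS account ID ($1) and
# OIDC issuer URL ($2), for the velero-server service account
# in the velero namespace.
render_trust_policy() {
  account_id=$1
  # The IAM provider path is the issuer URL without the https:// scheme.
  provider=${2#https://}
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${provider}:sub": "system:serviceaccount:velero:velero-server"
        }
      }
    }
  ]
}
EOF
}

# Usage (requires AWS credentials, so it is left commented out here):
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
#   ISSUER=$(aws eks describe-cluster --name <your-cluster-name> \
#     --region <your-region> --query "cluster.identity.oidc.issuer" --output text)
#   render_trust_policy "$ACCOUNT_ID" "$ISSUER" > trust-policy.json
```

Deriving both the Federated ARN and the condition key from the same issuer URL keeps them from drifting apart.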
Step 5: Create the IAM Role with OIDC Trust for Velero Backup
aws iam create-role \
--role-name eks-velero-backup \
--assume-role-policy-document file://trust-policy.json
Explanation: This command creates an IAM role named eks-velero-backup and associates it with the specified trust policy (trust-policy.json). The trust policy allows the Velero service account to securely assume this role via the OpenID Connect (OIDC) provider linked to your EKS cluster.
Step 6: Attach the Policy to the IAM Role
Attaching the VeleroAccessPolicy to the role lets the IAM role perform the Velero-related actions defined in Step 3.
Steps:
- Attach the policy:
aws iam attach-role-policy \
--role-name eks-velero-backup \
--policy-arn arn:aws:iam::<your-aws-account-id>:policy/VeleroAccessPolicy
Step 7: Enable OIDC for EKS
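OIDC is required to securely associate the IAM role with the Kubernetes service account. Once the cluster's OIDC provider is registered with IAM, Kubernetes service accounts such as velero-server can assume the role to perform Velero-related actions, such as accessing S3 and managing snapshots. The trust policy from Step 4 references this provider, so complete this step before Velero starts.
Steps:
- Enable OIDC for your EKS cluster: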
eksctl utils associate-iam-oidc-provider \
--region <your-region> \
--cluster <your-cluster-name> \
--approve
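To check whether the provider is already registered, you can compare the cluster's OIDC ID against the providers known to IAM. The helpers below are an illustrative sketch; oidc_id and provider_registered are made-up names, and ISSUER is assumed to hold the issuer URL of your cluster.

```shell
# Extract the trailing OIDC ID from an issuer URL such as
# https://oidc.eks.<region>.amazonaws.com/id/<ID>.
oidc_id() {
  printf '%s\n' "${1##*/}"
}

# Read provider ARNs on stdin; succeed if one of them contains the given ID.
provider_registered() {
  grep -Fq "/id/$1"
}

# Usage (requires AWS credentials, so it is left commented out here):
#   aws iam list-open-id-connect-providers \
#     --query 'OpenIDConnectProviderList[].Arn' --output text \
#     | provider_registered "$(oidc_id "$ISSUER")" && echo "OIDC provider found"
```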
Comparison: Service Account Without OIDC and With OIDC
Feature | Without OIDC | With OIDC |
Access Keys | Not required. AWS credentials are dynamic. | Not required. AWS credentials are dynamic. |
Security | Scoped at the EC2 instance level. | Scoped at the service account level. |
Management | IAM role attached to EC2 instances. | IAM role tied to specific service accounts. |
Best Practice Alignment | Secure, but less granular. | More granular; follows least privilege. |
Step 8: Create a Kubernetes Service Account and Link It to the IAM Role
1. Create the Service Account
If the velero namespace does not exist yet, create it first:
kubectl create namespace velero
Then run the following kubectl command to create the service account in the velero namespace:
kubectl create serviceaccount velero-server -n velero
2. Annotate the Service Account with the IAM Role
Attach the IAM role (eks-velero-backup) to the service account by adding an annotation, replacing <your-aws-account-id> with your AWS account ID:
kubectl annotate serviceaccount velero-server \
eks.amazonaws.com/role-arn=arn:aws:iam::<your-aws-account-id>:role/eks-velero-backup \
-n velero
The annotation links the service account to the IAM role (eks-velero-backup), which grants access to the S3 bucket for storing backups and to EC2 for managing EBS volume snapshots.
Verify the Service Account
Check if the service account exists and has the correct annotation:
kubectl get serviceaccount velero-server -n velero -o yaml
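Instead of reading the YAML by eye, the annotation can be compared against the expected role ARN. This is a minimal sketch; expected_role_arn and check_sa_annotation are hypothetical helper names, and the role name eks-velero-backup is the one created in Step 5.

```shell
# Build the role ARN expected in the service account annotation.
# $1 = AWS account ID, $2 = IAM role name
expected_role_arn() {
  printf 'arn:aws:iam::%s:role/%s\n' "$1" "$2"
}

# Compare the annotation on the live service account with the expected ARN.
# $1 = AWS account ID. Needs kubectl access to the cluster.
check_sa_annotation() {
  actual=$(kubectl get serviceaccount velero-server -n velero \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}')
  [ "$actual" = "$(expected_role_arn "$1" eks-velero-backup)" ]
}
```

Calling check_sa_annotation with your account ID returns success only when the annotation matches.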
Step 9: Install Velero Using Helm
Helm simplifies the deployment and management of Velero in your Kubernetes cluster.
Steps:
- Add the Velero Helm repository:
helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update
- Create a values.yaml file:
vi values.yaml
- Paste the following configuration:
configuration:
  backupStorageLocation:
    - bucket: <your-unique-bucket-name>
      provider: aws
  volumeSnapshotLocation:
    - config:
        region: <your-region>
      provider: aws
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.7.1
    volumeMounts:
      - mountPath: /target
        name: plugins
credentials:
  useSecret: false
serviceAccount:
  server:
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::<your-aws-account-id>:role/eks-velero-backup"
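- Replace placeholders with your values.
Note: by default the chart creates the velero-server service account itself and applies the annotation above. If you already created it manually in Step 8, Helm will typically refuse to install over it; in that case add create: false and name: velero-server under serviceAccount.server. Also check that the plugin image version is compatible with the Velero version your chart deploys.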
Install Velero:
helm install velero vmware-tanzu/velero -f values.yaml --namespace velero
Step 10: Verify Velero Setup
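One way to verify the installation is to check that the Velero pod is running and that the backup storage location reports as Available. The sketch below assumes the Helm release from Step 9 in the velero namespace; all_available and verify_velero are made-up helper names.

```shell
# Succeed only if at least one phase is given and every phase is "Available".
all_available() {
  [ "$#" -gt 0 ] || return 1
  for phase in "$@"; do
    [ "$phase" = "Available" ] || return 1
  done
}

# Show the Velero pods and check the phase of every backup storage location.
# Needs kubectl access to the cluster.
verify_velero() {
  kubectl get pods -n velero
  # Word splitting is intentional: one argument per storage location phase.
  all_available $(kubectl -n velero get backupstoragelocations.velero.io \
    -o jsonpath='{.items[*].status.phase}')
}
```

If verify_velero fails, the Velero pod logs (kubectl logs deployment/velero -n velero) are usually the first place to look for IAM or bucket errors.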
Step 11: Download the Velero CLI
- Visit the Velero GitHub releases page or use the following command to download the latest release (replace the version if necessary):
curl -LO https://github.com/vmware-tanzu/velero/releases/download/v1.15.0/velero-v1.15.0-linux-amd64.tar.gz
This will download the tar.gz file containing the Velero CLI.
- Extract the downloaded file:
tar -xvf velero-v1.15.0-linux-amd64.tar.gz
This will extract the Velero binary (velero) into the velero-v1.15.0-linux-amd64 directory.
- Move the velero binary to a directory included in your system's $PATH, such as /usr/local/bin:
sudo mv velero-v1.15.0-linux-amd64/velero /usr/local/bin/
- Ensure the velero binary is executable:
sudo chmod +x /usr/local/bin/velero
- Verify that Velero is correctly installed by running:
velero version
This should display the Velero version and confirm that the CLI is working.
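If you are not on linux-amd64, the download URL needs a different suffix. The sketch below builds the URL from a version, OS, and architecture, following the naming pattern of the URL above; the helper names are hypothetical, and you should confirm on the releases page that an archive exists for your platform.

```shell
# Map the machine name reported by `uname -m` to the release naming.
normalize_arch() {
  case $1 in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the release archive URL. $1 = version tag, $2 = os, $3 = arch
velero_release_url() {
  printf 'https://github.com/vmware-tanzu/velero/releases/download/%s/velero-%s-%s-%s.tar.gz\n' \
    "$1" "$1" "$2" "$3"
}

# Usage (downloads from the network, so it is left commented out here):
#   os=$(uname -s | tr '[:upper:]' '[:lower:]')
#   arch=$(normalize_arch "$(uname -m)")
#   curl -LO "$(velero_release_url v1.15.0 "$os" "$arch")"
```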
Step 12: Deploy the Application Resources
First, we're going to deploy our Spring Boot application along with MongoDB. These are the resources we'll be backing up and restoring later.
Create a YAML file (spring-mongo.yaml) containing the deployment, service, PVC, PV, secrets, and config maps for your application, then apply it and check that everything is up and running.
Commands:
kubectl apply -f spring-mongo.yaml
Verify Deployed Resources
Verify all resources are created successfully:
kubectl get all
kubectl get pvc
kubectl get pv
kubectl get cm
kubectl get secret
Notice that our application is live, and the services are exposed properly.
Access the application via the LoadBalancer External IP: visit the External IP of the springapp service in your browser.
Step 13: Add a User to Validate Data
Open the application in your browser and register a new user.
Example: Niranjanreddy Kandlagunta with the email niranjanreddy123@gmail.com.
Verify that the user appears under Saved Users in the UI.
This confirms that MongoDB is persisting the data.
Step 14: Run Backup Commands
With the Velero CLI installed, create a backup:
velero backup create mybackup
With no filters, this backs up all namespaces. Because no namespace filter is applied, Velero also includes cluster-scoped resources such as CRDs by default.
To back up only the CRDs, pass a resource filter and use a different backup name:
velero backup create crd-backup --include-resources=customresourcedefinitions.apiextensions.k8s.io
Verify Backup Details and Resources Count
Once the backup is completed, verify what was backed up. Velero provides a summary of the resources and volumes included in the backup.
velero backup describe mybackup
Key Observations:
- Phase: should indicate Completed, confirming a successful backup.
- Started: the timestamp when the backup started.
- Completed: the timestamp when the backup finished.
- Total Items to Be Backed Up: the total number of resources included in the backup (439 in this example).
- Backup Volumes: information about persistent volumes associated with your cluster.
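For scripted runs, the phase can be polled until the backup reaches a final state. This is a sketch that reads the Backup custom resource with kubectl; is_terminal_phase and wait_for_backup are made-up names, and the 5-second interval and 60-attempt default are arbitrary choices.

```shell
# Succeed if the phase is one of Velero's final backup phases.
is_terminal_phase() {
  case $1 in
    Completed|PartiallyFailed|Failed|FailedValidation) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the backup until it finishes. $1 = backup name, $2 = max attempts.
# Returns 0 only when the final phase is Completed. Needs kubectl access.
wait_for_backup() {
  name=$1
  attempts=${2:-60}
  while [ "$attempts" -gt 0 ]; do
    phase=$(kubectl -n velero get backups.velero.io "$name" \
      -o jsonpath='{.status.phase}')
    if is_terminal_phase "$phase"; then
      echo "Backup $name finished with phase: $phase"
      [ "$phase" = "Completed" ]
      return
    fi
    attempts=$((attempts - 1))
    sleep 5
  done
  echo "Timed out waiting for backup $name" >&2
  return 1
}
```

For example, wait_for_backup mybackup blocks until the backup from this step completes or fails.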
Step 15: Verify Backup Storage in S3 Bucket
Before simulating a disaster, verify that the backup is stored in your S3 bucket, so you know the data can be restored if needed.
Steps to Check S3 Bucket Backup:
- Locate the S3 bucket: find the bucket name you specified during Velero configuration. For example, if you named your bucket velero-backups-niranjan, locate it in the bucket list.
- Navigate to the backup folder: inside the bucket, check for a folder named backups.
After running the velero backup create mybackup command, the structure in the bucket looks roughly like this (exact file names can vary by Velero version):
<your-unique-bucket-name>/                        # Main bucket
└── backups/
    └── mybackup/                                 # Folder for the specific backup
        ├── mybackup.tar.gz                       # Kubernetes manifests
        ├── mybackup-volumesnapshots.json.gz      # Volume snapshot metadata
        ├── velero-backup.json                    # Backup metadata
        └── mybackup-logs.gz                      # Logs of the backup operation
These files hold the Kubernetes resources and the metadata needed for a restore. With the snapshot-based setup used here, the volume data itself lives in EBS snapshots in EC2, not in the bucket.
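The same check can be done from the command line by listing the backup's prefix in the bucket. The helper below is a small sketch that assumes the backup storage location has no custom prefix configured; backup_s3_uri is a hypothetical name.

```shell
# Build the S3 URI under which Velero stores a given backup.
# $1 = bucket name, $2 = backup name
backup_s3_uri() {
  printf 's3://%s/backups/%s/\n' "$1" "$2"
}

# Usage (requires AWS credentials, so it is left commented out here):
#   aws s3 ls "$(backup_s3_uri <your-unique-bucket-name> mybackup)"
```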
Step 16: Simulate a Disaster (Delete All Resources)
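Now that you've verified the backup is stored in S3, let's simulate a disaster scenario to validate the restoration process.
Command to Delete All Resources:
kubectl delete all --all
What This Does:
- Deletes the workload resources covered by the all category (pods, deployments, replica sets, stateful sets, services, and similar) in the current namespace.
- Leaves PVCs, PVs, ConfigMaps, and Secrets in place, because they are not part of the all category. Delete them explicitly if you also want to test restoring them.
Verify that the resources are deleted: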
kubectl get pods
Step 17: Restore the Application
- Restore the backup taken earlier using Velero.
velero restore create --from-backup mybackup
Validate the Restoration
- Ensure all Kubernetes resources are restored:
kubectl get all
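- Access the application again via the LoadBalancer External IP. The address may differ from the original one, because the Service is recreated during the restore.
- Confirm that the user (Niranjanreddy Kandlagunta) is still present under Saved Users in the UI.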
After the restore, Velero writes a restore log to the S3 bucket under the restores/ folder. It contains details like the restore name, backup used, time, status, and any warnings or skipped resources.
Automating Backups with Cron Jobs (Optional)
After learning how to manually back up and restore Kubernetes resources with Velero, you can automate the process with Velero schedules, which run backups on a cron expression. This gives you regular, automated data protection, especially in production environments.
Steps to Create Scheduled Backups with Velero Cron Jobs:
Create a Scheduled Backup for Specific Namespace
If you want to create a scheduled backup for a specific namespace (e.g., my-namespace), you can use the following command. This example will run the backup every day at midnight (00:00):
velero schedule create mybackup-schedule --schedule="0 0 * * *" --include-namespaces=my-namespace
Explanation:
- mybackup-schedule: The name of the backup schedule.
- --schedule="0 0 * * *": The cron expression specifying when the backup will run. In this case, it will run at midnight every day.
- --include-namespaces=my-namespace: Specifies that only the resources in my-namespace will be backed up.
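A malformed cron expression is an easy mistake to make when scripting schedules. The check below is a rough sketch that only counts fields for the standard five-field form (minute, hour, day of month, month, day of week); it does not validate the values or cover shorthand forms, and the function names are made up.

```shell
# Print the number of whitespace-separated fields in a cron expression.
cron_field_count() {
  printf '%s\n' "$1" | awk '{ print NF }'
}

# Succeed when the expression has exactly five fields.
is_five_field_cron() {
  [ "$(cron_field_count "$1")" -eq 5 ]
}

# Usage (requires a working Velero setup, so it is left commented out here):
#   SCHEDULE="0 0 * * *"
#   is_five_field_cron "$SCHEDULE" && \
#     velero schedule create mybackup-schedule --schedule="$SCHEDULE" \
#       --include-namespaces=my-namespace
```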
Create a Scheduled Backup for All Namespaces
If you want to back up all namespaces in the cluster, omit the --include-namespaces flag (Velero includes every namespace by default) and give the schedule its own name:
velero schedule create all-namespaces-schedule --schedule="0 0 * * *"
This command will back up all namespaces at midnight every day.
Create a Scheduled Backup for CRDs Only
In case you only want to back up Custom Resource Definitions (CRDs), use the following command. This example will back up the CRDs every day at 2 AM:
velero schedule create crd-backup-schedule --schedule="0 2 * * *" --include-resources=customresourcedefinitions.apiextensions.k8s.io
Explanation:
- crd-backup-schedule: The name of the scheduled CRD backup.
- --schedule="0 2 * * *": The cron expression specifying that the backup will run at 2 AM every day.
- --include-resources=customresourcedefinitions.apiextensions.k8s.io: Specifies that only CRDs will be backed up.
List All Scheduled Backups
You can view the details of all scheduled backups with the following command:
velero schedule get
This will list all backup schedules, including their names, the cron schedule, and other relevant details.
Delete a Scheduled Backup
If you no longer need a scheduled backup, you can delete it with this command:
velero schedule delete mybackup-schedule
Conclusion
Velero allows you to automate backups using cron jobs, ensuring your Kubernetes data and all cluster resources are regularly protected without manual effort. You can schedule backups for specific namespaces, all namespaces, or all resources in the cluster, including CRDs. Additionally, manual backups can be created as needed. Combining both manual and automated backups offers flexibility and ensures the safety and easy restoration of all your cluster's resources.