Comprehensive Guide to Setting Up Velero for Backup and Restore on EKS

Velero is a powerful tool for managing disaster recovery in Kubernetes environments. It allows you to back up, restore, and migrate Kubernetes resources and persistent volumes. This blog walks you through the process of setting up Velero, creating backups, and restoring applications step-by-step.


Step 1: Prerequisites

Before starting, ensure you have:

  • An active AWS account.

  • AWS CLI installed and configured (aws configure).

  • kubectl installed and configured to access your EKS cluster.

  • helm installed for managing Helm charts.

  • eksctl installed (used in Step 7 to enable OIDC for the cluster).


Step 2: Create an S3 Bucket

Velero uses an S3 bucket to store backups.

Run the following command:

aws s3 mb s3://<your-unique-bucket-name> --region <your-region>

  • Replace <your-unique-bucket-name> with a globally unique bucket name (e.g., velero-backups-niranjan1234).

  • Replace <your-region> with your desired AWS region (e.g., ap-southeast-1).

To confirm the bucket in the AWS Management Console:

  • Navigate to S3: click the S3 service on the dashboard or search for it in the search bar.

  • Locate your bucket: look for the bucket name you provided (e.g., velero-backups-niranjan1234) in the list of buckets.

  • Ensure it's in the specified region (e.g., ap-southeast-1).
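The same check can be done from the command line instead of the console. The helpers below are a minimal sketch, assuming the AWS CLI is configured; the function names bucket_region and check_bucket are illustrative and not part of any tool.

```shell
# Print the region a bucket lives in. S3 reports us-east-1 as a null
# LocationConstraint, which the CLI's text output renders as "None".
bucket_region() {
  region=$(aws s3api get-bucket-location --bucket "$1" \
    --query LocationConstraint --output text) || return 1
  if [ "$region" = "None" ]; then region="us-east-1"; fi
  printf '%s\n' "$region"
}

# Succeed only when the bucket exists and sits in the expected region.
check_bucket() {
  [ "$(bucket_region "$1")" = "$2" ]
}
```

For example, check_bucket <your-unique-bucket-name> <your-region> && echo "bucket OK" prints the message only when the bucket is found in the region you expect.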


Step 3: Create the IAM Policy for Velero

The IAM policy grants Velero the permissions it needs to interact with EC2 (for volume snapshots) and S3 (for backup storage).

Steps:

  1. Create a file named velero-policy.json:

vi velero-policy.json

  2. Paste the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::<your-unique-bucket-name>/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<your-unique-bucket-name>"
      ]
    }
  ]
}
  3. Replace <your-unique-bucket-name> (it appears twice) with the bucket name you created in Step 2.

  4. Save the file and create the policy:

aws iam create-policy \
  --policy-name VeleroAccessPolicy \
  --policy-document file://velero-policy.json

Once the policy is created from velero-policy.json, you can view it in the IAM Policies section of the AWS Management Console.

Under the Permissions tab, verify that the policy contains the correct EC2 and S3 actions.
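A common mistake at this step is a policy file that names a different bucket than the one created in Step 2. The check below is a small sketch; the function name policy_targets_bucket is hypothetical, and it assumes the two S3 resource ARNs are written in the policy file exactly as in the policy above.

```shell
# Succeed only if FILE grants access to both the bucket itself (for
# s3:ListBucket) and the objects inside it (the "/*" resource).
policy_targets_bucket() {
  file=$1 bucket=$2
  grep -qF "\"arn:aws:s3:::${bucket}\"" "$file" &&
    grep -qF "\"arn:aws:s3:::${bucket}/*\"" "$file"
}
```

Running policy_targets_bucket velero-policy.json <your-unique-bucket-name> || echo "policy does not match bucket" before aws iam create-policy catches the mismatch early.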


Step 4: Create the Trust Policy

The trust policy allows Velero's Kubernetes service account (velero-server in the velero namespace) to assume the IAM role:

  • It restricts role assumption to that one service account, following the principle of least privilege.

  • Without it, Velero would have to rely on broader credentials, such as the IAM role attached to the worker nodes, which is less secure and less granular.

Create a file named trust-policy.json:

vi trust-policy.json

Paste the following content:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<your-aws-account-id>:oidc-provider/oidc.eks.<your-region>.amazonaws.com/id/<eks-cluster-id>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.<your-region>.amazonaws.com/id/<eks-cluster-id>:sub": "system:serviceaccount:velero:velero-server"
        }
      }
    }
  ]
}

Replace:

  • <your-aws-account-id> with your AWS account ID.

  • <your-region> with your cluster's AWS region (e.g., ap-southeast-1).

  • <eks-cluster-id> with the ID at the end of your cluster's OIDC issuer URL (e.g., 8F9F1AD9A9847337AB6367606E047C63).

Together, these values form your cluster's OIDC provider ARN. The sub condition must match the namespace and name of the service account Velero runs as (velero:velero-server).
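To avoid editing the ARN by hand, the trust policy can be generated from the account ID and the cluster's OIDC issuer URL. This is a sketch rather than an official tool: write_trust_policy is a hypothetical helper name, and the issuer URL is assumed to come from aws eks describe-cluster --name <your-cluster-name> --query "cluster.identity.oidc.issuer" --output text.

```shell
# Write a trust policy that lets one service account assume the role.
# Usage: write_trust_policy <account-id> <issuer-url> <namespace> <sa> <out>
write_trust_policy() {
  account_id=$1 issuer_url=$2 namespace=$3 service_account=$4 out=$5
  # The provider path is the issuer URL without its https:// scheme.
  provider=${issuer_url#https://}
  cat > "$out" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${provider}:sub": "system:serviceaccount:${namespace}:${service_account}"
        }
      }
    }
  ]
}
EOF
}
```

For this guide the call would be write_trust_policy <your-aws-account-id> "<issuer-url>" velero velero-server trust-policy.json.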


Step 5: Create the IAM Role with OIDC Trust for Velero Backup

aws iam create-role \
  --role-name eks-velero-backup \
  --assume-role-policy-document file://trust-policy.json

Explanation: This command creates an IAM role named eks-velero-backup and associates it with the trust policy in trust-policy.json. The trust policy allows the Velero service account to securely assume this role via the OpenID Connect (OIDC) provider linked to your EKS cluster. The role can only be assumed once that OIDC provider is registered with IAM (see Step 7).

Step 6: Attach the Policy to the IAM Role

Attaching VeleroAccessPolicy gives the IAM role the EC2 and S3 permissions defined in Step 3.

Attach the policy:
aws iam attach-role-policy \
  --role-name eks-velero-backup \
  --policy-arn arn:aws:iam::<your-aws-account-id>:policy/VeleroAccessPolicy
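The policy ARN can also be built from the current account ID, and the attachment confirmed afterwards. The helpers below are a rough sketch assuming a configured AWS CLI; policy_arn and role_has_policy are illustrative names.

```shell
# Build a customer-managed policy ARN from the caller's account ID.
policy_arn() {
  account_id=$(aws sts get-caller-identity --query Account --output text) || return 1
  printf 'arn:aws:iam::%s:policy/%s\n' "$account_id" "$1"
}

# Succeed if the role has the given policy ARN attached.
# The CLI's text output is tab-separated, so split it into lines first.
role_has_policy() {
  aws iam list-attached-role-policies --role-name "$1" \
    --query 'AttachedPolicies[].PolicyArn' --output text |
    tr '\t' '\n' | grep -qxF "$2"
}
```

After attaching, role_has_policy eks-velero-backup "$(policy_arn VeleroAccessPolicy)" && echo "attached" confirms the result.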


Step 7: Enable OIDC for EKS

OIDC is required to securely associate the IAM role with the Kubernetes service account. Once the cluster's OIDC provider is registered with IAM, service accounts such as velero-server can assume the role to perform Velero-related actions, such as accessing S3 and managing snapshots. This step uses eksctl.

Enable OIDC for your EKS cluster:

eksctl utils associate-iam-oidc-provider \
  --region <your-region> \
  --cluster <your-cluster-name> \
  --approve
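To see whether the provider is already registered, you can compare the ID at the end of the cluster's issuer URL with the providers IAM knows about. This is a minimal sketch; both function names are hypothetical, and the issuer URL is assumed to come from aws eks describe-cluster --name <your-cluster-name> --query "cluster.identity.oidc.issuer" --output text.

```shell
# Extract the provider ID (the last path segment) from an OIDC issuer URL.
oidc_id_from_issuer() {
  printf '%s\n' "${1##*/}"
}

# Succeed if IAM already lists an OIDC provider whose ARN ends in that ID.
oidc_provider_registered() {
  id=$(oidc_id_from_issuer "$1")
  aws iam list-open-id-connect-providers \
    --query 'OpenIDConnectProviderList[].Arn' --output text |
    tr '\t' '\n' | grep -q "/id/${id}\$"
}
```

If oidc_provider_registered "<issuer-url>" succeeds, the eksctl command has already taken effect for this cluster.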

Comparison: Service Account Without OIDC and With OIDC

Feature                  | Without OIDC                                | With OIDC
-------------------------|---------------------------------------------|--------------------------------------------
Access Keys              | Not required. AWS credentials are dynamic.  | Not required. AWS credentials are dynamic.
Security                 | Scoped at the EC2 instance level.           | Scoped at the service account level.
Management               | IAM role attached to EC2 instances.         | IAM role tied to specific service accounts.
Best Practice Alignment  | Secure, but less granular.                  | Fully secure and granular.

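Step 8: Create and Annotate the Service Account

  1. Create the service account.

Run the following kubectl command to create the service account in the velero namespace (if the namespace does not exist yet, create it first with kubectl create namespace velero):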

kubectl create serviceaccount velero-server -n velero

  2. Annotate the service account with the IAM role.

Attach the IAM role (eks-velero-backup) to the service account by adding an annotation, replacing <your-aws-account-id> with your AWS account ID:

kubectl annotate serviceaccount velero-server \
  eks.amazonaws.com/role-arn=arn:aws:iam::<your-aws-account-id>:role/eks-velero-backup \
  -n velero

The annotation links the service account to the IAM role (eks-velero-backup), which grants access to the S3 bucket for storing backups and to EC2 for managing EBS volume snapshots.

  3. Verify the service account.

Check if the service account exists and has the correct annotation:

kubectl get serviceaccount velero-server -n velero -o yaml
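Rather than reading the full YAML, the annotation can be extracted directly with a jsonpath query. The helpers below are a small sketch; sa_role_arn and sa_uses_role are hypothetical names, and the dots in the annotation key are escaped for kubectl's jsonpath syntax.

```shell
# Print the IAM role ARN annotated on a service account (empty if none).
sa_role_arn() {
  kubectl get serviceaccount "$1" -n "$2" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Succeed only if the annotation points at the expected role name,
# whatever the account ID is.
sa_uses_role() {
  case "$(sa_role_arn "$1" "$2")" in
    arn:aws:iam::*:role/"$3") return 0 ;;
    *) return 1 ;;
  esac
}
```

For this guide, sa_uses_role velero-server velero eks-velero-backup should succeed once the annotation is in place.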

Step 9: Install Velero Using Helm

Helm simplifies the deployment and management of Velero in your Kubernetes cluster.

Steps:

  1. Add the Velero Helm repository:
helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update

  2. Create a values.yaml file:
vi values.yaml

  3. Paste the following configuration:
configuration:
  backupStorageLocation:
    - bucket: <your-unique-bucket-name>
      provider: aws
  volumeSnapshotLocation:
    - config:
        region: <your-region>
      provider: aws

initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.7.1
    volumeMounts:
      - mountPath: /target
        name: plugins

credentials:
  useSecret: false

serviceAccount:
  server:
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::<your-aws-account-id>:role/eks-velero-backup"
  4. Replace the placeholders with your values.

  5. Install Velero:

helm install velero vmware-tanzu/velero -f values.yaml --namespace velero


Step 10: Verify Velero Setup
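
One way to verify the setup is to check that the Velero pod is running (kubectl get pods -n velero) and that the backup storage location reports the Available phase. The helpers below are a minimal sketch, assuming Velero was installed into the velero namespace as in Step 9; the function names are illustrative.

```shell
# Print the phase of every BackupStorageLocation in a namespace.
bsl_phases() {
  kubectl get backupstoragelocation -n "${1:-velero}" \
    -o jsonpath='{.items[*].status.phase}'
}

# Succeed only if at least one location exists and all are Available.
all_bsl_available() {
  phases=$(bsl_phases "$1") || return 1
  [ -n "$phases" ] || return 1
  for phase in $phases; do
    [ "$phase" = "Available" ] || return 1
  done
}
```

If all_bsl_available velero fails, the Velero pod logs (kubectl logs deployment/velero -n velero, assuming the release is named velero) usually point to a bucket, region, or IAM problem.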


Step 11: Download the Velero CLI

  1. Visit the Velero GitHub releases page or use the following command to download a release (v1.15.0 here; replace the version if necessary):
curl -LO https://github.com/vmware-tanzu/velero/releases/download/v1.15.0/velero-v1.15.0-linux-amd64.tar.gz

This will download the tar.gz file containing the Velero CLI.

  2. Extract the downloaded file:
tar -xvf velero-v1.15.0-linux-amd64.tar.gz

This will extract the archive into a velero-v1.15.0-linux-amd64 directory that contains the Velero binary (velero).

  3. Move the velero binary to a directory included in your system's $PATH, such as /usr/local/bin:
sudo mv velero-v1.15.0-linux-amd64/velero /usr/local/bin/
  4. Ensure the velero binary is executable:
sudo chmod +x /usr/local/bin/velero

  5. Verify that Velero is correctly installed by running:
velero version

This should display the Velero version and confirm that the CLI is working.
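
If you install the CLI on machines with different architectures, the download URL can be assembled from the version and platform instead of being hard-coded. This is a sketch under the assumption that release files follow the velero-<version>-<os>-<arch>.tar.gz pattern seen above; velero_arch and velero_download_url are hypothetical helper names.

```shell
# Map `uname -m` output to the architecture names used in release files.
velero_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the release tarball URL for a version, OS, and machine type.
velero_download_url() {
  version=$1 os=$2
  arch=$(velero_arch "$3") || return 1
  printf 'https://github.com/vmware-tanzu/velero/releases/download/%s/velero-%s-%s-%s.tar.gz\n' \
    "$version" "$version" "$os" "$arch"
}
```

On a Linux host, curl -LO "$(velero_download_url v1.15.0 linux "$(uname -m)")" then fetches the matching archive.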


Step 12: Deploy the Application Resources

First, we're going to deploy our Spring Boot application along with MongoDB. These are the resources we'll be backing up and restoring later.

Create a YAML file (spring-mongo.yaml) containing the deployment, service, PVC, PV, secrets, and config maps for your application, then apply it and check that everything is up and running.

  • Apply the manifest:

      kubectl apply -f spring-mongo.yaml

  • Verify that all resources are created successfully:

      kubectl get all
      kubectl get pvc
      kubectl get pv
      kubectl get cm
      kubectl get secret

    The application should now be live, with its services exposed properly.

  • Access the application via the LoadBalancer External IP:

    Visit the External IP of the springapp service in your browser.


Step 13: Add a User to Validate Data

  • Open the application in your browser and register a new user.
    Example: Niranjanreddy Kandlagunta with the email niranjanreddy123@gmail.com.

  • Verify that the user appears under Saved Users in the UI.

If the user is listed, the MongoDB database is persisting the data properly.


Step 14: Run Backup Commands

With the Velero CLI installed, create a backup:

velero backup create mybackup
  • With no filters, this backs up all namespaces and all resources in the cluster, including cluster-scoped resources such as CRDs.

  • To back up only the CRDs, restrict the backup with --include-resources and give it its own name:

velero backup create mybackup-crds --include-resources=customresourcedefinitions.apiextensions.k8s.io

Verify Backup Details and Resources Count

Once the backup is completed, it's important to verify the details and understand what has been backed up. Velero provides a clear summary of the resources and volumes included in the backup.

velero backup describe mybackup

Key Observations:

  • Phase: Should read Completed, confirming a successful backup.

  • Started / Completed: The timestamps when the backup started and finished.

  • Total Items to Be Backed Up: The total number of resources included in the backup (439 in this example).

  • Backup Volumes: Information about the persistent volumes associated with your cluster.
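
In scripts, the phase can be polled instead of read by eye. The sketch below reads the Backup custom resource with kubectl, assuming Velero runs in the velero namespace; backup_phase and wait_for_backup are hypothetical helper names. (velero backup create also accepts a --wait flag if you prefer to block at creation time.)

```shell
# Print the phase of a Velero backup by reading its custom resource.
backup_phase() {
  kubectl get backups.velero.io "$1" -n velero -o jsonpath='{.status.phase}'
}

# Poll until the backup reaches a terminal phase or the attempts run out.
# Usage: wait_for_backup <name> [attempts] [seconds-between-attempts]
wait_for_backup() {
  name=$1 attempts=${2:-60} delay=${3:-5}
  while [ "$attempts" -gt 0 ]; do
    phase=$(backup_phase "$name")
    case "$phase" in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $name ended in phase $phase" >&2; return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  echo "timed out waiting for backup $name" >&2
  return 1
}
```

For example, wait_for_backup mybackup && velero backup describe mybackup prints the details only after the backup has completed.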


Step 15: Verify Backup Storage in S3 Bucket

Before simulating a disaster, it's crucial to verify that the backup is successfully stored in your S3 bucket. This ensures that the data is safe and can be restored if needed.

Steps to Check S3 Bucket Backup:

  1. Locate the S3 Bucket:
    Find the bucket name you specified during Velero configuration. For example, if you named your bucket velero-backups-niranjan, locate it in the bucket list.

  2. Navigate to the Backup Folder:
    Inside the bucket, check for a folder named backups, and within it a folder named after your Velero backup (mybackup).

After running the velero backup create mybackup command, the bucket layout looks conceptually like this. It is a simplified view: Velero stores the actual objects as compressed archives such as mybackup.tar.gz and mybackup-logs.gz, and the EBS snapshots themselves live in EC2 rather than in the bucket.

/velero-backups/                 # Main bucket
   └── backups/
       └── mybackup/             # Folder for the specific backup
           ├── resources/        # Kubernetes manifests
           ├── volume-snapshots/ # Volume snapshot information
           ├── metadata.json     # Metadata file
           └── backup.log        # Logs of the backup operation

With the Kubernetes manifests and snapshot information stored in S3, you have what you need to recover from a disaster.


Step 16: Simulate a Disaster (Delete All Resources)

Now that you’ve verified the backup is securely stored in S3, let’s simulate a disaster scenario to validate the restoration process.

Command to Delete All Resources:

kubectl delete all --all

What This Does:

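  • Deletes every resource in the all category (pods, deployments, replica sets, services, and similar workloads) in the current namespace. PVCs, config maps, and secrets are not part of that category and are left in place.

  • Simulates a disaster in which the application's workloads are lost.

  • Verify that the resources are deleted: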

kubectl get pods


Step 17: Restore the Application

  • Restore the backup taken earlier using Velero.
velero restore create --from-backup mybackup

Validate the Restoration

  1. Ensure all Kubernetes resources are restored:
kubectl get all
  2. Access the application again via the LoadBalancer External IP.

  3. Confirm that the user (Niranjanreddy Kandlagunta) is still present under Saved Users in the UI.

After the restore, Velero writes a restore log to the S3 bucket under the restores/ folder. It contains details such as the restore name, the backup used, the time, the status, and any warnings or skipped resources.


Automating Backups with Cron Jobs (Optional)

After learning how to manually back up and restore Kubernetes resources with Velero, you can automate the process using cron jobs. This allows you to schedule backups at specific intervals, ensuring regular and automated data protection, especially in production environments.

Steps to Create Scheduled Backups with Velero Cron Jobs:

  1. Create a Scheduled Backup for Specific Namespace

    If you want to create a scheduled backup for a specific namespace (e.g., my-namespace), you can use the following command. This example will run the backup every day at midnight (00:00):

     velero schedule create mybackup-schedule --schedule="0 0 * * *" --include-namespaces=my-namespace
    

    Explanation:

    • mybackup-schedule: The name of the backup schedule.

    • --schedule="0 0 * * *": The cron expression specifying when the backup will run. In this case, it will run at midnight every day.

    • --include-namespaces=my-namespace: Specifies that only the resources in my-namespace will be backed up.

  2. Create a Scheduled Backup for All Namespaces

    If you want to back up all namespaces in the cluster, use the wildcard '*' instead of a namespace name (this is also the default when the flag is omitted). Schedule names must be unique, so this example uses a different name:

     velero schedule create all-namespaces-schedule --schedule="0 0 * * *" --include-namespaces='*'
    

    This command will back up all namespaces at midnight every day.

  3. Create a Scheduled Backup for CRDs Only

    In case you only want to back up Custom Resource Definitions (CRDs), use the following command. This example will back up the CRDs every day at 2 AM:

     velero schedule create crd-backup-schedule --schedule="0 2 * * *" --include-resources=customresourcedefinitions.apiextensions.k8s.io
    

    Explanation:

    • crd-backup-schedule: The name of the scheduled CRD backup.

    • --schedule="0 2 * * *": The cron expression specifying that the backup will run at 2 AM every day.

    • --include-resources=customresourcedefinitions.apiextensions.k8s.io: Specifies that only CRDs will be backed up.

  4. List All Scheduled Backups

    You can view the details of all scheduled backups with the following command:

     velero schedule get
    

    This will list all backup schedules, including their names, the cron schedule, and other relevant details.

  5. Delete a Scheduled Backup

    If you no longer need a scheduled backup, you can delete it with this command:

     velero schedule delete mybackup-schedule
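
If you create several daily schedules, a small wrapper can build the cron expression from the hour and keep the quoting of "*" correct. This is only a sketch: daily_cron and create_daily_schedule are hypothetical helper names, and the wrapper simply calls velero schedule create with the same flags used above.

```shell
# Print a five-field cron expression for "every day at HOUR:00".
daily_cron() {
  hour=$1
  case "$hour" in
    ''|*[!0-9]*) echo "hour must be an integer 0-23" >&2; return 1 ;;
  esac
  [ "$hour" -le 23 ] || { echo "hour must be an integer 0-23" >&2; return 1; }
  printf '0 %s * * *\n' "$hour"
}

# Create a daily schedule for one namespace.
# Usage: create_daily_schedule <schedule-name> <namespace> <hour>
create_daily_schedule() {
  cron=$(daily_cron "$3") || return 1
  velero schedule create "$1" --schedule="$cron" --include-namespaces="$2"
}
```

For example, create_daily_schedule mybackup-schedule my-namespace 0 is equivalent to the midnight schedule from step 1.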
    

Conclusion

Velero allows you to automate backups using cron jobs, ensuring your Kubernetes data and all cluster resources are regularly protected without manual effort. You can schedule backups for specific namespaces, all namespaces, or all resources in the cluster, including CRDs. Additionally, manual backups can be created as needed. Combining both manual and automated backups offers flexibility and ensures the safety and easy restoration of all your cluster's resources.

Thank you :)