Backup and Restore Kloudfuse

These steps describe how to back up and restore your Kloudfuse cluster. Follow them carefully to ensure the successful operation of the Kloudfuse platform.

Backup

Configuration Database Backup

Kloudfuse uses a separate az-service to take periodic backups of its databases (configdb, orchestratordb, and Pinot) and store them in cloud storage. This service is available in release 3.4.0 and later, and is enabled by default starting in release 3.4.2.

Configure az-service in your Kloudfuse values file to schedule periodic backups for your cloud provider, as shown in the following example.

# az-service - Configuration for AZ Service
az-service:
  enabled: true
  # replica count for az-service should stay at 1.
  # Override for nodeSelector, affinity, and tolerations is available.

  config:
    zone: "<zone-id>" # Unique identifier that keeps this cluster's backups in their own folder when the bucket is shared. The orgId is unique enough to reuse here. Defaults to <orgId>-<Release.Name>-<Release.Namespace> if not specified.

    # Cloud storage can be configured here or through the global.cloudStorage section.
    # By default, az-service uses global.cloudStorage. When a cloudStorage block is set here, it takes precedence for az-service only.
    # To override it, uncomment and complete the block below.
    # cloudStorage:
      # type of cloud storage: s3, gcs, azure
      # type: s3
      # useSecret: false
      #
      # secretName is the Kubernetes secret that contains the credentials to access the specified cloud storage.
      # For s3:
      # The secret contains accessKey and secretKey of the IAM credentials.
      # kubectl create secret generic cloud-storage-secret --from-literal=accessKey=<accessKey> --from-literal=secretKey='<secretKey>'
      # For GCS:
      # The secret contains the encoded content of the JSON credential file. Note that the file must be named secretKey.
      # kubectl create secret generic cloud-storage-secret --from-file=./secretKey
      # For Azure:
      # The secret contains the connectionString for the container storage.
      # kubectl create secret generic cloud-storage-secret --from-literal=connectionString=<connectionString>
      #
      # secretName: cloud-storage-secret
      # s3:
      #   region: <specify region>
      #   bucket: <specify bucket>
      # gcs:
      #   bucket: <specify bucket>
      # azure:
      #   container: <specify container>
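The restore steps need the zone ID that the backup was written under. If `zone` was left unset, the comment above says it defaults to `<orgId>-<Release.Name>-<Release.Namespace>`. The following is a minimal sketch of that derivation. The function name `default_zone` and the sample values are hypothetical and only illustrate the pattern.

```shell
# Sketch: print the zone ID that az-service would default to when
# config.zone is not set, using the documented pattern
# <orgId>-<Release.Name>-<Release.Namespace>.
# default_zone is a hypothetical helper, not part of Kloudfuse.
default_zone() {
  local org_id="$1" release_name="$2" namespace="$3"
  printf '%s-%s-%s\n' "$org_id" "$release_name" "$namespace"
}
```

For example, `default_zone acme kfuse kfuse` prints `acme-kfuse-kfuse`, where `acme` stands in for your orgId.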

Restore

To restore data from az-service:

  1. Port-forward az-service.
    Switch to the cluster context, set the namespace, and forward the az-service port so the curl commands below can reach the service locally. Keep this session running for the duration of the restore.

    kubectl config set-context --current --namespace kfuse
    kubectl port-forward svc/az-service 8080:8080

  2. Activate the restore.
    This call restores configdb and orchestratordb from the backup, rehydrates any missing Pinot segments from cloud storage, and resumes Pinot consumption. Alerts are silenced for the duration of the activation.

    curl -X POST "http://localhost:8080/activate?zone=<zone-id>"

    To restore to a specific point in time, add the latestTime parameter:

    curl -X POST "http://localhost:8080/activate?zone=<zone-id>&latestTime=<timestamp>"

    Parameter    Required   Description
    zone         Yes        Zone ID of the backed-up data. IMPORTANT: If the backup
                            uses a different zone than your current cluster
                            configuration, use the zone ID from the backup, not the
                            current cluster’s zone.
    latestTime   No         Restore configdb and orchestratordb to this timestamp.
                            Defaults to the most recent available backup.

  3. Monitor rehydration progress.
    Poll the status endpoint after activation to track Pinot rehydration completion.

    curl http://localhost:8080/status

  4. Resume alerts.
    Unsilence alerts after activation is complete and you have verified that the cluster is healthy.

    curl -X POST http://localhost:8080/resume
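
The steps above can be collected into a few shell helpers, which may be convenient when the restore is rehearsed more than once. This is a sketch under stated assumptions: it uses only the three endpoints shown above, it assumes the port-forward from step 1 is already running, and the function names, the `AZ_URL` variable, and the polling defaults are hypothetical. The format of the status response is not described here, so `show_status` only prints it for manual inspection.

```shell
# Base URL of the port-forwarded az-service (hypothetical variable name).
AZ_URL="${AZ_URL:-http://localhost:8080}"

# Build the activate URL. latestTime is appended only when a second
# argument is given; the timestamp is passed through unchanged.
activate_url() {
  local zone="$1" latest_time="${2:-}"
  local url="${AZ_URL}/activate?zone=${zone}"
  if [ -n "$latest_time" ]; then
    url="${url}&latestTime=${latest_time}"
  fi
  printf '%s\n' "$url"
}

# Step 2: activate the restore for a zone, optionally at a timestamp.
activate_restore() {
  curl --fail --silent --show-error -X POST "$(activate_url "$@")"
}

# Step 3: print the status response a fixed number of times so that
# rehydration progress can be checked by eye.
show_status() {
  local attempts="${1:-5}" interval="${2:-30}" i=1
  while [ "$i" -le "$attempts" ]; do
    curl --fail --silent --show-error "${AZ_URL}/status"
    echo
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}

# Step 4: unsilence alerts once the cluster has been verified.
resume_alerts() {
  curl --fail --silent --show-error -X POST "${AZ_URL}/resume"
}
```

A typical session would run `activate_restore <zone-id>`, then `show_status`, and call `resume_alerts` only after you have confirmed that the cluster is healthy.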

To restore into a new Kloudfuse cluster, first install Kloudfuse following the Installation Topics instructions, then run the restore steps above. The new cluster must use the same cloud storage configuration (S3, GCS, or Azure) as the backed-up cluster, including the same bucket or container and credentials. Configure the new cluster with a different zone than the original cluster so that its own backups are written to a separate folder and do not overwrite the backed-up data from the previous install.
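
Because the new cluster restores from the backup zone but must write its own backups under a different zone, it can help to check the two values before activating. The following is a minimal sketch; `zones_are_distinct` is a hypothetical helper, and both zone IDs are supplied by you rather than read from the cluster.

```shell
# Sketch: succeed only when the new cluster's configured zone differs
# from the zone of the backup being restored.
# Returns 0 if distinct, 1 if identical, 2 if an argument is missing.
zones_are_distinct() {
  local backup_zone="$1" new_zone="$2"
  if [ -z "$backup_zone" ] || [ -z "$new_zone" ]; then
    echo "both zone IDs are required" >&2
    return 2
  fi
  if [ "$backup_zone" = "$new_zone" ]; then
    echo "new cluster zone '$new_zone' matches the backup zone" >&2
    return 1
  fi
  return 0
}
```

For example, `zones_are_distinct <backup-zone-id> <new-zone-id> && echo "ok to restore"` prints the message only when the two IDs differ. The restore call itself still takes the zone ID of the backup.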