Upgrade Instructions

Table of Contents

Helm upgrade command
Version Specific Instructions
Upgrade to 4.2.1
- Pre-Upgrade Steps
Upgrade to 4.2.0
- Pre-Upgrade Steps
- Post-Upgrade Steps
Upgrade to 4.1.0
- Pre-Upgrade Steps
- Post-Upgrade Steps
Upgrade to 4.0.2
- Pre-Upgrade Steps
4.0.1
- Pre-Upgrade Steps
- Post-Upgrade Steps
4.0.0
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.5.3-p1
3.5.3
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.5.2-p2
3.5.2-p1
3.5.2
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.5.1-p2
3.5.1-p1
3.5.1
3.5.0
- Pre-Upgrade Steps
3.4.4 - p1
- Important Notes
- Pre-Upgrade Steps
- Phase 1: Deploy Both Legacy Kafka and Kafka-Kraft
- Phase 2: Switch Ingester to Kafka-Kraft
- Phase 3: Switch All Services to Kafka-Kraft
- Post-Upgrade Steps
3.4.3
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.4.2-p1
3.4.2
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.4.1
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.4.0 - p2
3.4.0 - p1
- Pre-Upgrade Steps
- Post-Upgrade Steps
3.4.0
- Pre-Upgrade and Post-Upgrade Steps
3.3.6
3.3.5
3.3.4
3.3.3
3.3.2
3.3.1
- Pre-Upgrade Steps
3.3.0
- Pre-Upgrade Steps

Follow these step-by-step instructions to upgrade Kloudfuse using Helm, including the upgrade command and version-specific notes for all supported releases.

Helm upgrade command

Before performing an upgrade, validate that the upgrade won’t revert any customization on your cluster.
To check which Kloudfuse version you have, run the following command:
```
helm list
```

Run the upgrade command.

helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
  -n kfuse \
  --version <VERSION> \ (1)
  -f custom-values.yaml

1	Replace `<VERSION>` with a valid Kloudfuse release value; use the most recent one.
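
One way to check for drift before upgrading is to compare the values stored with the deployed release against your local values file. The following is a minimal sketch, not an official Kloudfuse procedure: `diff_deployed_values` is a hypothetical helper name, and the release name and namespace are passed in as arguments. `helm get values` prints only user-supplied values and may order keys differently from your file, so treat any reported difference as a prompt for review.

```shell
# Compare the user-supplied values of a deployed release with a local values file.
# Arguments: release name, namespace, path to the local values file.
# Exit status: 0 if the two are textually identical, non-zero otherwise.
diff_deployed_values() {
  release="$1"
  namespace="$2"
  values_file="$3"
  deployed="$(mktemp)"
  # Dump the values that were supplied when the release was last installed or upgraded.
  helm get values "$release" -n "$namespace" -o yaml > "$deployed"
  status=0
  diff -u "$deployed" "$values_file" || status=$?
  rm -f "$deployed"
  return "$status"
}
```

For example, `diff_deployed_values kfuse kfuse custom-values.yaml` exits with status 0 when the two sets of values are textually identical.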

Version Specific Instructions

Upgrade to 4.2.1

Pre-Upgrade Steps

Add kf_metrics_staging_topic Kafka Topic

Release 4.2.1 requires a new Kafka topic. Before upgrading, add the following entry to global.kafkaTopics in your custom_values.yaml:

global:
  kafkaTopics:
    - name: kf_metrics_staging_topic
      partitions: 1
      replicationFactor: 1

Upgrade to 4.2.0

Pre-Upgrade Steps

Metric Shaping Kafka Topic

Release 4.2.0 introduces the Metric Shaping Rules feature, which requires the kf_metrics_staging_topic Kafka topic. Before upgrading, ensure that this topic is included in your global.kafkaTopics section in the custom-values.yaml file:

        - name: kf_metrics_staging_topic
          partitions: <number-of-nodes>
          replicationFactor: 1

Set the number of partitions to the number of nodes in your cluster.

Archive Writer Kafka Topic (Conditional)

If you plan to enable the new archive writer (see Archive Writer Disabled by Default), ensure that the kf_logs_archive_v2_topic Kafka topic is included in your global.kafkaTopics section in the custom-values.yaml file:

        - name: kf_logs_archive_v2_topic
          partitions: <number-of-nodes>
          replicationFactor: 1

Set the number of partitions to the number of nodes in your cluster.
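
Before running the upgrade, you may want to confirm that the values file actually lists the required topics. The helper below is a minimal sketch that assumes each topic appears on its own `- name: <topic>` line, as in the snippets above. `missing_kafka_topics` is a hypothetical name, and the check is a plain text match rather than a YAML parse, so it cannot tell which section an entry belongs to.

```shell
# Print every topic name that does NOT appear as a "- name: <topic>" entry
# in the given values file.
# Arguments: path to the values file, followed by one or more topic names.
missing_kafka_topics() {
  values_file="$1"
  shift
  for topic in "$@"; do
    # Allow optional quotes around the name and an optional trailing comment.
    if ! grep -Eq "^[[:space:]]*-[[:space:]]+name:[[:space:]]*[\"']?${topic}[\"']?[[:space:]]*(#.*)?$" "$values_file"; then
      echo "$topic"
    fi
  done
}
```

For example, `missing_kafka_topics custom-values.yaml kf_metrics_staging_topic kf_logs_archive_v2_topic` prints nothing when both topics are present. For the partition count, `kubectl get nodes --no-headers | wc -l` reports the number of nodes, assuming every node in the cluster should be counted.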

AWS CloudWatch Regions (Breaking Change)

Starting in Release 4.2.0, the awsRegions list in Helm values defaults to an empty list. Customers using AWS CloudWatch metrics scraping must now explicitly specify the regions they want to scrape (opt in) — either globally under ingester.config.awsRegions, or per scraper in a specific awsRoleArns entry’s regions list. See Helm Values for details.

Post-Upgrade Steps

Query Scheduling Enabled by Default

Query scheduling is enabled by default in Release 4.2.0. Administrators can configure per-user, per-group, and per-service-account query priorities — including the new blocked priority, which rejects matching queries instead of delaying or queueing them — from the admin UI. To disable query scheduling entirely, set the following in your custom_values.yaml:

global:
  queryScheduler:
    enabled: false

Archive Writer Disabled by Default

The new archive writer is disabled by default. It can be turned on per deployment when you are ready — contact Kloudfuse support to opt in. Enabling it requires the kf_logs_archive_v2_topic Kafka topic; see Archive Writer Kafka Topic (Conditional) in the pre-upgrade steps.

Scheduled Searches Disabled by Default

Scheduled searches are disabled by default and can be enabled on request — contact Kloudfuse support to enable the feature for your tenant.

Upgrade to 4.1.0

Pre-Upgrade Steps

Set events-query-service.config.rbacdb.user (Externally-Managed Postgres Only)

If your configDB is hosted on an externally-managed Postgres instance (AWS RDS, GCP Cloud SQL, Azure Database for PostgreSQL) where the application user is not the default postgres, you must add an explicit events-query-service.config.rbacdb.user override to your custom_values.yaml. The events-query-service chart currently ships with a hardcoded rbacdb.user: postgres default that masks the global.configDB.username fallback, so without this override events-query-service connects to rbacdb as postgres and fails with no pg_hba.conf entry for host …, user "postgres", database "rbacdb". Skip this step if global.configDB.username is postgres (the default).

events-query-service:
  config:
    rbacdb:
      user: <your-configdb-app-username>  # match global.configDB.username

This step will not be required once events-query-service v0.1.0-<TBD> ships the chart-side fix; the override can be removed in a follow-up release.

Post-Upgrade Steps

Migrate Ingestion Auth Keys to the UI (Optional, Recommended)

Starting in 4.1.0, ingestion API keys and their optional additional labels can be managed from the Kloudfuse UI. Open Admin > Settings, then click Configure on the Auth key labels card. The legacy YAML configuration (kfuse-auth-ingest secret + ingester.config.authKeyAdditionalLabels) continues to work but is deprecated. YAML-sourced entries appear in the UI as read-only.

After upgrading, recreate each YAML-managed entry through the UI (reusing the existing token to avoid agent reconfiguration), then delete the legacy YAML configuration:

kubectl delete secret kfuse-auth-ingest -n <namespace> (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

Remove the `ingester.config.authKeyAdditionalLabels` block from `custom_values.yaml` and re-apply Helm values, then restart `config-mgmt-service`:

kubectl rollout restart deploy/config-mgmt-service -n <namespace> (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

For full migration steps and a placeholder helper script, see Migrate YAML-managed Auth Keys to the UI.

Upgrade to 4.0.2

Pre-Upgrade Steps

No special pre-upgrade steps are required for this release.

4.0.1

Pre-Upgrade Steps

TLS Certificate Key Type Check (Required for Envoy Gateway deployments)

Release 4.0.1 upgrades the Envoy proxy to v1.36 with FIPS-compliant cryptography. The FIPS proxy rejects RSA private keys for TLS certificates delivered via SDS. All TLS secrets should be checked and any RSA certificates should be recreated as ECDSA.

If your deployment uses AWS ACM (tls.awsAcmEnabled: true), TLS is terminated at the NLB and Envoy never loads the listener certificate. You only need to check the internal envoy certificates (steps 1-2), not the listener certificate.

Scan all TLS secrets for RSA keys:

for secret in $(kubectl get secrets -n <namespace> --field-selector type=kubernetes.io/tls -o jsonpath='{.items[*].metadata.name}'); do
  KEY_TYPE=$(kubectl get secret $secret -n <namespace> -o jsonpath='{.data.tls\.key}' | base64 -d | openssl pkey -noout -text 2>/dev/null | head -1)
  if echo "$KEY_TYPE" | grep -q "2048\|4096\|3072"; then
    echo "RSA: $secret — needs recreation"
  else
    echo "ECDSA: $secret — OK"
  fi
done (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

Delete internal envoy certificates (certgen certs):

These are regenerated automatically by the certgen hook during the helm upgrade. The 4.0.1 certgen generates ECDSA P-256 keys.
```
kubectl delete secret envoy envoy-gateway envoy-rate-limit -n <namespace> (1)
```
1	Replace `<namespace>` with the namespace of your Kloudfuse deployment. Some of these secrets may not exist depending on your configuration; errors for missing secrets can be ignored.

Proceed with the helm upgrade to 4.0.1.
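
The scan above labels any key that is not 2048, 3072, or 4096 bits as ECDSA, which also covers secrets that could not be read or decoded. A slightly stricter classification can be sketched as follows, assuming the first line of `openssl pkey -noout -text` contains the key size in the form `(<bits> bit`. `classify_key` is a hypothetical helper, and the 1024-bit threshold is a heuristic chosen for this sketch.

```shell
# Classify a private key from the first line of `openssl pkey -noout -text`.
# Assumption: the line looks like "Private-Key: (256 bit)" or
# "Private-Key: (2048 bit, 2 primes)".
# Heuristic: RSA keys in practice are 1024 bits or larger, while the NIST
# elliptic curves top out at 521 bits.
classify_key() {
  bits="$(printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9]*\) bit.*/\1/p')"
  if [ -z "$bits" ]; then
    echo "unknown"      # empty or unrecognized header: inspect the secret manually
  elif [ "$bits" -ge 1024 ]; then
    echo "RSA"
  else
    echo "ECDSA"
  fi
}

# Example, reusing the pipeline from the scan above:
#   header="$(kubectl get secret "$secret" -n <namespace> -o jsonpath='{.data.tls\.key}' \
#     | base64 -d | openssl pkey -noout -text 2>/dev/null | head -1)"
#   classify_key "$header"
```

Reporting `unknown` separately keeps unreadable secrets from being counted as safe.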

Post-Upgrade Steps

TLS Listener Certificate Reissue (Required for cert-manager deployments)

After the upgrade, the Gateway resource has ECDSA annotations (cert-manager.io/private-key-algorithm: ECDSA). If your listener TLS certificate was RSA (identified in the pre-upgrade check), it must be deleted and reissued as ECDSA.

Skip this section if you use AWS ACM (tls.awsAcmEnabled: true) or if your listener certificate was already ECDSA.

Verify the Gateway has ECDSA annotations:

kubectl get gateway kfuse -n <namespace> -o yaml | grep -i 'private-key' (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

You should see:

cert-manager.io/private-key-algorithm: ECDSA
cert-manager.io/private-key-size: "256"

If these annotations are not present, add them manually:

kubectl annotate gateway kfuse -n <namespace> \
  cert-manager.io/private-key-algorithm=ECDSA \
  cert-manager.io/private-key-size="256" --overwrite (2)

2	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

If you have an internal gateway (`kfuse-internal`), repeat for that gateway as well.

Delete the listener TLS certificate:

kubectl delete secret <tls-secret-name> -n <namespace> (1)

1	Replace `<tls-secret-name>` with your TLS secret name (default is `letsencrypt-ingress-tls`) and `<namespace>` with the namespace of your Kloudfuse deployment. Check your `tls.secretName` value if you use a custom name.

cert-manager will automatically reissue the certificate as ECDSA within 30-60 seconds.

Verify the new certificate is ECDSA:

kubectl get secret <tls-secret-name> -n <namespace> \
  -o jsonpath='{.data.tls\.key}' | base64 -d | \
  openssl pkey -noout -text 2>/dev/null | head -1 (1)

1	Replace `<tls-secret-name>` and `<namespace>` with your values. The output should show `Private-Key: (256 bit)` (ECDSA P-256).

Restart the Envoy proxy to load the new certificate:
```
kubectl rollout restart deploy/envoy -n <namespace> (1)
```
1	Replace `<namespace>` with the namespace of your Kloudfuse deployment. If you have an internal envoy (`envoy-internal`), restart that as well.

Verify HTTPS is working:
```
curl -sk https://<your-hostname> -o /dev/null -w "%{http_code}"
```
Expected response: 401 (redirects to login).

If you use a BYOC (Bring Your Own Certificate) RSA cert, cert-manager cannot reissue it. You must obtain an ECDSA P-256 certificate from your certificate authority and update the Kubernetes TLS secret, or switch to AWS ACM for TLS termination.

Why is this needed?

The Envoy proxy FIPS build uses SafeLogic CryptoComply for cryptographic operations. When a TLS certificate is delivered to the proxy via SDS (Secret Discovery Service), the private key is imported into the FIPS cryptographic module. SafeLogic’s implementation runs a Pairwise Consistency Test (PCT) on RSA key import, which fails. ECDSA keys are not affected.

Internal certificates used for communication between Envoy components (certgen/xDS certs) are loaded from files and work with RSA, but are recreated as ECDSA for consistency.

TLS Setup	Envoy loads private key via SDS?	Action needed?
AWS ACM (NLB terminates TLS)	No	Pre-upgrade only (certgen certs)
cert-manager (Envoy terminates TLS)	Yes	Pre-upgrade + Post-upgrade (all certs)
BYOC (Envoy terminates TLS)	Yes	Pre-upgrade + reissue ECDSA cert from CA

4.0.0

Pre-Upgrade Steps

Hydration Service Deployment change (Required): K8s deployment type for hydration-service has changed from Deployment to StatefulSet. Before upgrading to 4.0, delete the existing Deployment and ConfigMap.

kubectl delete deployment -n <namespace> hydration-service (1)
kubectl delete cm -n <namespace> hydration-service (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

Envoy Gateway Migration (Optional): Starting in 4.0.0, Kloudfuse supports Envoy Gateway as an alternative to NGINX Ingress. Existing deployments can optionally migrate using a zero-downtime 3-step migration process. See Configure Envoy Ingress for details.

AWS CloudWatch Namespaces (Breaking Change)

Starting in 4.0.0, the awsNamespaces list in Helm values defaults to an empty list. Customers using AWS CloudWatch metrics scraping must explicitly list the namespaces they want to scrape. Before upgrading, add the desired namespaces under ingester.config.awsNamespaces in your Helm values override:

ingester:
  config:
    awsNamespaces:
      - "AWS/EC2"
      - "AWS/RDS"
      - "AWS/S3"
      - "AWS/Lambda"
      # Add other namespaces as needed

For the full list of supported namespaces, see AWS Supported Services.

Two new namespaces are now supported: AWS/Bedrock and AWS/QBusiness. If you enable these, ensure your AWS IAM scraper role policy includes the required permissions: bedrock:ListFoundationModels, bedrock:ListTagsForResource, qbusiness:ListApplications, qbusiness:GetApplication, qbusiness:ListTagsForResource.

Post-Upgrade Steps

Pinot STS Rollout Restart: After the upgrade, wait for the setup-pinot job to complete, then perform a rollout restart of the Pinot STS (StatefulSet) to pick up the updated schema and table configuration.

kubectl rollout restart statefulset/pinot-server-offline
kubectl rollout restart statefulset/pinot-server-realtime
kubectl rollout restart statefulset/pinot-broker
kubectl rollout restart statefulset/pinot-controller
kubectl rollout restart statefulset/pinot-minion
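
If you prefer to confirm that each StatefulSet has finished rolling before moving on, the restarts can be wrapped in a loop that also waits on `kubectl rollout status`. This is a sketch under stated assumptions: `restart_pinot` is a hypothetical helper, the namespace is passed as an argument, the 15-minute timeout is arbitrary, and restarting the components one at a time is a choice made for this sketch, not a documented requirement.

```shell
# Restart each Pinot StatefulSet and wait for its rollout to complete.
# Argument: namespace of the Kloudfuse deployment.
restart_pinot() {
  namespace="$1"
  for sts in pinot-server-offline pinot-server-realtime pinot-broker \
             pinot-controller pinot-minion; do
    kubectl rollout restart "statefulset/$sts" -n "$namespace" || return 1
    # Blocks until the rollout finishes or the timeout (an assumed value) expires.
    kubectl rollout status "statefulset/$sts" -n "$namespace" --timeout=15m || return 1
  done
}
```

For example, `restart_pinot kfuse` stops at the first StatefulSet that fails to restart or does not become ready in time.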

3.5.3-p1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.5.3-p1.

3.5.3

Pre-Upgrade Steps

Metrics Transformer GOMEMLIMIT

Multi-resolution rollup increases memory usage in the metrics transformer. Review and increase the GOMEMLIMIT setting for the metrics transformer if needed, especially for high-cardinality deployments.

Kafka Rollup Topic

Metrics rollup is enabled by default in 3.5.3. Before upgrading, ensure that kf_metrics_rollup_topic is included in your Kafka topics list in the Helm values. Specify the same number of partitions as the existing kf_metrics_topic.

RUM Vitals Kafka Topic (Conditional)

If RUM is enabled in your deployment, ensure that kf_rum_vitals_topic is included in your Kafka topics list in the Helm values. Specify the same number of partitions as the existing kf_rum_views_topic.

Legacy Rollup Interval (Conditional)

If you were using a non-default rollup interval before 3.5.3 (for example, 600s instead of the default 300s), set the legacy interval so that pre-3.5.3 rollup data is attributed to the correct resolution:

global:
  metrics:
    legacyRollupIntervalSecs: 600

Post-Upgrade Steps

Multi-Rollup Resolution: After the upgrade, wait for the setup-pinot job to complete, then restart the Pinot STS (StatefulSet) to pick up the new rollup schema and table configuration. Existing metrics data is automatically compatible with multi-rollup resolution.
Pod Security Configuration (Optional): Review your Helm values if you use custom security configurations. All services now run as non-root users by default and support configurable service accounts and security context. Existing deployments without custom security settings are unaffected.
Scheduled Views: Scheduled views have been redesigned. You must recreate your scheduled views after upgrading. Data continuity from the previous implementation cannot be guaranteed.

3.5.2-p2

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.5.2-p2.

3.5.2-p1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.5.2-p1.

3.5.2

Pre-Upgrade Steps

There are no specific pre-upgrade steps for upgrading to Release 3.5.2.

Post-Upgrade Steps

Container Image Signature Verification (Optional)

Starting with 3.5.2, all Kloudfuse container images and Helm charts are signed. You can optionally verify image signatures before deployment.

Folder Permissions

If you use folder-based organization for dashboards and alerts, review folder permissions after upgrade to ensure appropriate access levels are configured.

Logs Parser Restart

After upgrading to 3.5.2, you must restart the logs-parser to ensure proper functionality.

The following commands use kfuse as the default namespace. Replace with your actual namespace if different.

Scale down the logs-parser:

kubectl scale sts logs-parser -n kfuse --replicas=0

Verify that all logs-parser pods are terminated before proceeding:
```
kubectl get pods -n kfuse -l app=logs-parser
```
Scale up the logs-parser to match the numNodes value configured in your custom values YAML:
```
kubectl scale sts logs-parser -n kfuse --replicas=<numNodes>
```
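
The three logs-parser steps can also be combined into one helper that waits for the pods to disappear before scaling back up. This is a minimal sketch: `restart_logs_parser` is a hypothetical name, the namespace and replica count are passed as arguments, and the 5-minute polling limit is an assumption.

```shell
# Scale logs-parser to zero, wait until its pods are gone, then scale back up.
# Arguments: namespace, replica count (the numNodes value from your values file).
restart_logs_parser() {
  namespace="$1"
  replicas="$2"
  kubectl scale sts logs-parser -n "$namespace" --replicas=0 || return 1
  # Poll every 5 seconds until no pod with the label remains (up to 60 tries).
  tries=0
  while [ -n "$(kubectl get pods -n "$namespace" -l app=logs-parser -o name 2>/dev/null)" ]; do
    tries=$((tries + 1))
    if [ "$tries" -ge 60 ]; then
      echo "logs-parser pods still present after 5 minutes" >&2
      return 1
    fi
    sleep 5
  done
  kubectl scale sts logs-parser -n "$namespace" --replicas="$replicas"
}
```

For example, `restart_logs_parser kfuse 3` restarts a three-replica logs-parser in the default namespace.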

3.5.1-p2

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.5.1-p2.

3.5.1-p1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.5.1-p1.

3.5.1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.5.1.

3.5.0

Pre-Upgrade Steps

To support improved management of RUM session recordings, ensure that your global.kafkaTopics section in the custom-values.yaml file contains the following additional topic:

        - name: kf_rum_expired_sessions_topic
          partitions: 1
          replicationFactor: 1

3.4.4 - p1

This guide covers upgrading to version 3.4.4-p1 from either 3.4.3 or 3.4.4; the only difference between the two paths is the Phase 1 configuration.

Important Notes

Indentation Matters

The kafka-kraft section must be at the same indentation level as the kafka section (root level), NOT under the global section.

Disk Size

Always copy the persistence disk size from your existing kafka broker to the kafka-kraft broker configuration.

Version Consistency

Use version 3.4.4-p1 for all helm upgrade commands throughout the process.

Scripts

The following scripts referenced in this guide are available at https://github.com/kloudfuse-ext/customer/tree/main/scripts:

pause_consumption.sh - Pauses Pinot consumption on all tables
resume_consumption.sh - Resumes Pinot consumption on all tables
get_consuming_segments_info.sh - Gets current status of consuming segments

Pre-Upgrade Steps

Update kfuse-vector Configuration

The kfuse-vector component has been renamed to kfuse-archival-vector. If your values.yaml contains a kfuse-vector section, you must rename it before upgrading to 3.4.4:

# Old configuration (3.4.3 and earlier)
kfuse-vector:
  <your-configuration>

# New configuration (3.4.4 and later)
kfuse-archival-vector:
  <your-configuration>

Phase 1: Deploy Both Legacy Kafka and Kafka-Kraft

Deploy both legacy kafka and kafka-kraft services, but continue using legacy kafka for all operations.

Update custom_values.yaml

Add the three legacy flags under the global.kafka section:

global:
  kafka:
    deployLegacy: true
    useLegacy: true
    ingesterUseLegacy: true

Configure Kafka Services

Add the kafka-kraft section at the same indentation level as the kafka section (NOT under global).

Copy the existing kafka.broker disk size to kafka-kraft.broker. For example, if your existing kafka has persistence of 200Gi, use the same value in the kafka-kraft section. If you use a custom storageClass for kafka-broker instead of kfuse-ssd, include it in the kafka-kraft.broker section and add a kafka-kraft.controller section that sets storageClass.

If upgrading from 3.4.3

kafka:
  broker:
    persistence:
      size: 200Gi
      storageClass: <storage-class-name> #Optional

kafka-kraft:
  broker:
    persistence:
      size: 200Gi
      storageClass: <storage-class-name> #Optional

#Optional
  controller:
    persistence:
      storageClass: <storage-class-name>

Run Helm Upgrade

helm upgrade -n kfuse kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse -f custom_values.yaml --version 3.4.4-p1

Wait for Deployment

Wait for kafka-kraft-broker and kafka-kraft-controller pods to be up and running, and for the kafka topic creator job to finish.

Phase 2: Switch Ingester to Kafka-Kraft

Switch the ingester to use the new kafka-kraft by removing ingesterUseLegacy from custom_values.yaml.

Update custom_values.yaml

Remove only the ingesterUseLegacy flag from the global.kafka section. The kafka and kafka-kraft sections remain unchanged:
```
global:
  kafka:
    deployLegacy: true
    useLegacy: true
```

Run Helm Upgrade

helm upgrade -n kfuse kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse -f custom_values.yaml --version 3.4.4-p1

Check Kafka Consumer Lag

Check the Kafka consumer lag on kafka-broker-0 by running the commands below. The output lists multiple topics in several columns. Once every value in the LAG column is zero, move on to the next step.
```
kubectl exec -ti -n kfuse kafka-broker-0 -- bash
unset JMX_PORT
/opt/bitnami/kafka/bin/kafka-consumer-groups.sh \
  --bootstrap-server :9092 --describe --all-groups
```
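Reading the LAG column by eye is error-prone when many groups are listed, so the check can be summed up with a small filter. This is a sketch that assumes the standard `kafka-consumer-groups.sh --describe` layout, where each header row starts with `GROUP` and contains a `LAG` column; `total_consumer_lag` is a hypothetical helper, and non-numeric lag entries such as `-` are skipped.

```shell
# Read `kafka-consumer-groups.sh --describe` output on stdin and print the
# total lag summed over all partitions of all groups.
total_consumer_lag() {
  awk '
    # Each header row names the columns; remember the position of LAG.
    $1 == "GROUP" {
      lag_col = 0
      for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i
      next
    }
    # Data rows: add the lag when it is a plain number.
    lag_col && NF >= lag_col && $lag_col ~ /^[0-9]+$/ { total += $lag_col }
    END { print total + 0 }
  '
}

# Example (run from your workstation; the inner command is the one shown above):
#   kubectl exec -n kfuse kafka-broker-0 -- bash -c \
#     'unset JMX_PORT; /opt/bitnami/kafka/bin/kafka-consumer-groups.sh --bootstrap-server :9092 --describe --all-groups' \
#     | total_consumer_lag
```

A printed total of 0 corresponds to the "all lag values are zero" condition for moving on.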
Pause Pinot Consumption

Pause Pinot consumption by first port-forwarding to pinot-controller-0 and then running the pause_consumption.sh script:
```
kubectl port-forward -n kfuse pinot-controller-0 9000:9000
```
Then run the pause_consumption.sh script.

Wait for Segment Sealing

Run get_consuming_segments_info.sh (the pinot-controller port-forward must still be active) to get the current status. Before continuing with the upgrade, all segments must be sealed: the _segmentToConsumingInfoMap element must be an empty map ({}) for every table.
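
The sealed-segment check can be scripted. The sketch below reads the script output on standard input and assumes each `_segmentToConsumingInfoMap` entry is printed on a single line, as in the example output in this section; `all_segments_sealed` is a hypothetical helper name.

```shell
# Read get_consuming_segments_info.sh output on stdin.
# Succeeds (exit 0) only if at least one table was reported and every
# "_segmentToConsumingInfoMap" value is the empty map {}.
all_segments_sealed() {
  awk '
    /"_segmentToConsumingInfoMap"/ {
      seen++
      if ($0 ~ /"_segmentToConsumingInfoMap" *: *[{] *[}]/) empty++
    }
    END { exit !(seen > 0 && seen == empty) }
  '
}

# Example:
#   ~/get_consuming_segments_info.sh | all_segments_sealed && echo "all segments sealed"
```

Requiring at least one reported table keeps an empty or failed script run from being read as success.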

Example output when segments are sealed:

~/get_consuming_segments_info.sh
Fetching realtime tables...
Found tables:
kf_events_REALTIME
kf_logs_REALTIME
kf_logs_views_REALTIME
kf_metrics_REALTIME
kf_metrics_rollup_REALTIME
kf_rum_actions_REALTIME
kf_rum_errors_REALTIME
kf_rum_longtasks_REALTIME
kf_rum_resources_REALTIME
kf_rum_views_REALTIME
kf_traces_REALTIME
kf_traces_errors_REALTIME

Getting consuming segments info for: kf_events
  (from kf_events_REALTIME)
{"serversFailingToRespond":0,
 "serversUnparsableRespond":0,
 "_segmentToConsumingInfoMap":{}}

Getting consuming segments info for: kf_logs
  (from kf_logs_REALTIME)
{"serversFailingToRespond":0,
 "serversUnparsableRespond":0,
 "_segmentToConsumingInfoMap":{}}

Getting consuming segments info for: kf_logs_views
  (from kf_logs_views_REALTIME)
{"serversFailingToRespond":0,
 "serversUnparsableRespond":0,
 "_segmentToConsumingInfoMap":{}}

Phase 3: Switch All Services to Kafka-Kraft

Switch all other services to use kafka-kraft.

Update custom_values.yaml

Remove the global.kafka and kafka sections from custom_values.yaml. Only the kafka-kraft section is needed. The default helm configuration for 3.4.4-p1 already uses the new kafka for all services.

Run Helm Upgrade

helm upgrade -n kfuse kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse -f custom_values.yaml --version 3.4.4-p1

Re-enable Pinot Consumption

Once the setup-pinot job has completed, re-enable pinot consumption on all tables by running resume_consumption.sh.

Post-Upgrade Steps

Let the New Kafka-Kraft Bake for 24hrs

After successful migration and a waiting period of 24hrs, the legacy kafka-broker and kafka-zookeeper PVCs should be deleted:

kubectl get pvc -n kfuse | grep kafka-zookeeper
# Add the pvc names for all kafka-zookeeper instances
kubectl delete pvc -n kfuse data-kafka-zookeeper-0

kubectl get pvc -n kfuse | grep kafka-broker
# Add the pvc names for all kafka-broker instances
kubectl delete pvc -n kfuse data-kafka-broker-0

3.4.3

Pre-Upgrade Steps

If you plan to use GCP Stackdriver metrics and enrichment features, create a GCP service account secret before upgrading.

Follow the instructions at GCP Metrics — Service Account to create a service account with the required permissions.

Create the secret in your Kubernetes cluster:

kubectl create secret generic kfuse-gcp-credentials \
  --from-file=key.json=<path-to-service-account-json>

Configure the secret name in your values.yaml:

global:
  gcpConfig:
    secretName: "kfuse-gcp-credentials"

Post-Upgrade Steps

After upgrading to 3.4.3, perform a rolling restart of all Pinot components to ensure proper initialization:

kubectl rollout restart statefulset -l app=pinot

Verify all Pinot pods are running:

kubectl get pods -l app=pinot

3.4.2-p1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.4.2-p1.

3.4.2

Pre-Upgrade Steps

Starting with 3.4.2, the AZ service is enabled by default. To ensure a successful upgrade, configure the cloudStorage section in your values.yaml file.

You can define storage either:
- At the service level (pinot.deepStore or az-service.cloudStore)
- At the global cloudStorage section

Service-level settings always take precedence. If both are present, the upgrade continues to work as is. We recommend consolidating into the global cloudStorage section for consistency across services.

Configure the storage backend. Supported types are s3, gcs, and azure:

global:
  cloudStorage:
    # Supported types: s3, gcs, azure
    type: s3
    useSecret: true
    secretName: cloud-storage-secret

    # S3-specific
    s3:
      region: <specify region>
      bucket: <specify bucket>

    # GCS-specific
    gcs:
      bucket: <specify bucket>

    # Azure-specific
    azure:
      container: <specify container>

If you use secrets for authentication, create them outside of Kloudfuse using kubectl:

S3 – secret must include accessKey and secretKey:

kubectl create secret generic cloud-storage-secret \
  --from-literal=accessKey=<accessKey> \
  --from-literal=secretKey='<secretKey>'

GCS – secret must include the JSON credentials file (saved as secretKey):
```
kubectl create secret generic cloud-storage-secret \
  --from-file=./secretKey
```

Azure – secret must include the storage account connectionString:

kubectl create secret generic cloud-storage-secret \
  --from-literal=connectionString=<connectionString>

If Pinot was previously configured with deepStore, migrate it:
- Remove the cloud storage configuration from the pinot deepStore section.
- Replace dataDir with prefix in the service section.
- Move the bucket name to the global config; everything after the bucket in the path becomes the prefix.

Example: If dataDir was:

s3://kfuse-bucket/pisco/controller/data

Set:

global:
  cloudStorage:
    type: s3
    s3:
      bucket: kfuse-bucket

pinot:
  deepStore:
    enabled: true
    prefix: pisco/controller/data

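The bucket and prefix split described above can be sketched with shell parameter expansion. `split_data_dir` is a hypothetical helper, and it assumes the dataDir value has the form `<scheme>://<bucket>/<path>`.

```shell
# Split a deep-store URI such as s3://bucket/some/path into bucket and prefix.
# Prints two lines: the bucket name, then the prefix (empty if there is none).
split_data_dir() {
  uri="$1"
  rest="${uri#*://}"        # drop the scheme, e.g. "s3://"
  bucket="${rest%%/*}"      # text before the first "/"
  case "$rest" in
    */*) prefix="${rest#*/}" ;;   # everything after the first "/"
    *)   prefix="" ;;
  esac
  printf '%s\n%s\n' "$bucket" "$prefix"
}
```

For example, `split_data_dir s3://kfuse-bucket/pisco/controller/data` prints `kfuse-bucket` on the first line and `pisco/controller/data` on the second, matching the values used in the configuration above.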

Post-Upgrade Steps

No additional steps are required after the upgrade.

3.4.1

Pre-Upgrade Steps

Update the Pinot configuration in your deployment YAML to use jvmMemory.

Post-Upgrade Steps

Restart Pinot to apply any configuration changes:

kubectl rollout restart sts -n kfuse pinot-server-realtime pinot-controller pinot-broker

The default namespace is kfuse. If your deployment uses a different namespace, replace kfuse with the appropriate namespace.

3.4.0 - p2

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.4.0-p2.

3.4.0 - p1

Pre-Upgrade Steps

The default disk size of the Pinot minion PVC has changed. To successfully upgrade to this version, delete the Pinot minion StatefulSet and its PVC by running:

kubectl delete sts -n <namespace> pinot-minion (1)
kubectl delete pvc -l app.kubernetes.io/instance=kfuse -l component=minion -n <namespace> (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

Post-Upgrade Steps

After completing the upgrade, run the following command:

kubectl rollout restart deployment -n <namespace> kfuse-grafana (1)

1	Replace `<namespace>` with the namespace of your Kloudfuse deployment.

3.4.0

Pre-Upgrade and Post-Upgrade Steps

Perform the following check before and after upgrading to ensure the admin user configuration is correct:

Verify the admin user configuration in the alerts database:

kubectl exec -it kfuse-configdb-0 -- bin/bash
psql -U postgres -d alertsdb

select * from public.user where login='admin';
select * from public.user where email='admin@localhost';

Both queries should return the same row with id = 1. If they return different IDs, fix it using the following operations:

UPDATE public.user SET id=1 where email='admin@localhost';
DELETE from public.user where id=<ID from the output of the first command>;

Then restart Grafana:

kubectl rollout restart deployment kfuse-grafana

3.3.6

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.3.6.

3.3.5

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.3.5.

3.3.4

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.3.4.

3.3.3

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.3.3.

3.3.2

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.3.2.

3.3.1

There are no specific post-upgrade steps for this release.

Pre-Upgrade Steps

If your Kloudfuse configuration has RBAC enabled, you must also enable Audit Logs. Set both feature flags, RBACEnabled and EnableAuditLogs, to true in your yaml configuration file.

Set RBAC and Audit Logs feature flags

global:
...
  RBACEnabled: true
  EnableAuditLogs: true
...

3.3.0

There are no specific post-upgrade steps for this release.

Pre-Upgrade Steps

If your organization runs Kloudfuse on a shared cluster, or if it has the az-service enabled (it has taints and labels), update the following configuration in the values.yaml file before upgrading.

config-mgmt-service:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: ng_label
            operator: In
            values:
            - az1
  tolerations:
  - key: "ng_taint"
    operator: "Equal"
    value: "az1"
    effect: "NoSchedule"

The configuration for label tracking is now part of the global section. If your organization tracks labels, move their definition to the global section.