Upgrade Kloudfuse

Upgrade command

  1. Before performing an upgrade, validate that the upgrade won’t revert any customization on your cluster. See Upgrade validation, and the values comparison sketch after this list.

  2. To check which Kloudfuse version you are running, run the following command:

    helm list -n kfuse
  3. Run the upgrade command.

    helm upgrade --install -n kfuse kfuse \
    oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
    --version <VERSION.NUM.BER>  \ (1)
    -f custom_values.yaml
    1 version: a valid Kloudfuse release number; use the most recent release.
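
As a quick check for step 1, you can compare the values currently deployed on the cluster against your customization file before upgrading. This is a minimal sketch; the exported file name is illustrative, and a broader review is described in Upgrade validation.

# Export the values currently applied to the running release
helm get values kfuse -n kfuse > deployed_values.yaml

# Compare against the overrides you pass to helm upgrade
diff deployed_values.yaml custom_values.yaml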

Upgrading to Latest Kloudfuse Releases

3.2.1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.2.1.

3.2.0

There is a specific step for upgrading to Release 3.2.0.

Post-upgrade

After the upgrade, restart the Pinot services:

kubectl rollout restart sts pinot-broker pinot-controller pinot-server-realtime pinot-server-offline

This step takes care of a race condition related to the raw index version change.
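
To confirm that the restart completed, you can watch the rollout status of each statefulset; a minimal sketch, assuming the default kfuse namespace:

for sts in pinot-broker pinot-controller pinot-server-realtime pinot-server-offline; do
  kubectl rollout status sts "$sts" -n kfuse
done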

3.1.3

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 3.1.3.

3.1.2

There are specific steps for upgrading to Release 3.1.2.

Pre-Upgrade

Before upgrading to Version 3.1.2 or TOT, run the following command:

kubectl delete deployments.apps catalog-service rulemanager advance-functions-service
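
To verify that the deployments were removed before upgrading, you can check for them explicitly; a minimal sketch, assuming the default kfuse namespace:

kubectl get deployments -n kfuse | grep -E 'catalog-service|rulemanager|advance-functions-service'
# No output means the deployments are gone and the upgrade can proceed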

3.1.0

Pre-Upgrade

Because this release fixes the labels and labelSelector so that some components can match the rest, you must run this command before upgrading to Release 3.1.0 or TOT.

kubectl delete deployments.apps catalog-service rulemanager advance-functions-service

Post-Upgrade

  1. Restart Pinot Services

    kubectl rollout restart sts pinot-broker pinot-controller pinot-server-realtime pinot-server-offline
  2. We moved hydration-service (HS) from a deployment to a statefulset. You must manually delete the pod associated with the old deployment.

    kubectl delete pod hydration-service-<tag>

    The HS pod now runs under a new pod name. Use the following command to find it, or use the combined one-liner after this list.

    kubectl get pods | grep hydration-service
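
If you prefer a single command, the lookup and deletion can be combined; a minimal sketch, assuming the default kfuse namespace:

kubectl delete -n kfuse $(kubectl get pods -n kfuse -o name | grep hydration-service)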

2.7.4

Pre-Upgrade

For RBAC, before upgrading to Release 2.7.4 from Release 2.7.3, check for a blank user row: click the Admin tab, and select User Management. A blank row has empty login and email fields and a random id. Delete that row directly in the UI.

Alternatively, complete these steps in the console:

  1. Run the kfuse-postgres.sh script to open a psql shell on the config database pod.

    #!/usr/bin/env bash
    
    # Optional parameters:
    # 1. pod name - default kfuse-configdb-0
    # 2. namespace - default kfuse
    # 3. database name - default configdb
    
    kubectl exec -it ${1:-kfuse-configdb-0} -n ${2:-kfuse} -- bash -c "PGPASSWORD=\$POSTGRES_PASSWORD psql -U postgres -d ${3:-configdb}"
  2. Delete users with null emails and logins.

    ./kfuse-postgres.sh kfuse-configdb-0 kfuse rbacdb
    
    rbacdb=# DELETE FROM users where email ISNULL and login ISNULL;
    DELETE 1
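
To confirm the cleanup, re-check for blank users in the same rbacdb shell; a minimal sketch that should return zero rows after the delete:

rbacdb=# SELECT id, login, email FROM users WHERE email ISNULL AND login ISNULL;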

Post-Upgrade

Restart the Pinot offline servers:

kubectl rollout restart sts pinot-server-offline

Then port-forward the trace-query-service and refresh the services in the APM store:

kubectl port-forward --namespace kfuse deployments.apps/trace-query-service 8080:8080
curl -X POST http://localhost:8080/v1/trace/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query { refreshServicesInApmStore(lookbackDays: 1) }"
  }'

2.7.3

Upgrade to Release 2.7.3:

  1. In the custom-values.yaml file, set the value pinot.server.realtime.replicaCount to 0.

    Make a note of the original value of this field; you must restore it later (see the reference sketch below).

  2. Run helm upgrade as usual.

  3. Ensure that all pods and jobs are finished successfully.

  4. In the custom-values.yaml file, set the value pinot.server.realtime.replicaCount to its original value.

  5. Run helm upgrade again.

Alternatively, run the following command, where N is the desired replica count:

kubectl scale sts pinot-server-realtime --replicas=N
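
For reference, the replicaCount override from steps 1 and 4 looks roughly like this in the custom-values.yaml file; a minimal sketch:

pinot:
  server:
    realtime:
      replicaCount: 0   # set to 0 for the first upgrade, then restore the original value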

2.7.2

This release changes the RBAC implementation.

  1. You may see numeric IDs in the email field of the users. To populate Kloudfuse with correct emails, delete all users. Kloudfuse recreates individual users as they log in, with correct email values.

  2. Create new groups after completing this step. You can then assign users to groups, policies to users and groups, and so on.

Post-upgrade

  1. Connect to rbacdb.

    > ./kfuse-postgres.sh kfuse-configdb-0 kfuse rbacdb
  2. Make a note of each user id with a null grafana_id that resulted from the RBAC migration.

    rbacdb=# select id from users where grafana_id IS NULL;
  3. Clean up empty users in the RBAC database.

    rbacdb=# delete from users where grafana_id IS NULL;
  4. For each user_id that you noted earlier, delete the user from the group.

    rbacdb=# delete from group_members where user_id='<user-id>';

2.7.1

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 2.7.1.

2.7.0

Pre-upgrade

This release upgrades packages to remove service vulnerabilities.

  1. Before helm upgrade, run the kafka-upgrade.sh script. Expect some downtime between running the script and helm upgrade.

  2. Edit the custom_values.yaml file, and move the existing kafka block under kafka.broker, as shown:

    kafka:
      broker:
        <<previous kafka block>>
  3. Add these topics to the kafkaTopics section to ensure record-replay.

      kafkaTopics:
        - name: kf_commands
          partitions: 1
          replicationFactor: 1
        - name: kf_recorder_data
          partitions: 1
          replicationFactor: 1
  4. Add a recorder section with the same affinity and toleration values as the ingester. If the ingester has no affinity or toleration values, don’t add the recorder section.

    recorder:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: ng_label
                operator: In
                values:
                - amrut
      tolerations:
      - key: "ng_taint"
        operator: "Equal"
        value: "amrut"
        effect: "NoSchedule"
  5. If you use AWS enrichment, the config format in the values changed. See AWS Services.

  6. Upgrade the stack; see Upgrade command.

2.6.8

There are no specific pre-upgrade or post-upgrade steps for upgrading to Release 2.6.8.

2.6.7

Release 2.6.7 introduces Identity for Databases. It takes effect on newly-ingested APM-related data.

We increased timestamp granularity for APM/span data from millisecond to nanosecond, because it provides better accuracy for the Trace Flamegraph and Waterfall visuals.

Pre-upgrade

SLO

We re-enabled SLO in this release, with enhanced features.

  1. Use the kfuse-postgres.sh script to connect to the SLO database.

  2. Drop the slodbs table.

    > ./kfuse-postgres.sh kfuse-configdb-0 kfuse slodb
    
    slodb=# drop table slodbs;

APM

You must convert older APM data to Kloudfuse 2.6.5 APM Service Identity format.

APM data ingested before Release 2.6.5 is incompatible and does not render properly in the APM UI. You can optionally convert the older data to the current format. The conversion process may take time, depending on the volume of data. When enabled, the conversion runs when the Pinot servers start and load the segments.

  1. To enable the conversion, ensure that the custom_values.yaml file has the following configuration:

    pinot:
      traces:
        serviceHashConversionEnabled: true
      traces_errors:
        serviceHashConversionEnabled: true
      metrics:
        serviceHashConversionEnabled: true
  2. Disable the KV Cardinality limit on the Pinot Metrics table.

    pinot:
      metrics:
        kvTotalCardinalityThreshold: 0
  3. Increase the heap allocation for the Pinot offline servers. Segment conversion requires additional memory; temporarily double the memory for the Pinot offline servers in the custom_values.yaml file (a combined sketch of steps 1 through 3 appears after this list).

    pinot:
      server:
        offline:
          jvmOpts: "<Adjust the Xmx and Xms settings here>"
  4. Reduce the helix threads to 10.

    kubectl port-forward -n kfuse pinot-controller-0 9000:9000
    curl -X POST "http://localhost:9000/cluster/configs" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"STATE_TRANSITION.maxThreads\": \"10\"}"
    # Verify using:
    curl -X GET "http://localhost:9000/cluster/configs"
  5. Run the standard upgrade command using the updated custom_values.yaml file. See Upgrade command.
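
Taken together, the overrides from steps 1 through 3 look roughly like this in the custom_values.yaml file; a minimal sketch, with the jvmOpts value shown as a placeholder:

pinot:
  traces:
    serviceHashConversionEnabled: true
  traces_errors:
    serviceHashConversionEnabled: true
  metrics:
    serviceHashConversionEnabled: true
    kvTotalCardinalityThreshold: 0
  server:
    offline:
      jvmOpts: "<roughly double the current Xmx and Xms>"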

Post-upgrade

  1. The upgrade includes changes to Pinot table configuration.

    Restart Pinot servers to ensure that the configuration is updated.

    kubectl rollout restart sts -n kfuse pinot-server-offline pinot-server-realtime
  2. Converting all Pinot segments takes time. The table segments status in the Pinot controller UI console reflects the loaded (converted) segments. Connect to the Pinot controller and monitor until all segments are in a good state; the conversion is then complete.

    # Create port-forward to the pinot controller
    kubectl port-forward -n kfuse pinot-controller-0 9000:9000
    # From the browser, go to localhost:9000
  3. After conversion finishes, revert the helix threads back to the default setting.

    kubectl port-forward -n kfuse pinot-controller-0 9000:9000
    curl -X DELETE "http://localhost:9000/cluster/configs/STATE_TRANSITION.maxThreads" -H "accept: application/json"
  4. Revert the cardinality threshold configuration and heap allocation of the Pinot server offline servers in the custom_values.yaml file.

  5. Run the upgrade again. See Upgrade command.

  6. In some special cases, you may have to force a re-conversion of segments: before the upgrade, delete the pinot-server-offline STS and PVC, and then run the conversion steps. This forces the older segments to download again from the deep store.

    kubectl delete sts -n kfuse pinot-server-offline
    kubectl delete pvc -l component=server-offline -n kfuse