Cloud-Provider Managed Services

Overview

By default, Kloudfuse bundles PostgreSQL, Redis, and Kafka within the Helm chart. For production deployments — particularly those requiring compliance, high availability, or multi-zone resilience — you can replace these with cloud-provider managed equivalents:

Service | Bundled (default) | Managed alternative
PostgreSQL 14+ (18+ for FIPS) | kfuse-configdb StatefulSet | AWS RDS, GCP Cloud SQL, Azure Database for PostgreSQL
Redis | kfuse-redis StatefulSet (Sentinel mode) | AWS ElastiCache, GCP Memorystore, Azure Cache for Redis
Kafka | kafka-kraft StatefulSet (KRaft mode) | AWS MSK, Confluent Cloud, Azure Event Hubs for Kafka

Common reasons to use managed services:

  • Compliance — Databases and message brokers must reside on managed, audited infrastructure.

  • Corporate policy — Centralized operations teams manage data services separately.

  • Multi-zone availability — Services must be reachable across availability zones or regions.

  • Operational simplicity — Offload backup, patching, and scaling to the cloud provider.

Managed Database (PostgreSQL)

Kloudfuse uses PostgreSQL for storing configuration, alert rules, RBAC policies, and other platform state across multiple databases (configdb, alertsdb, beffedb, logsconfigdb, and others).

The bundled deployment runs a kfuse-configdb StatefulSet inside the cluster. When switching to a managed PostgreSQL service, you point Kloudfuse at the external endpoint and disable the bundled StatefulSet.

For full details, see Cloud Provider Backend Database.

Requirements

  • PostgreSQL version 14 or later (version 18+ required for FIPS environments)

  • Minimum 2 cores, 2 GB memory, 50 GB disk

  • Port 5432 (custom ports are not currently supported)

  • A database user with CREATEDB privilege (the user creates and owns all Kloudfuse databases)

  • For FIPS/compliance environments, use a non-default username (not postgres) — see FIPS Installation Guide

  • Network accessible from the Kubernetes cluster

Credentials

Create a Kubernetes secret containing the database password:

export PG_CREDENTIAL="YOUR_PASSWORD"
kubectl create secret generic kfuse-pg-credentials \
  --from-literal=postgres-password="$PG_CREDENTIAL" \
  --from-literal=postgresql-password="$PG_CREDENTIAL" \
  --from-literal=postgresql-replication-password="$PG_CREDENTIAL"
unset PG_CREDENTIAL

Replace YOUR_PASSWORD with the actual password. Be aware of how your shell handles special characters. For FIPS environments, use a non-default username (not postgres) — see FIPS Installation Guide for details.
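Shell quoting is the usual pitfall with special characters. A quick local check, using a made-up password, showing that single quotes preserve the value exactly:

```shell
# Single quotes prevent the shell from expanding $, !, backticks, and backslashes.
PG_CREDENTIAL='p@ss$word!123'

# Confirm the stored value is byte-for-byte what you typed:
printf '%s\n' "$PG_CREDENTIAL"
# -> p@ss$word!123
```

If the printed value differs from what you intended, fix the quoting before creating the secret.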

Helm Configuration

Update your custom-values.yaml with the following parameters:

installKfusePgCredentials: false  (1)

global:
  configDB:
    host: "your-rds-endpoint.rds.amazonaws.com"  (2)
    username: "<your-db-username>"  (3)
  orchestratorDB:
    host: "your-rds-endpoint.rds.amazonaws.com"  (2)
    username: "<your-db-username>"  (3)

kfuse-configdb:
  enabled: false  (4)

ingester:
  postgresql:
    enabled: false  (5)
1 Disable auto-creation of the PostgreSQL credentials secret (you created it manually above).
2 DNS endpoint of your managed PostgreSQL instance.
3 The database username. For FIPS environments this must be a non-default user — see FIPS Installation Guide for the full list of per-service username overrides.
4 Disable the bundled kfuse-configdb StatefulSet.
5 Disable the bundled orchestrator PostgreSQL.

TLS Configuration

To enable SSL/TLS for PostgreSQL connections, create a Kubernetes secret containing the CA certificate used to verify the server’s identity.

For AWS RDS, download the global CA bundle:

curl -o rds-ca-bundle.pem https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem
kubectl create secret generic pg-tls-ca-cert \
  --from-file=ca.crt=rds-ca-bundle.pem

For Google Cloud SQL or Azure Database for PostgreSQL, download the CA certificate from the cloud console and substitute it above.
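Before relying on verify-full, you can confirm that the server presents a certificate chaining to the downloaded bundle. A sketch using openssl against a placeholder endpoint (OpenSSL 1.1.1 or later supports -starttls postgres):

```shell
# Handshake against the PostgreSQL port and verify the chain with the CA bundle;
# the -brief output reports the verification result.
openssl s_client -starttls postgres \
  -connect your-rds-endpoint.rds.amazonaws.com:5432 \
  -CAfile rds-ca-bundle.pem -brief < /dev/null
```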

Then update your custom-values.yaml to enable TLS:

global:
  configDB:
    tls:
      enabled: true
      sslmode: "verify-full"   (1)
      existingSecret: "pg-tls-ca-cert"  (2)
      certSecretKey: "ca.crt"           (3)
  orchestratorDB:
    tls:
      enabled: true
      sslmode: "verify-full"
      existingSecret: "pg-tls-ca-cert"
      certSecretKey: "ca.crt"
1 verify-full validates the server certificate against the CA and checks that the hostname matches the certificate. Use require to encrypt without certificate validation.
2 Name of the Kubernetes secret containing the CA certificate (created above).
3 Key within the secret that holds the CA certificate.

Create Databases

Kloudfuse requires the following databases on your managed PostgreSQL instance (as of version 3.5.3):

PostgreSQL host | Databases
global.configDB.host | configdb, alertsdb, beffedb, logsconfigdb, slodb, apmconfigdb, rbacdb, samldb, kfusesamldb, hydrationdb, logs_metadata, rumdb, deletiondb
global.orchestratorDB.host | orchestratordb, apmservicesdb

The Helm chart automatically creates all required databases during installation via an init job. The databases are created with OWNER set to the configured username, so the application user has full privileges within each database.

When global.fips.enabled: true, the database initialization uses the svc-ctl init job from the config-mgmt-service chart instead of the default init job. Ensure global.configDB.username and global.orchestratorDB.username are set so the databases are created with the correct owner.
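If you prefer to pre-create the databases yourself, or want to verify what the init job should have produced, the table above can be turned into SQL. A sketch that prints one CREATE DATABASE statement per configDB database; the owner name here is an assumption standing in for your configured username:

```shell
# Databases required on the configDB host (same list as the table above).
CONFIG_DBS="configdb alertsdb beffedb logsconfigdb slodb apmconfigdb rbacdb samldb kfusesamldb hydrationdb logs_metadata rumdb deletiondb"
DB_OWNER="kfuse"   # assumption: substitute your global.configDB.username

# Emit one CREATE DATABASE statement per database.
SQL=$(for db in $CONFIG_DBS; do
  printf 'CREATE DATABASE %s OWNER %s;\n' "$db" "$DB_OWNER"
done)
echo "$SQL"

# Execute with, for example:
#   echo "$SQL" | psql "host=<your-endpoint> user=${DB_OWNER} dbname=postgres"
```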

Verify

After applying the Helm changes:

  • The kfuse-configdb pod should not be running.

  • The ingester and az-service pods should be in a Running state.

  • Check pod logs for database connection errors:

kubectl logs -l app=ingester --tail=50 | grep -i "postgres\|database\|connection"
kubectl logs -l app=az-service --tail=50 | grep -i "postgres\|database\|connection"

Managed Redis

Kloudfuse uses Redis for caching and session state across multiple services (ingester, query-service, logs-query-service, logs-parser, kfuse-auth, and others).

The bundled Redis deployment uses Sentinel for high availability (clients connect on port 26379). When switching to a managed Redis service, clients connect directly to the managed endpoint on port 6379 — no Sentinel is involved.

AWS ElastiCache and GCP Memorystore do not expose public endpoints — they are accessible only within the same VPC or from networks connected via VPC peering, Transit Gateway, or PrivateLink. If Kloudfuse is deployed outside the cloud provider’s network (for example, on GKE connecting to AWS ElastiCache), you must establish network connectivity to the Redis VPC before using this configuration.

Requirements

  • Redis 6.0 or later (Redis 7.x recommended for AWS ElastiCache and GCP Memorystore)

  • Non-clustered (single-node or replication) mode recommended for simplicity

  • Network accessible from the Kubernetes cluster (see note above about VPC-private endpoints)

  • Size the instance based on your workload — a starting point is 2 GB of memory for small deployments

Credentials

If your managed Redis instance requires authentication, create a Kubernetes secret containing the Redis password:

kubectl create secret generic kfuse-redis-credentials \
  --from-literal=redis-password="YOUR_REDIS_PASSWORD"

Replace YOUR_REDIS_PASSWORD with the actual password for your managed Redis instance.
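Before wiring up the Helm values, a one-off probe can confirm the endpoint and password work from inside the cluster. A sketch with placeholder endpoint and password; add --tls if your instance has in-transit encryption enabled:

```shell
# Throwaway pod running redis-cli; PONG confirms reachability and authentication.
kubectl run redis-check --rm -it --restart=Never --image=redis:7 -- \
  redis-cli -h your-redis-endpoint.cache.amazonaws.com -p 6379 \
  -a "YOUR_REDIS_PASSWORD" ping
```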

Helm Configuration

Update your custom-values.yaml with the following parameters:

global:
  redis:
    enabled: false  (1)
    external:
      enabled: true  (2)
      host: "your-redis-endpoint.cache.amazonaws.com"  (3)
      port: 6379  (4)
1 Set to false to disable the bundled Redis StatefulSet (Sentinel).
2 Enable external Redis mode.
3 The DNS endpoint of your managed Redis instance.
4 Port of the external Redis (default: 6379).

When global.redis.external.enabled is true, all Kloudfuse services connect directly to the external endpoint instead of using Sentinel.

With Authentication

If your managed Redis requires a password, add the auth configuration:

global:
  redis:
    enabled: false
    external:
      enabled: true
      host: "your-redis-endpoint.cache.amazonaws.com"
      port: 6379
      auth:
        enabled: true  (1)
        existingSecret: "kfuse-redis-credentials"  (2)
1 Enable Redis authentication.
2 Name of the Kubernetes secret containing the Redis password (key: redis-password).

With TLS (In-Transit Encryption)

If your managed Redis has in-transit encryption enabled (required for AWS ElastiCache with TransitEncryptionEnabled: true), add the TLS configuration:

global:
  redis:
    enabled: false
    external:
      enabled: true
      host: "your-redis-endpoint.cache.amazonaws.com"
      port: 6379
      auth:
        enabled: true
        existingSecret: "kfuse-redis-credentials"
      tls:
        enabled: true  (1)
        skipVerify: false  (2)
1 Enable TLS for all Redis connections. All Kloudfuse services will connect using TLS.
2 Set to true to skip certificate verification (not recommended for production).

If ElastiCache has TransitEncryptionEnabled: true but tls.enabled is false in your values, services will hang on connection and eventually fail with context deadline exceeded — there is no explicit "TLS required" error message.

Verify

After applying the Helm changes:

  • Confirm the bundled kfuse-redis pods are not running:

    kubectl get pods | grep kfuse-redis

  • Verify that services can connect to the managed Redis endpoint by checking logs for connection errors:

    kubectl logs -l app=ingester --tail=50 | grep -i redis
    kubectl logs -l app=kfuse-auth --tail=50 | grep -i redis

Managed Kafka

Kloudfuse uses Kafka as the central streaming backbone. Approximately 13 services connect to Kafka for ingestion, transformation, querying, and storage.

The bundled Kafka deployment uses KRaft mode (no ZooKeeper) with plaintext connections on port 9092. Managed Kafka services typically require SASL authentication and TLS encryption.

Requirements

  • Apache Kafka 2.8 or later (KRaft-compatible)

  • Minimum 3 brokers for production

  • SASL/SCRAM-SHA-512 authentication (recommended for AWS MSK)

  • TLS encryption enabled

  • Auto-topic creation disabled (Kloudfuse manages its own topics)

  • Network accessible from the Kubernetes cluster

Cloud Providers

Provider | Service | Auth mechanism
AWS | Amazon MSK | SASL/SCRAM-SHA-512 (private: port 9096, public: port 9196)
Confluent | Confluent Cloud | SASL/PLAIN or SASL/OAUTHBEARER
Azure | Azure Event Hubs for Kafka | SASL/PLAIN

Credentials

Create a Kubernetes secret containing the SASL password:

kubectl create secret generic kfuse-kafka-sasl-credentials \
  --from-literal=kafka-sasl-password="YOUR_SASL_PASSWORD" \
  --from-literal=password="YOUR_SASL_PASSWORD"

Replace YOUR_SASL_PASSWORD with the actual password for your Kafka SASL user. Both keys are required — kafka-sasl-password is used by the init containers and password is used by the Pinot stream configs.

For AWS MSK with SCRAM-SHA-512, the password is the one associated with the secret stored in AWS Secrets Manager and linked to the MSK cluster via batch-associate-scram-secret.
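If the SCRAM secret has not yet been created and linked, the flow can be sketched with the AWS CLI. The names and ARNs below are placeholders; note that MSK requires the secret name to begin with AmazonMSK_ and the secret to be encrypted with a customer-managed KMS key:

```shell
# Create the SCRAM credential in AWS Secrets Manager (placeholders throughout).
aws secretsmanager create-secret --name AmazonMSK_kfuse \
  --kms-key-id <YOUR_KMS_KEY_ID> \
  --secret-string '{"username":"kfuse-admin","password":"YOUR_SASL_PASSWORD"}'

# Link the secret to the MSK cluster so SCRAM authentication accepts it.
aws kafka batch-associate-scram-secret \
  --cluster-arn <YOUR_MSK_CLUSTER_ARN> \
  --secret-arn-list <SECRET_ARN_FROM_CREATE_SECRET_OUTPUT>
```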

Helm Configuration

Update your custom-values.yaml with the following parameters:

global:
  kafka:
    deployLegacy: false  (1)
    deployKraft: false  (2)
    bootstrapServers: "broker-1:9096,broker-2:9096,broker-3:9096"  (3)
    sasl:
      enabled: true  (4)
      mechanism: "SCRAM-SHA-512"  (5)
      username: "your-kafka-username"  (6)
      existingSecret: "kfuse-kafka-sasl-credentials"  (7)
      passwordKey: "kafka-sasl-password"  (8)
    tls:
      enabled: true  (9)
      skipVerify: false  (10)
1 Disable the legacy Kafka subchart.
2 Disable the KRaft Kafka subchart.
3 Comma-separated list of broker endpoints with ports. Setting bootstrapServers enables external/managed Kafka mode.
4 Enable SASL authentication.
5 The SASL mechanism. Use SCRAM-SHA-512 for AWS MSK, PLAIN for Confluent Cloud.
6 The SASL username.
7 Name of the Kubernetes secret containing the SASL password (created above).
8 Key within the secret that holds the password.
9 Enable TLS encryption (required for SASL/SCRAM on AWS MSK).
10 Set to true only for testing — not recommended for production.

The Helm chart validates that deployLegacy and deployKraft are both false when bootstrapServers is set. Deploying bundled Kafka alongside an external broker is not supported.

AWS MSK Example

For an AWS MSK cluster with 3 brokers in us-west-2:

global:
  kafka:
    deployLegacy: false
    deployKraft: false
    bootstrapServers: "b-1.mycluster.abc123.c14.kafka.us-west-2.amazonaws.com:9096,b-2.mycluster.abc123.c14.kafka.us-west-2.amazonaws.com:9096,b-3.mycluster.abc123.c14.kafka.us-west-2.amazonaws.com:9096"
    sasl:
      enabled: true
      mechanism: "SCRAM-SHA-512"
      username: "kfuse-admin"
      existingSecret: "kfuse-kafka-sasl-credentials"
      passwordKey: "kafka-sasl-password"
    tls:
      enabled: true

To find your MSK SASL bootstrap brokers:

# Private VPC endpoint (port 9096) — use when Kloudfuse is in the same VPC or connected via peering:
aws kafka get-bootstrap-brokers --cluster-arn <YOUR_MSK_CLUSTER_ARN> \
  --query 'BootstrapBrokerStringSaslScram' --output text

# Public endpoint (port 9196) — only if public access is enabled on the MSK cluster:
aws kafka get-bootstrap-brokers --cluster-arn <YOUR_MSK_CLUSTER_ARN> \
  --query 'BootstrapBrokerStringPublicSaslScram' --output text

AWS MSK exposes two SASL/SCRAM bootstrap endpoints. Use port 9096 for private VPC connectivity (recommended — Kloudfuse and MSK in the same VPC or connected via VPC peering/Transit Gateway). Use port 9196 for the public endpoint (requires the MSK cluster to have public access enabled with allow.everyone.if.no.acl.found=false).

Topic Creation

Kloudfuse requires a set of Kafka topics for each data stream (metrics, logs, traces, events, etc.). The required topics are defined in global.kafkaTopics in the Helm chart values.

When using managed Kafka, you have two options:

  1. Pre-create topics manually — Use kafka-topics.sh or your cloud provider’s console/CLI to create the required topics before deploying Kloudfuse. This is the recommended approach when your managed Kafka restricts topic creation via ACLs.

    First, create a client.properties file with your SASL/TLS credentials:

    security.protocol=SASL_SSL
    sasl.mechanism=SCRAM-SHA-512
    sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
      username="your-kafka-username" \
      password="your-kafka-password";

    Then create topics:

    kafka-topics.sh --bootstrap-server <YOUR_BOOTSTRAP_SERVERS> \
      --command-config /tmp/client.properties \
      --create --topic kf_metrics_topic --partitions 1 --replication-factor 3

  2. Automatic topic creation — The Helm chart includes a topic creator Job that attempts to create all required topics at install/upgrade time. Ensure the SASL user has the CREATE ACL permission on the cluster for this to succeed.

    Automatic topic creation is not available in FIPS mode (global.fips.enabled: true). The topic creator Job is disabled. You must pre-create all topics manually using Option 1 above.

Verify that all required topics exist after deployment. Missing topics will cause ingestion and transformation services to fail.
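Option 1 can be scripted as a loop over the topic list. A sketch that prints one creation command per topic for review before running; the topic names and partition counts are illustrative, so substitute the actual values from global.kafkaTopics in your chart:

```shell
# Illustrative topic names; replace with the list from global.kafkaTopics.
TOPICS="kf_metrics_topic kf_logs_topic kf_traces_topic kf_events_topic"

# Build one kafka-topics.sh invocation per topic.
CMDS=$(for topic in $TOPICS; do
  printf 'kafka-topics.sh --bootstrap-server <YOUR_BOOTSTRAP_SERVERS> --command-config /tmp/client.properties --create --topic %s --partitions 1 --replication-factor 3\n' "$topic"
done)
echo "$CMDS"

# After reviewing, execute with: echo "$CMDS" | sh
```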

Verify

After applying the Helm changes:

  • The kafka-kraft and kafka-broker pods should not be running.

  • If not in FIPS mode, the topic creator job should complete successfully:

    kubectl get jobs | grep kafka-external-topic
    kubectl logs job/kfuse-kafka-external-topic-creation

    In FIPS mode (global.fips.enabled: true), the topic creator Job does not exist. Verify topics were created by running kafka-topics.sh --list from a temporary pod (see Topic Creation).
  • Verify services can connect to the external Kafka:

kubectl logs -l app=ingester --tail=50 | grep -i "kafka\|sasl\|bootstrap"
kubectl logs -l app=logs-transformer --tail=50 | grep -i "kafka\|sasl\|bootstrap"

  • Check that topics were created (using the client.properties file from the Topic Creation section):

kafka-topics.sh --bootstrap-server <YOUR_BOOTSTRAP_SERVERS> \
  --command-config /tmp/client.properties --list

Combined Configuration Example

To use managed PostgreSQL, Redis, and Kafka together (for example, in a multi-zone deployment):

installKfusePgCredentials: false

global:
  # --- Managed PostgreSQL ---
  configDB:
    host: "your-rds-endpoint.rds.amazonaws.com"
    username: "<your-db-username>"
  orchestratorDB:
    host: "your-rds-endpoint.rds.amazonaws.com"
    username: "<your-db-username>"
  # --- Managed Redis ---
  redis:
    enabled: false
    external:
      enabled: true
      host: "your-redis.cache.amazonaws.com"
      port: 6379
      auth:
        enabled: true
        existingSecret: "kfuse-redis-credentials"
  # --- Managed Kafka ---
  kafka:
    deployLegacy: false
    deployKraft: false
    bootstrapServers: "<YOUR_MSK_BOOTSTRAP_SERVERS>"  (1)
    sasl:
      enabled: true
      mechanism: "SCRAM-SHA-512"
      username: "kfuse-admin"
      existingSecret: "kfuse-kafka-sasl-credentials"
      passwordKey: "kafka-sasl-password"
    tls:
      enabled: true

kfuse-configdb:
  enabled: false

ingester:
  postgresql:
    enabled: false
1 Replace with your actual MSK bootstrap brokers (e.g., b-1.mycluster.abc123.c14.kafka.us-west-2.amazonaws.com:9096,…).

All managed service settings must be under a single global: block. YAML does not merge duplicate keys — if you have multiple global: blocks in the same file, the last one overwrites the earlier ones.

Troubleshooting

Redis

Symptom: Services fail with "connection refused" on port 26379
Solution: Verify global.redis.enabled: false and global.redis.external.enabled: true are set. When external mode is enabled, services connect directly on port 6379 instead of Sentinel port 26379.

Symptom: Services fail with "connection refused" on port 6379
Solution: Check that the managed Redis endpoint is reachable from the Kubernetes cluster (security groups, VPC peering, firewall rules).

Symptom: "Connection timed out" on port 6379 (no "connection refused")
Solution: AWS ElastiCache and GCP Memorystore have no public endpoints. Verify VPC peering, Transit Gateway, or PrivateLink is configured between your Kubernetes cluster network and the managed Redis VPC.

Symptom: "NOAUTH Authentication required"
Solution: Your managed Redis requires a password. Set global.redis.external.auth.enabled: true and global.redis.external.auth.existingSecret to the name of the secret containing the Redis password (key: redis-password). See Credentials.

Kafka

Symptom: "SASL authentication failed"
Solution: Verify the username and password match the credentials configured on your managed Kafka cluster. For AWS MSK, ensure the secret is associated with the cluster via batch-associate-scram-secret.

Symptom: "TLS handshake failed" or "x509: certificate signed by unknown authority"
Solution: Ensure TLS is enabled (tls.enabled: true). If using a private CA, pods need the CA certificate mounted. Most managed services (AWS MSK, Confluent) use publicly trusted CAs.

Symptom: Topic creator job fails with "authorization failed"
Solution: The SASL user needs the CREATE ACL permission on the Kafka cluster to create topics.

Symptom: "Unable to determine Kafka cluster ID"
Solution: Verify the bootstrap servers are correct and reachable. For AWS MSK SASL/SCRAM, use port 9096 (private VPC) or 9196 (public endpoint), not 9092.

Symptom: Helm validation error: "deployKraft must be false when global.kafka.bootstrapServers is set"
Solution: Set both global.kafka.deployLegacy: false and global.kafka.deployKraft: false when using external Kafka.

Symptom: Services connect but produce/consume fail
Solution: Check that the required Kloudfuse topics exist. Review the topic creator job logs: kubectl logs job/kfuse-kafka-external-topic-creation.

General

If services fail to start after switching to managed services, check pod logs for connection errors:

kubectl get pods --field-selector=status.phase!=Running
kubectl describe pod <POD_NAME>
kubectl logs <POD_NAME> --tail=100

Ensure that network connectivity (VPC peering, security groups, firewall rules) allows the Kubernetes cluster to reach the managed service endpoints.
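The reachability checks above can be run from a throwaway pod so they exercise the same network path the Kloudfuse services use. A sketch with placeholder endpoints:

```shell
# busybox nc: -z performs a zero-I/O TCP probe, -v prints the result.
kubectl run net-check --rm -it --restart=Never --image=busybox -- sh -c '
  nc -zv your-rds-endpoint.rds.amazonaws.com 5432
  nc -zv your-redis-endpoint.cache.amazonaws.com 6379
  nc -zv b-1.mycluster.abc123.c14.kafka.us-west-2.amazonaws.com 9096
'
```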