Kfuse FIPS Installation Guide

This guide covers the end-to-end steps for deploying Kloudfuse on EKS in a FIPS environment using AWS managed services (RDS PostgreSQL, ElastiCache Redis, MSK Kafka).

Prerequisites

Resource Requirements

EKS Cluster

Kubernetes 1.26+. Nodes must run FIPS-enabled AMIs (images whose cryptographic modules are FIPS 140 validated).

RDS PostgreSQL

PostgreSQL 18. Same VPC as EKS.

ElastiCache Redis

Redis 7.x. Non-clustered mode. In-transit encryption + auth token. Same VPC as EKS.

Amazon MSK

Kafka 3.7.x. kafka.m5.large or larger. SASL/SCRAM + TLS. Same VPC as EKS.

S3 Bucket

For Pinot deep store. IAM credentials or IRSA.

Environment Variables

NAMESPACE

Kubernetes namespace for the Kloudfuse deployment.

PG_PASSWORD

Password for the PostgreSQL application user. You create this when setting up the RDS user.

REDIS_PASSWORD

Auth token for ElastiCache Redis. You define this when creating the ElastiCache cluster.

KAFKA_PASSWORD

SASL/SCRAM password for MSK. You define this when creating the MSK SCRAM secret in AWS Secrets Manager.

S3 accessKey / secretKey

AWS IAM access keys for S3 deep store access. Generated via aws iam create-access-key. Skip if using IRSA.

export NAMESPACE=<your-namespace>
export PG_PASSWORD=<your-password>
export REDIS_PASSWORD=<your-password>
export KAFKA_PASSWORD=<your-password>

Helm Registry Login

cat token.json | helm registry login -u _json_key --password-stdin us-east1-docker.pkg.dev

Namespace and Image Pull Secret

kubectl create namespace "$NAMESPACE"

kubectl create secret docker-registry kfuse-image-pull-credentials \
  --namespace="$NAMESPACE" \
  --docker-server='us.gcr.io' \
  --docker-username=_json_key \
  --docker-email='container-registry@mvp-demo-301906.iam.gserviceaccount.com' \
  --docker-password="$(cat token.json)"

Provision RDS PostgreSQL

Create Kubernetes Secret

PG_PASSWORD_ENCODED=$(python3 -c "import urllib.parse; print(urllib.parse.quote('$PG_PASSWORD', safe=''))")
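The encoded value matters when the password is embedded in a PostgreSQL connection URL, where characters such as `@` and `/` are reserved. For example (a hypothetical password, shown only to illustrate the encoding):

```shell
# '@' and '/' are reserved in connection URLs and must be percent-encoded
python3 -c "import urllib.parse; print(urllib.parse.quote('p@ss/w0rd', safe=''))"
# → p%40ss%2Fw0rd
```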

kubectl create secret generic kfuse-pg-credentials \
  --namespace="$NAMESPACE" \
  --from-literal=postgres-password="$PG_PASSWORD" \
  --from-literal=postgresql-password="$PG_PASSWORD" \
  --from-literal=postgresql-password-encoded="$PG_PASSWORD_ENCODED" \
  --from-literal=postgresql-replication-password="$PG_PASSWORD"

curl -o rds-ca-bundle.pem https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem
kubectl create secret generic pg-tls-ca-cert \
  --namespace="$NAMESPACE" \
  --from-file=ca.crt=rds-ca-bundle.pem

Create RDS Instance

aws rds create-db-instance \
  --db-instance-identifier "<cluster-name>-pg" \
  --db-instance-class db.m5.large \
  --engine postgres \
  --engine-version "18.3" \
  --master-username postgres \
  --master-user-password "<rds-master-password>" \
  --allocated-storage 100 \
  --storage-type gp3 \
  --no-publicly-accessible \
  --storage-encrypted
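Provisioning takes several minutes. One way to block until the instance is ready and then capture its endpoint (standard AWS CLI subcommands, offered as a convenience rather than a required step):

```shell
# Wait for the instance to reach the "available" state, then print its endpoint
aws rds wait db-instance-available --db-instance-identifier "<cluster-name>-pg"
aws rds describe-db-instances --db-instance-identifier "<cluster-name>-pg" \
  --query 'DBInstances[0].Endpoint.Address' --output text
```

Use the printed address as <rds-endpoint> in the steps below.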

Create Application Database User

Connect to RDS using the master user and create the application user:

psql "host=<rds-endpoint> user=postgres dbname=postgres sslmode=require"

At the psql prompt, run:

CREATE USER <your-app-username> WITH PASSWORD '<app-user-password>' CREATEDB;
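To confirm the new user can authenticate over TLS before proceeding (a quick sanity check, assuming psql is available locally):

```shell
# A successful "SELECT 1" confirms credentials and TLS connectivity
psql "host=<rds-endpoint> user=<your-app-username> dbname=postgres sslmode=require" -c 'SELECT 1;'
```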

Provision ElastiCache Redis

Create Kubernetes Secret

kubectl create secret generic kfuse-redis-credentials \
  --namespace="$NAMESPACE" \
  --from-literal=redis-password="$REDIS_PASSWORD"

Create ElastiCache Cluster

aws elasticache create-replication-group \
  --replication-group-id "<cluster-name>-redis" \
  --replication-group-description "Kloudfuse Redis" \
  --engine redis \
  --engine-version "7.1" \
  --cache-node-type cache.m5.large \
  --num-cache-clusters 1 \
  --transit-encryption-enabled \
  --auth-token "$REDIS_PASSWORD" \
  --at-rest-encryption-enabled
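As with RDS, you can poll until the replication group is available and then read the primary endpoint (optional convenience commands):

```shell
# Wait for the replication group, then print the primary endpoint address
aws elasticache wait replication-group-available --replication-group-id "<cluster-name>-redis"
aws elasticache describe-replication-groups --replication-group-id "<cluster-name>-redis" \
  --query 'ReplicationGroups[0].NodeGroups[0].PrimaryEndpoint.Address' --output text
```

Use the printed address as <elasticache-endpoint> in values.yaml.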

Provision MSK Cluster

Create Kubernetes Secret

kubectl create secret generic kfuse-kafka-sasl-credentials \
  --namespace="$NAMESPACE" \
  --from-literal=kafka-sasl-password="$KAFKA_PASSWORD" \
  --from-literal=password="$KAFKA_PASSWORD"

Create MSK Configuration

cat > /tmp/msk-config.properties << 'EOF'
auto.create.topics.enable=false
delete.topic.enable=true
log.retention.bytes=10737418240
message.max.bytes=20971520
num.io.threads=64
num.network.threads=64
num.partitions=1
num.recovery.threads.per.data.dir=8
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=2
offsets.topic.replication.factor=2
allow.everyone.if.no.acl.found=true
EOF

aws kafka create-configuration \
  --name "<cluster-name>-msk-config" \
  --kafka-versions "3.7.x" \
  --server-properties fileb:///tmp/msk-config.properties
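create-configuration returns the configuration ARN, which you will need when creating or updating the MSK cluster. If you did not capture it from the command output, you can look it up afterward (standard AWS CLI query):

```shell
# Look up the ARN of the configuration created above
CONFIG_ARN=$(aws kafka list-configurations \
  --query "Configurations[?Name=='<cluster-name>-msk-config'].Arn" --output text)
```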

Create SCRAM Secret and Associate

The secret name must start with AmazonMSK_.

aws secretsmanager create-secret \
  --name "AmazonMSK_<cluster-name>-scram" \
  --secret-string "{\"username\":\"kfuse-admin\",\"password\":\"$KAFKA_PASSWORD\"}"

CLUSTER_ARN=$(aws kafka list-clusters-v2 \
  --query "ClusterInfoList[?ClusterName=='<cluster-name>'].ClusterArn" --output text)
SECRET_ARN=$(aws secretsmanager describe-secret \
  --secret-id "AmazonMSK_<cluster-name>-scram" --query 'ARN' --output text)

aws kafka batch-associate-scram-secret \
  --cluster-arn "$CLUSTER_ARN" \
  --secret-arn-list "$SECRET_ARN"
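You can confirm the association took effect (optional check; the secret ARN should appear in the output):

```shell
aws kafka list-scram-secrets --cluster-arn "$CLUSTER_ARN" \
  --query 'SecretArnList' --output text
```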

Get Bootstrap Servers

aws kafka get-bootstrap-brokers --cluster-arn "$CLUSTER_ARN" \
  --query 'BootstrapBrokerStringSaslScram' --output text

Use port 9096 (SASL/SCRAM). Do not use port 9092 (plaintext) or 9098 (IAM).
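To smoke-test connectivity with the standard Kafka CLI tools, a client config along these lines can be used (a sketch; tool paths and availability depend on your Kafka distribution):

```shell
# MSK SASL/SCRAM uses SCRAM-SHA-512 over TLS
cat > /tmp/client.properties << EOF
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="kfuse-admin" password="$KAFKA_PASSWORD";
EOF

# Listing topics verifies broker reachability and SCRAM auth
kafka-topics.sh --bootstrap-server "<broker-1>:9096" \
  --command-config /tmp/client.properties --list
```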

Configure S3 Deep Store

Skip this section if using IRSA for S3 access.

Create an IAM user with S3 read/write permissions for the Pinot deep store bucket, then generate access keys:

aws iam create-user --user-name kfuse-s3-user

aws iam put-user-policy --user-name kfuse-s3-user \
  --policy-name kfuse-s3-access \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["s3:GetObject","s3:PutObject","s3:DeleteObject","s3:ListBucket"],
      "Resource": ["arn:aws:s3:::<s3-bucket>","arn:aws:s3:::<s3-bucket>/*"]
    }]
  }'

aws iam create-access-key --user-name kfuse-s3-user

Use the AccessKeyId and SecretAccessKey from the output to create the Kubernetes secret:

kubectl create secret generic kfuse-az-service-s3 \
  --namespace="$NAMESPACE" \
  --from-literal=accessKey="<AccessKeyId>" \
  --from-literal=secretKey="<SecretAccessKey>"
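To verify the new keys can reach the bucket before installing (optional; exercises the freshly issued credentials directly):

```shell
# A successful listing confirms the IAM user can access the deep store bucket
AWS_ACCESS_KEY_ID="<AccessKeyId>" AWS_SECRET_ACCESS_KEY="<SecretAccessKey>" \
  aws s3 ls "s3://<s3-bucket>/"
```

Note that newly created access keys can take a few seconds to become usable.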

Install Kloudfuse

custom-values.yaml

global:
  orgId: "<your-org-id>"
  cloudProvider: "aws"

  configDB:
    host: "<rds-endpoint>"
    username: "<your-app-username>"
  orchestratorDB:
    host: "<rds-endpoint>"
    username: "<your-app-username>"

  redis:
    external:
      host: "<elasticache-endpoint>"

  kafka:
    bootstrapServers: "<broker-1>:9096,<broker-2>:9096,<broker-3>:9096"

  cloudStorage:
    type: s3
    useSecret: true
    secretName: kfuse-az-service-s3
    s3:
      region: "<aws-region>"
      bucket: "<s3-bucket>"

  observability:
    endpoint: "https://<observability-endpoint>"

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: <node-label-key>
            operator: In
            values:
            - <node-label-value>
  tolerations:
  - key: "<taint-key>"
    operator: "Equal"
    value: "<taint-value>"
    effect: "NoSchedule"

# Per-service PostgreSQL username overrides
az-service:
  config:
    configdb:
      pgUser: "<your-app-username>"
    orchestratordb:
      pgUser: "<your-app-username>"

beffe:
  config:
    PG_USER: "<your-app-username>"

config-mgmt-service:
  config:
    configdb:
      user: "<your-app-username>"

ingester:
  config:
    rum:
      postgresdb:
        user: "<your-app-username>"
  postgresql:
    auth:
      username: "<your-app-username>"

pinot:
  deepStore:
    enabled: true
    prefix: "<namespace>/kfuse/controller/data"
  pgConfig:
    user: "<your-app-username>"

query-service:
  config:
    PG_USER: "<your-app-username>"
  advancefunctions:
    config:
      PG_USER: "<your-app-username>"
  rulemanager:
    config:
      PG_USER: "<your-app-username>"

trace-query-service:
  config:
    PG_USER: "<your-app-username>"

rum-query-service:
  config:
    defaultBackendVersion: v1

logs-query-service:
  config:
    MetadataDb:
      pgUser: "<your-app-username>"

logs-transformer:
  config:
    LogConfigDb:
      pgUser: "<your-app-username>"

trace-transformer:
  config:
    SpanConfigDb:
      pgUser: "<your-app-username>"

user-mgmt-service:
  config:
    PG_USER: "<your-app-username>"

zapper:
  config:
    pgConfig:
      user: "<your-app-username>"

grafana:
  grafana.ini:
    database:
      user: "<your-app-username>"

Deploy

helm upgrade --install kfuse \
  oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse-fed \
  --namespace="$NAMESPACE" \
  --version <chart-version> \
  -f custom-values.yaml \
  --timeout 20m
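Once the release is installed, verify that the pods come up cleanly (standard helm/kubectl checks):

```shell
# Confirm the release deployed, then watch pods until all are Running/Ready
helm status kfuse --namespace "$NAMESPACE"
kubectl get pods --namespace "$NAMESPACE" --watch
```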