Workload Isolation with Node Pool Separation
Deploy Kloudfuse services onto dedicated node pools to isolate ingestion (write path), query (read path), and control plane workloads. Node pool separation provides resource isolation, improved stability, and independent scaling for each workload type.
Benefits
- Resource isolation — Ingestion spikes do not impact query performance, and vice versa.
- Independent scaling — Scale ingestion, query, and control plane node pools separately based on workload.
- Improved stability — Noisy-neighbor effects are eliminated between write and read paths.
- Cost optimization — Right-size node pools for each workload type.
Overview
Kloudfuse services are grouped into three roles:
| Role | Description |
|---|---|
| ingestion | Write path — data collection, parsing, and transformation |
| query | Read path — serving queries and the user interface |
| control | Control plane — cluster coordination and metadata |
Each service has a default role assignment configured via its kfRole field in values.yaml.
DaemonSet workloads (such as the kfuse-observability-agent) run on all node pools and automatically receive merged tolerations and affinity rules.
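To make "merged tolerations" concrete, the fragment below is an illustrative sketch (not the chart's exact rendered output) of what a DaemonSet pod carries so it can schedule onto all three tainted pools, assuming the default kf_role key and values:

```yaml
# Illustrative sketch: tolerations covering all three node pool
# taints, letting a DaemonSet pod land on every pool.
tolerations:
  - key: kf_role
    operator: Equal
    value: ingestion
    effect: NoSchedule
  - key: kf_role
    operator: Equal
    value: query
    effect: NoSchedule
  - key: kf_role
    operator: Equal
    value: control
    effect: NoSchedule
```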
Prerequisites
- A Kubernetes cluster with three separate node pools — one for each role (ingestion, query, control).
- Each node pool must have nodes labeled and tainted with the configured key and value.
- Each role must have at least one node (numNodes must be greater than 0 for all three roles).
Step 1: Configure Node Pools
Create three node pools in your cloud provider. Each node pool must have a Kubernetes label and taint applied.
The default configuration uses the label key kf_role with values ingestion, query, and control.
GCP
For each node pool:
- Go to Kubernetes Engine > Clusters > your cluster > Node Pools.
- Create or edit a node pool.
- In the Metadata section, add a Kubernetes label:
  - Key: kf_role
  - Value: ingestion (or query or control)
- In the Taints section, add a taint:
  - Key: kf_role
  - Value: ingestion (or query or control)
  - Effect: NoSchedule
AWS
For each node group:
- Go to EKS > Clusters > your cluster > Node groups.
- Create or edit a node group.
- In Step 1: Configure node group, scroll to the Kubernetes labels section:
  - Key: kf_role
  - Value: ingestion (or query or control)
- In the Kubernetes taints section, add a taint:
  - Key: kf_role
  - Value: ingestion (or query or control)
  - Effect: NoSchedule
Azure
For each node pool:
- Go to AKS > your cluster > Node pools.
- Create or edit a node pool.
- Under Optional settings, in the Labels section:
  - Key: kf_role
  - Value: ingestion (or query or control)
- In the Taints section, add a taint:
  - Key: kf_role
  - Value: ingestion (or query or control)
  - Effect: NoSchedule
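Whichever provider you use, the end state on each node is the same label and taint pair. A sketch of the relevant fragment of a node spec (as shown by kubectl get node <name> -o yaml) for an ingestion node; query and control nodes differ only in the value:

```yaml
# Fragment of a correctly configured ingestion node,
# regardless of cloud provider.
metadata:
  labels:
    kf_role: ingestion
spec:
  taints:
    - key: kf_role
      value: ingestion
      effect: NoSchedule
```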
Step 2: Configure Helm Values
In your custom_values.yaml, enable node pool separation and configure the number of nodes in each pool:
global:
  kfRoles:
    enabled: true
    ingestion:
      numNodes: 3 (1)
    query:
      numNodes: 3
    control:
      numNodes: 3
| 1 | numNodes — The number of nodes in this pool. Used for replica count calculations. Must be greater than 0. |
By default, the node label and taint key is kf_role, and the values are ingestion, query, and control. These defaults can be overridden for advanced use cases; see Customizing nodeKey and nodeValue.
When kfRoles.enabled is true:
- Each service is automatically assigned to its designated node pool using node affinity rules.
- Tolerations are automatically generated so pods can schedule onto tainted nodes.
- global.nodeSelector is ignored (node affinity is used instead).
- numNodes per pool replaces global.numNodes for replica count calculations.
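As an illustrative sketch (not the chart's exact rendered output), the generated placement rules for a pod assigned to the query role look roughly like this, using the default kf_role key:

```yaml
# Illustrative sketch: the kind of node affinity and toleration a
# query-role pod receives when kfRoles.enabled is true.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kf_role
              operator: In
              values: ["query"]
tolerations:
  - key: kf_role
    operator: Equal
    value: query
    effect: NoSchedule
```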
Step 3: Install or Upgrade
Run the standard Helm install or upgrade command with your custom values:
helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
  -n kfuse \
  --version <VERSION> \ (1)
  -f custom_values.yaml
| 1 | version — Valid Kloudfuse release value; use the most recent one. |
Verify Node Placement
After deployment, verify that pods are scheduled onto the correct node pools:
kubectl get pods -o wide -n kfuse
Check that:
- Ingestion services (kafka, ingester, logs-transformer, etc.) run on nodes labeled kf_role=ingestion.
- Query services (query-service, beffe, UI, etc.) run on nodes labeled kf_role=query.
- Control plane services (Kafka KRaft controller, ZooKeeper) run on nodes labeled kf_role=control.
- DaemonSets run on nodes across all pools.
You can list nodes by role:
kubectl get nodes -l kf_role=ingestion
kubectl get nodes -l kf_role=query
kubectl get nodes -l kf_role=control
Advanced Configuration
Customizing nodeKey and nodeValue
The nodeKey and nodeValue fields default to kf_role and ingestion/query/control respectively, but can be overridden for advanced use cases.
For example, if you run multiple Kloudfuse instances on the same Kubernetes cluster across different availability zones, override the nodeValue to isolate each instance:
global:
  kfRoles:
    enabled: true
    ingestion:
      nodeValue: "ingestion-az1"
      numNodes: 3
    query:
      nodeValue: "query-az1"
      numNodes: 3
    control:
      nodeValue: "control-az1"
      numNodes: 3
In this case, nodes must be labeled and tainted with the corresponding values (e.g., kf_role=ingestion-az1).
You can also change the nodeKey if your cluster uses a different label key:
global:
  kfRoles:
    enabled: true
    nodeKey: "workload-type"
    ingestion:
      nodeValue: "write"
      numNodes: 3
    query:
      nodeValue: "read"
      numNodes: 3
    control:
      nodeValue: "control"
      numNodes: 3
Nodes must then be labeled and tainted with the new key and values (e.g., workload-type=write).
Overriding a Service’s Role
Each service has a default role assignment. You can override the role for any individual service in your custom_values.yaml:
redis:
  kfRole: query (1)
| 1 | Moves Redis from its default ingestion pool to the query pool. |
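For context, a per-service override sits alongside the global block in the same custom_values.yaml. A minimal sketch, combining the global configuration shown earlier with the redis override:

```yaml
# Sketch of custom_values.yaml: global kfRoles block plus a
# per-service role override moving redis to the query pool.
global:
  kfRoles:
    enabled: true
    ingestion:
      numNodes: 3
    query:
      numNodes: 3
    control:
      numNodes: 3

redis:
  kfRole: query
```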
Per-Service nodeSelector, Affinity, and Tolerations
If a service has local nodeSelector, affinity, or tolerations configured, those take priority over the auto-generated values from kfRoles. This allows fine-grained placement control for specific services:
query-service:
  nodeSelector:
    custom-label: "special-node"
  tolerations:
    - key: "custom-label"
      operator: "Equal"
      value: "special-node"
      effect: "NoSchedule"
Combining Roles on a Shared Node Pool
If you only need to isolate one workload type, you can combine the remaining roles onto a single shared node pool by pointing their nodeValue to the same label. For example, to separate only query services while running ingestion and control plane services together:
- Create two node pools instead of three:
  - A shared pool for ingestion and control, labeled and tainted with kf_role=ingestion-control.
  - A dedicated pool for query, labeled and tainted with kf_role=query.
- Configure kfRoles so that both ingestion and control use the same nodeValue:

global:
  kfRoles:
    enabled: true
    ingestion:
      nodeValue: "ingestion-control" (1)
      numNodes: 4
    query:
      numNodes: 3
    control:
      nodeValue: "ingestion-control" (1)
      numNodes: 4 (2)
| 1 | Both ingestion and control point to the same nodeValue, so their pods schedule onto the same node pool. |
| 2 | Set numNodes to the same value for roles that share a node pool, reflecting the actual number of nodes in that shared pool. |
This pattern works for any combination. For example, to isolate only ingestion and combine query with control:
global:
  kfRoles:
    enabled: true
    ingestion:
      numNodes: 4
    query:
      nodeValue: "query-control"
      numNodes: 3
    control:
      nodeValue: "query-control"
      numNodes: 3
Combining with Stream Isolation
Node pool separation works alongside Pinot Stream Isolation. When both features are enabled, Pinot pods receive both kfRole-based affinity (for node pool placement) and kf_stream-based affinity (for stream-specific node placement).
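As an illustrative sketch only (the chart's exact rendered output may differ), a Pinot pod with both features enabled would carry affinity terms matching both labels; the kf_stream value "logs" below is a hypothetical example:

```yaml
# Illustrative sketch: a Pinot pod constrained by both the role
# label (node pool placement) and a stream label (stream-specific
# placement) when both features are enabled.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kf_role
              operator: In
              values: ["ingestion"]
            - key: kf_stream
              operator: In
              values: ["logs"]
```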