Workload Isolation with Node Pool Separation
Deploy Kloudfuse services onto dedicated node pools to isolate ingestion (write path), query (read path), and control plane workloads. Node pool separation provides resource isolation, improved stability, and independent scaling for each workload type.
Benefits
- Resource isolation — Ingestion spikes do not impact query performance, and vice versa.
- Independent scaling — Scale ingestion, query, and control plane node pools separately based on workload.
- Improved stability — Noisy-neighbor effects are eliminated between write and read paths.
- Cost optimization — Right-size node pools for each workload type.
Overview
Kloudfuse services are grouped into three roles:
| Role | Description |
|---|---|
| ingestion | Write path — data collection, parsing, and transformation |
| query | Read path — serving queries and the user interface |
| control | Control plane — cluster coordination and metadata |
Each service has a default role assignment configured via its kfRole field in values.yaml.
DaemonSet workloads (such as the kfuse-observability-agent) run on all node pools and automatically receive merged tolerations and affinity rules.
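To make "merged tolerations" concrete, the fragment below is an illustrative sketch (not the chart's exact rendered output) of what a DaemonSet pod carries so it can schedule onto all three tainted pools, assuming the default kf_role key and values:

```yaml
# Illustrative sketch: tolerations covering all three node pool
# taints, letting a DaemonSet pod land on every pool.
tolerations:
  - key: kf_role
    operator: Equal
    value: ingestion
    effect: NoSchedule
  - key: kf_role
    operator: Equal
    value: query
    effect: NoSchedule
  - key: kf_role
    operator: Equal
    value: control
    effect: NoSchedule
```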
Prerequisites
- A Kubernetes cluster with three separate node pools — one for each role (ingestion, query, control).
- Each node pool must have nodes labeled and tainted with the configured key and value.
- Each role must have at least one node (numNodes must be greater than 0 for all three roles).
Step 1: Configure Node Pools
Create three node pools in your cloud provider. Each node pool must have a Kubernetes label and taint applied.
The default configuration uses the label key kf_role with values ingestion, query, and control.
GCP
For each node pool:
- Go to Kubernetes Engine > Clusters > your cluster > Node Pools.
- Create or edit a node pool.
- In the Metadata section, add a Kubernetes label:
  - Key: kf_role
  - Value: ingestion (or query or control)
- In the Taints section, add a taint:
  - Key: kf_role
  - Value: ingestion (or query or control)
  - Effect: NoSchedule
AWS
For each node group:
- Go to EKS > Clusters > your cluster > Node groups.
- Create or edit a node group.
- In Step 1: Configure node group, scroll to the Kubernetes labels section:
  - Key: kf_role
  - Value: ingestion (or query or control)
- In the Kubernetes taints section, add a taint:
  - Key: kf_role
  - Value: ingestion (or query or control)
  - Effect: NoSchedule
Azure
For each node pool:
- Go to AKS > your cluster > Node pools.
- Create or edit a node pool.
- Under Optional settings, in the Labels section:
  - Key: kf_role
  - Value: ingestion (or query or control)
- In the Taints section, add a taint:
  - Key: kf_role
  - Value: ingestion (or query or control)
  - Effect: NoSchedule
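Whichever provider you use, the end state on each node is the same label and taint pair. A sketch of the relevant fragment of a node spec (as shown by kubectl get node <name> -o yaml) for an ingestion node; query and control nodes differ only in the value:

```yaml
# Fragment of a correctly configured ingestion node,
# regardless of cloud provider.
metadata:
  labels:
    kf_role: ingestion
spec:
  taints:
    - key: kf_role
      value: ingestion
      effect: NoSchedule
```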
Step 2: Configure Helm Values
In your custom_values.yaml, enable node pool separation and configure the number of nodes in each pool:
global:
  kfRoles:
    enabled: true
    ingestion:
      numNodes: 3 (1)
    query:
      numNodes: 3
    control:
      numNodes: 3
| 1 | numNodes — The number of nodes in this pool. Used for replica count calculations. Must be greater than 0. |
By default, the node label and taint key is kf_role, and the values are ingestion, query, and control. These defaults can be overridden for advanced use cases; see Customizing nodeKey and nodeValue.
When kfRoles.enabled is true:
- Each service is automatically assigned to its designated node pool using node affinity rules.
- Tolerations are automatically generated so pods can schedule onto tainted nodes.
- global.nodeSelector is ignored (node affinity is used instead).
- numNodes per pool replaces global.numNodes for replica count calculations.
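As an illustrative sketch (not the chart's exact rendered output), the generated placement rules for a pod assigned to the query role look roughly like this, using the default kf_role key:

```yaml
# Illustrative sketch: the kind of node affinity and toleration a
# query-role pod receives when kfRoles.enabled is true.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kf_role
              operator: In
              values: ["query"]
tolerations:
  - key: kf_role
    operator: Equal
    value: query
    effect: NoSchedule
```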
Step 3: Install or Upgrade
Run the standard Helm install or upgrade command with your custom values:
helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
  -n kfuse \
  --version <VERSION> \ (1)
  -f custom_values.yaml
| 1 | version — Valid Kloudfuse release value; use the most recent one. |
Verify Node Placement
After deployment, verify that pods are scheduled onto the correct node pools:
kubectl get pods -o wide -n kfuse
Check that:
- Ingestion services (kafka, ingester, logs-transformer, etc.) run on nodes labeled kf_role=ingestion.
- Query services (query-service, beffe, UI, etc.) run on nodes labeled kf_role=query.
- Control plane services (Kafka KRaft controller, ZooKeeper) run on nodes labeled kf_role=control.
- DaemonSets run on nodes across all pools.
You can list nodes by role:
kubectl get nodes -l kf_role=ingestion
kubectl get nodes -l kf_role=query
kubectl get nodes -l kf_role=control
Advanced Configuration
Customizing nodeKey and nodeValue
The nodeKey and nodeValue fields default to kf_role and ingestion/query/control respectively, but can be overridden for advanced use cases.
For example, if you run multiple Kloudfuse instances on the same Kubernetes cluster across different availability zones, override the nodeValue to isolate each instance:
global:
  kfRoles:
    enabled: true
    ingestion:
      nodeValue: "ingestion-az1"
      numNodes: 3
    query:
      nodeValue: "query-az1"
      numNodes: 3
    control:
      nodeValue: "control-az1"
      numNodes: 3
In this case, nodes must be labeled and tainted with the corresponding values (e.g., kf_role=ingestion-az1).
You can also change the nodeKey if your cluster uses a different label key:
global:
  kfRoles:
    enabled: true
    nodeKey: "workload-type"
    ingestion:
      nodeValue: "write"
      numNodes: 3
    query:
      nodeValue: "read"
      numNodes: 3
    control:
      nodeValue: "control"
      numNodes: 3
Nodes must then be labeled and tainted with the new key and values (e.g., workload-type=write).
Overriding a Service’s Role
Each service has a default role assignment. You can override the role for any individual service in your custom_values.yaml:
redis:
  kfRole: query (1)
| 1 | Moves Redis from its default ingestion pool to the query pool. |
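For context, a per-service override sits alongside the global block in the same custom_values.yaml. A minimal sketch, combining the global configuration shown earlier with the redis override:

```yaml
# Sketch of custom_values.yaml: global kfRoles block plus a
# per-service role override moving redis to the query pool.
global:
  kfRoles:
    enabled: true
    ingestion:
      numNodes: 3
    query:
      numNodes: 3
    control:
      numNodes: 3

redis:
  kfRole: query
```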
Per-Service nodeSelector, Affinity, and Tolerations
If a service has local nodeSelector, affinity, or tolerations configured, those take priority over the auto-generated values from kfRoles. This allows fine-grained placement control for specific services:
query-service:
  nodeSelector:
    custom-label: "special-node"
  tolerations:
    - key: "custom-label"
      operator: "Equal"
      value: "special-node"
      effect: "NoSchedule"
Combining Roles on a Shared Node Pool
If you only need to isolate one workload type, you can combine the remaining roles onto a single shared node pool by pointing their nodeValue to the same label. For example, to separate only query services while running ingestion and control plane services together:
- Create two node pools instead of three:
  - A shared pool for ingestion and control, labeled and tainted with kf_role=ingestion-control.
  - A dedicated pool for query, labeled and tainted with kf_role=query.
- Configure kfRoles so that both ingestion and control use the same nodeValue:

global:
  kfRoles:
    enabled: true
    ingestion:
      nodeValue: "ingestion-control" (1)
      numNodes: 4
    query:
      numNodes: 3
    control:
      nodeValue: "ingestion-control" (1)
      numNodes: 4 (2)
| 1 | Both ingestion and control point to the same nodeValue, so their pods schedule onto the same node pool. |
| 2 | Set numNodes to the same value for roles that share a node pool, reflecting the actual number of nodes in that shared pool. |
This pattern works for any combination. For example, to isolate only ingestion and combine query with control:
global:
  kfRoles:
    enabled: true
    ingestion:
      numNodes: 4
    query:
      nodeValue: "query-control"
      numNodes: 3
    control:
      nodeValue: "query-control"
      numNodes: 3
Combining with Stream Isolation
Node pool separation works alongside Pinot Stream Isolation. When both features are enabled, Pinot pods receive both kfRole-based affinity (for node pool placement) and kf_stream-based affinity (for stream-specific node placement).
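As an illustrative sketch only (the chart's exact rendered output may differ), a Pinot pod with both features enabled would carry affinity terms matching both labels; the kf_stream value "logs" below is a hypothetical example:

```yaml
# Illustrative sketch: a Pinot pod constrained by both the role
# label (node pool placement) and a stream label (stream-specific
# placement) when both features are enabled.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kf_role
              operator: In
              values: ["ingestion"]
            - key: kf_stream
              operator: In
              values: ["logs"]
```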