Node Affinity and Tolerations

Node affinity and tolerations control which Kubernetes nodes Kloudfuse pods are allowed to schedule onto. Use these settings to constrain the entire Kloudfuse deployment to a specific node group, availability zone, or set of nodes that match your infrastructure requirements.

If your goal is to separate Kloudfuse workloads across multiple dedicated node pools (ingestion, query, control), use Workload Isolation instead. The settings documented here apply uniformly to all Kloudfuse pods.

Scheduling Concepts

Node Affinity

Node affinity is a set of rules that the Kubernetes scheduler uses when placing pods. It extends nodeSelector with richer expression syntax and two enforcement modes:

  • requiredDuringSchedulingIgnoredDuringExecution — Hard requirement: the pod will not be scheduled unless the node matches the expression. Use this to strictly enforce placement on specific nodes.

  • preferredDuringSchedulingIgnoredDuringExecution — Soft preference: the scheduler tries to place the pod on matching nodes but falls back to other nodes if none are available.

Using preferredDuringSchedulingIgnoredDuringExecution has implications for the Helm chart’s pre-install validation; see Helm Validation below.
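As an illustration, a soft preference might look like the following; the zone label, value, and weight here are examples, not Kloudfuse defaults:

```yaml
# Illustrative soft preference: favor nodes in one zone, but allow fallback.
global:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100            # 1-100; higher values are weighted more heavily
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - us-east1-b
```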

Node affinity expressions use matchExpressions to filter nodes by label:

  • In — node label value must be one of the specified values.

  • NotIn — node label value must not be any of the specified values.

  • Exists — node must have the label (any value).

  • DoesNotExist — node must not have the label.

  • Gt, Lt — node label value must be numerically greater than or less than the specified value.
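For illustration, a single nodeSelectorTerm can combine several operators; the label keys below are hypothetical examples, not Kloudfuse defaults:

```yaml
# Hypothetical matchExpressions combining operators (label keys are examples).
- matchExpressions:
    - key: node.kubernetes.io/instance-type   # value must be in this list
      operator: In
      values: ["n2-highmem-8", "n2-highmem-16"]
    - key: dedicated                          # label must be absent entirely
      operator: DoesNotExist
    - key: cpu-count                          # numeric: label value > 15
      operator: Gt
      values: ["15"]
```

All expressions within one matchExpressions entry must match (logical AND), while multiple nodeSelectorTerms are ORed against each other.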

Tolerations

Nodes can be tainted to repel pods that do not explicitly tolerate the taint. A taint has three components: a key, a value, and an effect. The two most common effects are:

  • NoSchedule — pods without a matching toleration are not scheduled onto the node.

  • NoExecute — existing pods without a matching toleration are evicted, and new pods are not scheduled.

A toleration in the pod spec matches a taint and allows the pod to schedule onto that node despite the taint. When combined with node affinity, you can both attract pods to specific nodes and permit them to schedule onto tainted nodes in one configuration.

Managing Node Labels/Taints

The examples below apply labels and taints at the node-pool level with your cloud provider CLI; you can also label and taint individual nodes directly with kubectl.

# Label the node pool
gcloud container node-pools update <node-pool-name> \
  --cluster=<cluster-name> \
  --node-labels=ng_label=az2

# Taint the node pool
gcloud container node-pools update <node-pool-name> \
  --cluster=<cluster-name> \
  --node-taints=ng_taint=az2:NoSchedule
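If you prefer to operate on individual nodes rather than the pool, the equivalent kubectl commands are as follows (node names are placeholders):

```shell
# Label a single node directly
kubectl label nodes <node-name> ng_label=az2

# Taint a single node
kubectl taint nodes <node-name> ng_taint=az2:NoSchedule

# Remove the taint later by appending a trailing dash
kubectl taint nodes <node-name> ng_taint=az2:NoSchedule-
```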

Kloudfuse Configuration

Global Affinity

Set global.affinity to a standard Kubernetes affinity object. The following example uses a hard node affinity requirement:

global:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: <node-label-key>
                operator: In
                values:
                  - <node-label-value>

Global Tolerations

Set global.tolerations to a list of standard Kubernetes toleration objects:

global:
  tolerations:
    - key: "<taint-key>"
      operator: "Equal"
      value: "<taint-value>"
      effect: "NoSchedule"
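For reference, a toleration with operator Equal matches a taint whose key, value, and effect are all identical. The node-side taint corresponding to the snippet above would be created like this (placeholders carried over):

```shell
kubectl taint nodes <node-name> <taint-key>=<taint-value>:NoSchedule
```

Setting operator: "Exists" instead matches any value for the key, which can be useful when the taint value varies across nodes.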

Per-Service Overrides

If a service has local affinity or tolerations values set in custom_values.yaml, those values take priority over the global settings. Use this to place an individual component on a different set of nodes:

redis:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: ng_label
                operator: In
                values:
                  - memory-optimized
  tolerations:
    - key: "workload"
      operator: "Equal"
      value: "memory-optimized"
      effect: "NoSchedule"

Example:

A common use case is deploying Kloudfuse onto a dedicated node group within a larger shared cluster, where the node group is labeled and tainted to prevent general workloads from running there.

In the example below, all Kloudfuse pods are constrained to nodes labeled ng_label=az2. The toleration allows pods to schedule onto nodes that carry the ng_taint=az2:NoSchedule taint:

global:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: ng_label
                operator: In
                values:
                  - az2
  tolerations:
    - key: "ng_taint"
      operator: "Equal"
      value: "az2"
      effect: "NoSchedule"

Before applying this configuration, ensure that the target nodes are labeled and tainted as described in Managing Node Labels/Taints above.
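One quick way to verify is with kubectl; the label and taint values below match the example above:

```shell
# List the nodes carrying the expected label
kubectl get nodes -l ng_label=az2

# Inspect the taints on a specific node
kubectl describe node <node-name> | grep -A 2 Taints
```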

Helm Validation

The Kloudfuse Helm chart runs a pre-install validation that inspects the live cluster to determine how many nodes are available to Kloudfuse. This validation drives replica sizing warnings and pre-flight checks. Its evaluation of global.affinity has two limitations.

Only requiredDuringSchedulingIgnoredDuringExecution is evaluated. The validation parses nodeSelectorTerms under requiredDuringSchedulingIgnoredDuringExecution to filter the node list. If you set preferredDuringSchedulingIgnoredDuringExecution instead, the entire affinity block is skipped and all cluster nodes are counted as available Kloudfuse nodes. The resulting replica sizing warnings may be inaccurate. Use requiredDuringSchedulingIgnoredDuringExecution whenever possible to ensure the validation reflects your actual node topology.

Only In and NotIn operators are evaluated. The validation evaluates matchExpressions entries with the In and NotIn operators. Entries using Exists, DoesNotExist, Gt, or Lt are silently ignored — nodes are not filtered based on those expressions, so the validation may again over-count available nodes.
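For example, in a term like the following, the pre-install validation filters nodes using only the In entry, while the Exists entry is ignored by the validation (Kubernetes still enforces both at scheduling time):

```yaml
- matchExpressions:
    - key: ng_label        # evaluated by the Helm pre-install validation
      operator: In
      values: ["az2"]
    - key: dedicated       # ignored by the validation, enforced at runtime
      operator: Exists
```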

Neither limitation affects runtime scheduling. Kubernetes evaluates all affinity modes and operators correctly when placing pods; only the Helm pre-install validation is affected.

Node Pool Separation

global.affinity and global.tolerations apply uniformly to every Kloudfuse pod. They are the right choice when you want to contain the entire Kloudfuse deployment within a specific node group or availability zone.

For splitting Kloudfuse workloads across multiple dedicated node pools by workload type (ingestion, query, and control plane), use Workload Isolation instead. That feature uses the kfRoles mechanism to automatically generate per-service affinity and tolerations based on each service’s role assignment.

When both global.affinity and kfRoles are configured, kfRoles takes precedence for services that have a role assignment.

GKE Limitation: kube-system Pods and Node Taints

On GKE, kube-system pods (such as kube-dns, metrics-server, and GKE add-ons) do not carry tolerations for user-defined taints. If you taint every node in the cluster with a custom NoSchedule taint, these system pods cannot schedule and will be evicted from their current nodes, potentially breaking cluster DNS and monitoring.

To avoid this, ensure at least one node pool in your cluster remains untainted so that kube-system pods have nodes available to run on. A typical pattern is:

  • A Kloudfuse node pool — labeled and tainted with your custom key/value (e.g., ng_taint=az2:NoSchedule). Kloudfuse pods target this pool via affinity and toleration.

  • A system node pool — no custom taints, used exclusively for kube-system and other cluster infrastructure workloads.
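As a sketch, this two-pool pattern could be created with gcloud; the pool names are illustrative:

```shell
# Kloudfuse pool: labeled and tainted so only tolerating pods schedule here
gcloud container node-pools create kloudfuse-pool \
  --cluster=<cluster-name> \
  --node-labels=ng_label=az2 \
  --node-taints=ng_taint=az2:NoSchedule

# System pool: no custom taints, leaving room for kube-system pods
gcloud container node-pools create system-pool \
  --cluster=<cluster-name>
```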

GKE’s documentation on node taints covers this behavior in detail, including the effect of taints on existing kube-system pods: Node taints — Google Kubernetes Engine

If you add a NoSchedule taint to an existing node pool that already has kube-system pods running on it, those pods are evicted immediately. Verify that your cluster has an untainted node pool before adding taints to a GKE node pool.