IOPS Tuning for PVCs on Azure

Kloudfuse relies on Kubernetes Persistent Volume Claims (PVCs) for several stateful components. On Azure, the backing storage is Azure Managed Disks. The disk type and IOPS configuration you choose directly determine the IOPS and throughput available to each component. Under-provisioned disks are a common cause of ingestion lag, Pinot segment replication delays, and Kafka consumer lag.

The Pinot deep store (Azure Blob Storage / ADLS) is separate from managed disk PVCs. This page covers local disk-backed PVCs used by Pinot servers, Kafka brokers, ZooKeeper, and PostgreSQL.

IOPS-Sensitive Components

| Component | Workload Pattern | Recommendation |
|---|---|---|
| Pinot (server, controller) | High random read; sequential segment writes | Premium SSD v2 with provisioned IOPS, or Ultra Disk for high-concurrency clusters |
| Kafka | Sequential write-heavy; sequential read for replay | Premium SSD or Premium SSD v2 |
| PostgreSQL (configdb, orchestratordb) | Random read/write, low volume | Premium SSD at default settings; increase IOPS for busy clusters |
| ZooKeeper (Pinot, Kafka) | Small random read/write; latency-sensitive | Premium SSD or Premium SSD v2 |

Azure Managed Disk Types

| Disk Type | Max IOPS | Max Throughput | Use Case |
|---|---|---|---|
| Standard HDD | Up to 2,000 | Up to 500 MB/s | Dev/test only. Not suitable for production Pinot or Kafka. |
| Standard SSD | Up to 6,000 | Up to 750 MB/s | Light workloads. Use Premium SSD instead for production. |
| Premium SSD (P-series) | Up to 20,000 | Up to 900 MB/s | Default recommendation for Kafka, PostgreSQL, and ZooKeeper. |
| Premium SSD v2 | Up to 80,000 (provisioned) | Up to 1,200 MB/s (provisioned) | Recommended for Pinot servers; decouples IOPS and throughput from disk size. |
| Ultra Disk | Up to 400,000 (provisioned) | Up to 10,000 MB/s (provisioned) | Maximum performance for large multi-AZ Pinot deployments; highest cost. |

Premium SSD v2 and Ultra Disk allow IOPS and throughput to be provisioned independently of disk size — similar to AWS gp3 and io2. Premium SSD (v1) IOPS scale with the disk tier (P-series size).

Prerequisites

The Azure Disk CSI driver must be installed in your AKS cluster. It is required to provision Premium SSD v2 and Ultra Disk volumes and to use the skuName, diskIOPSReadWrite, and diskMBpsReadWrite StorageClass parameters.

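# Verify the Azure Disk CSI driver is enabled on your AKS cluster.
# Prints "true" when enabled. JMESPath keys are case-sensitive, and the
# Azure CLI emits this key as diskCsiDriver in its JSON output.
az aks show \
  --name <cluster-name> \
  --resource-group <resource-group> \
  --query "storageProfile.diskCsiDriver.enabled"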

If the CSI driver is not enabled:

az aks update \
  --name <cluster-name> \
  --resource-group <resource-group> \
  --enable-disk-driver

Ultra Disk requires the AKS node pool to be created with Ultra Disk support enabled (--enable-ultra-ssd). It cannot be enabled on existing node pools without recreation. Ultra Disk and Premium SSD v2 are both offered only in specific regions and availability zones, so verify availability in your target region before use.
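One way to check zone availability is to ask the Azure CLI which zones offer a given disk SKU in a region. The sketch below wraps az vm list-skus in a helper; the function name disk_sku_zones is hypothetical, and the query assumes the SKU entry exposes its zones under locationInfo, so confirm the output against the raw az vm list-skus JSON for your subscription.

```shell
# List the availability zones in which a managed disk SKU is offered in a region.
# Requires the Azure CLI and an authenticated session (az login).
# disk_sku_zones is an illustrative helper name, not part of any Kloudfuse tooling.
disk_sku_zones() {
  local sku="$1" region="$2"
  az vm list-skus \
    --resource-type disks \
    --location "$region" \
    --query "[?name=='${sku}'].locationInfo[0].zones[]" \
    --output tsv
}

# Example usage (the region is a placeholder):
#   disk_sku_zones PremiumV2_LRS eastus
#   disk_sku_zones UltraSSD_LRS eastus
```

An empty result suggests the SKU is not offered in any zone of that region for your subscription; in that case pick another region or fall back to Premium SSD.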

Define StorageClasses

Create StorageClass manifests and reference them in your Helm values.

Premium SSD — Default

Suitable for Kafka, PostgreSQL, and ZooKeeper. IOPS and throughput scale with the P-series tier (disk size).

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-premium-ssd
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
  cachingMode: ReadOnly
  kind: Managed
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain

Premium SSD v2 — Provisioned IOPS

For Pinot servers and ZooKeeper that need consistent IOPS independent of disk size.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-premium-ssd-v2
provisioner: disk.csi.azure.com
parameters:
  skuName: PremiumV2_LRS
  # Premium SSD v2 does not support host caching, so set this explicitly.
  cachingMode: None
  diskIOPSReadWrite: "6000"
  diskMBpsReadWrite: "300"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain

Ultra Disk — Maximum Performance

For high-concurrency Pinot deployments where consistent low latency is required at scale.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-ultra-disk
provisioner: disk.csi.azure.com
parameters:
  skuName: UltraSSD_LRS
  # Ultra Disk does not support host caching, so set this explicitly.
  cachingMode: None
  diskIOPSReadWrite: "16000"
  diskMBpsReadWrite: "500"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain

Ultra Disk incurs per-IOPS and per-MB/s charges in addition to the capacity charge. Set diskIOPSReadWrite and diskMBpsReadWrite to the minimum your workload requires, because over-provisioning significantly increases cost. For most Pinot deployments, Premium SSD v2 is a more cost-effective alternative.

Configure Helm Values

Apply the StorageClasses to each component in your custom_values.yaml:

pinot:
  server:
    persistence:
      storageClass: kfuse-premium-ssd-v2
      size: 500Gi
  controller:
    persistence:
      storageClass: kfuse-premium-ssd-v2
      size: 100Gi
  zookeeper:
    persistence:
      storageClass: kfuse-premium-ssd-v2
      size: 20Gi

kafka:
  persistence:
    storageClass: kfuse-premium-ssd
    size: 200Gi
  zookeeper:
    persistence:
      storageClass: kfuse-premium-ssd
      size: 20Gi

kfuse-configdb:
  primary:
    persistence:
      storageClass: kfuse-premium-ssd
      size: 50Gi

Use volumeBindingMode: WaitForFirstConsumer in every StorageClass. This ensures Azure Managed Disks are created in the same Availability Zone as the pod, avoiding cross-AZ disk attachment failures.
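To confirm that rule holds in a running cluster, a quick audit can list any Azure Disk StorageClass that still uses a different binding mode. This is a minimal sketch; the helper name immediate_binding_classes is hypothetical, and it assumes kubectl is already pointed at the AKS cluster.

```shell
# Print Azure Disk CSI StorageClasses whose volumeBindingMode is not
# WaitForFirstConsumer. Empty output means every disk.csi.azure.com class
# follows the rule above.
immediate_binding_classes() {
  kubectl get storageclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.provisioner}{"\t"}{.volumeBindingMode}{"\n"}{end}' |
    awk -F'\t' '$2 == "disk.csi.azure.com" && $3 != "WaitForFirstConsumer" { print $1 }'
}

# Example usage:
#   immediate_binding_classes
```

StorageClasses from other provisioners are ignored on purpose, since the zone constraint described here applies to managed disks.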

IOPS Sizing Guidelines

For Premium SSD v2, IOPS are provisioned independently of disk size. Every disk starts from a baseline of 3,000 IOPS and 125 MB/s; to provision more, set higher diskIOPSReadWrite and diskMBpsReadWrite values in the StorageClass.
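When reviewing a StorageClass, it can help to see how far its settings sit above that baseline, since that difference is the part you are explicitly provisioning. The helper below is a small sketch of the arithmetic; the name pv2_above_baseline is hypothetical, and the assumption that cost tracks the amount above baseline should be checked against current Azure pricing.

```shell
# Print the IOPS and MB/s provisioned above the Premium SSD v2 baseline
# (3,000 IOPS and 125 MB/s, as stated above). Values at or below the
# baseline yield 0.
pv2_above_baseline() {
  local iops="$1" mbps="$2"
  local extra_iops=$(( iops > 3000 ? iops - 3000 : 0 ))
  local extra_mbps=$(( mbps > 125 ? mbps - 125 : 0 ))
  echo "extra_iops=${extra_iops} extra_mbps=${extra_mbps}"
}

# Values from the kfuse-premium-ssd-v2 StorageClass:
pv2_above_baseline 6000 300   # prints: extra_iops=3000 extra_mbps=175
```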

For Premium SSD (v1), IOPS scale with the disk tier. Use the P-series size table to back-calculate the minimum disk size for your IOPS target:

| P-series Tier | Disk Size | Max IOPS | Max Throughput |
|---|---|---|---|
| P10 | 128 GiB | 500 | 100 MB/s |
| P20 | 512 GiB | 2,300 | 150 MB/s |
| P30 | 1 TiB | 5,000 | 200 MB/s |
| P40 | 2 TiB | 7,500 | 250 MB/s |
| P50 | 4 TiB | 7,500 | 250 MB/s |
| P60 | 8 TiB | 16,000 | 500 MB/s |
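The back-calculation can be sketched as a lookup over the table above: given a target IOPS value, return the smallest listed tier that meets it. The function name min_p_tier is hypothetical, and it covers only the tiers shown here. P50 never appears in the output because P40 reaches the same IOPS at a smaller size.

```shell
# Map a target IOPS value to the smallest Premium SSD (v1) tier in the
# table above that provides at least that many IOPS.
min_p_tier() {
  local target="$1"
  if   [ "$target" -le 500 ];   then echo "P10 128GiB"
  elif [ "$target" -le 2300 ];  then echo "P20 512GiB"
  elif [ "$target" -le 5000 ];  then echo "P30 1TiB"
  elif [ "$target" -le 7500 ];  then echo "P40 2TiB"
  elif [ "$target" -le 16000 ]; then echo "P60 8TiB"
  else
    echo "no tier in this table reaches ${target} IOPS" >&2
    return 1
  fi
}

min_p_tier 3000   # prints: P30 1TiB
```

If the resulting disk is far larger than the data it needs to hold, that is usually a sign Premium SSD v2 is the better fit, because its IOPS do not depend on size.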

Use these rules of thumb to right-size provisioned IOPS for Premium SSD v2 and Ultra Disk:

| Component | Starting IOPS | When to Increase |
|---|---|---|
| Pinot server | 3,000–6,000 | Segment replication lag, high query latency |
| Pinot controller | 3,000 | Rarely needs more |
| Kafka broker | 3,000 | Consumer lag, high producer throughput (>200 MB/s) |
| ZooKeeper | 3,000 | Watch event storms, leader election delays |
| PostgreSQL | 3,000 | Slow alert rule evaluation, slow config API |
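StorageClass parameters are applied when a disk is provisioned, so raising diskIOPSReadWrite in the manifest does not change volumes that already exist. For an existing Premium SSD v2 or Ultra Disk volume, one option is to update the managed disk directly. The sketch below assumes the PersistentVolume was created by the Azure Disk CSI driver, so that spec.csi.volumeHandle holds the disk's Azure resource ID; the helper name set_pv_disk_performance is hypothetical.

```shell
# Raise (or lower) provisioned IOPS and throughput on the managed disk
# behind an existing PersistentVolume.
# Arguments: <pv-name> <iops> <MB/s>
set_pv_disk_performance() {
  local pv_name="$1" iops="$2" mbps="$3" disk_id
  # For Azure Disk CSI volumes the volume handle is the disk resource ID.
  disk_id=$(kubectl get pv "$pv_name" -o jsonpath='{.spec.csi.volumeHandle}') || return 1
  if [ -z "$disk_id" ]; then
    echo "no CSI volume handle found for PV ${pv_name}" >&2
    return 1
  fi
  az disk update \
    --ids "$disk_id" \
    --disk-iops-read-write "$iops" \
    --disk-mbps-read-write "$mbps"
}

# Example usage (PV name is a placeholder):
#   set_pv_disk_performance <pv-name> 8000 400
```

Keep the StorageClass in step with any manual change so that newly provisioned volumes receive the same settings.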

Monitoring Disk Performance

Monitor Azure Managed Disk performance from the Azure Portal under Disks, or query Azure Monitor with the following metrics:

  • Disk Read Operations/Sec / Disk Write Operations/Sec — IOPS consumed

  • Disk Read Bytes/Sec / Disk Write Bytes/Sec — throughput consumed

  • Disk Read Latency / Disk Write Latency — per-operation latency

Sustained disk read or write latency above 5 ms, or IOPS consistently at the provisioned limit, indicates that the volume is approaching saturation and that its provisioned IOPS should be increased.
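That rule can be turned into a simple check for use in a script or alerting hook. The sketch below takes an averaged latency and IOPS reading and compares them with the provisioned limit. The function name disk_saturated is hypothetical, and the 95% cutoff used to interpret "at the provisioned limit" is an assumption rather than a Kloudfuse or Azure value.

```shell
# Return 0 (true) when a volume looks saturated by the rule above:
# latency above 5 ms, or consumed IOPS at or near the provisioned limit.
# Arguments: <latency-ms> <consumed-iops> <provisioned-iops>
# The 0.95 factor is an assumed threshold for "at the limit".
disk_saturated() {
  local latency_ms="$1" consumed_iops="$2" provisioned_iops="$3"
  awk -v lat="$latency_ms" -v used="$consumed_iops" -v cap="$provisioned_iops" \
    'BEGIN { exit !((lat > 5) || (used >= 0.95 * cap)) }'
}

# Example usage with readings averaged over a sustained window:
if disk_saturated 7.4 1000 6000; then
  echo "volume is approaching saturation; consider raising provisioned IOPS"
fi
```

Feed it values averaged over several minutes rather than single samples, since short bursts to the limit are normal.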

You can also inspect Pinot server lag and segment replication metrics from the Pinot control plane dashboard.