IOPS Tuning for PVCs on GCP

Kloudfuse relies on Kubernetes Persistent Volume Claims (PVCs) for several stateful components. On GCP, the backing storage is GCP Persistent Disk. The disk type you choose directly controls the IOPS and throughput available to each component. Under-provisioned storage is one of the most common causes of ingestion lag, query latency, and Pinot segment replication delays.

IOPS-Sensitive Components

| Component | Workload Pattern | Recommendation |
| --- | --- | --- |
| Pinot (server, controller) | High random read; sequential segment writes | pd-ssd or hyperdisk-balanced |
| Kafka | Sequential write-heavy; sequential read for replay | pd-balanced or pd-ssd |
| PostgreSQL (configdb, orchestratordb) | Random read/write, low volume | pd-balanced is usually sufficient; use pd-ssd for busy clusters |
| ZooKeeper (Pinot, Kafka) | Small random read/write; latency-sensitive | pd-ssd |

GCP Persistent Disk Types

| Disk Type | IOPS | Throughput | Use Case |
| --- | --- | --- | --- |
| pd-standard | ~0.75 read / 1.5 write IOPS per GB (HDD) | ~120 MB/s | Dev/test only. Not suitable for production Pinot or Kafka. |
| pd-balanced | 6 read / 6 write IOPS per GB (SSD) | ~240 MB/s | Suitable for Kafka and PostgreSQL in most deployments. |
| pd-ssd | 30 read / 30 write IOPS per GB (SSD) | ~480 MB/s | Recommended for Pinot servers and ZooKeeper. |
| hyperdisk-balanced | Provisioned (up to 160,000 IOPS) | Provisioned (up to 2,400 MB/s) | High-throughput Pinot clusters; allows decoupling IOPS from disk size. |
| hyperdisk-extreme | Provisioned (up to 350,000 IOPS) | Provisioned (up to 5,000 MB/s) | Extreme workloads only; higher cost. |

IOPS for pd-standard, pd-balanced, and pd-ssd scale with disk size. A 1 TB pd-ssd provides up to 30,000 read IOPS. Hyperdisk types let you set IOPS and throughput independently of disk size.

Define StorageClasses

Create StorageClass manifests for each disk type you need, then reference them in your Helm values.

pd-ssd StorageClass

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-pd-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
```

pd-balanced StorageClass

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-pd-balanced
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
```

hyperdisk-balanced StorageClass

Use this when you need to provision IOPS independently of disk size. Set provisioned-iops-on-create and provisioned-throughput-on-create to match your workload requirements.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-hyperdisk-balanced
provisioner: pd.csi.storage.gke.io
parameters:
  type: hyperdisk-balanced
  provisioned-iops-on-create: "10000"
  provisioned-throughput-on-create: "500Mi"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
```

provisioned-throughput-on-create uses binary (IEC) units, not decimal: Mi means MiB/s. Adjust both values based on your expected segment write rate and query concurrency.
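Since GCP quotes per-disk throughput limits in MB/s, it can help to convert the MiB/s value when sizing against those limits. A minimal shell sketch (the 500 value mirrors the example StorageClass above):

```shell
# Convert a provisioned-throughput-on-create value (MiB/s) to MB/s,
# the decimal unit GCP uses when quoting per-disk throughput limits.
mib_per_s=500                                  # e.g. "500Mi" in the StorageClass
mb_per_s=$(( mib_per_s * 1048576 / 1000000 ))  # 1 MiB = 1,048,576 bytes
echo "${mib_per_s} MiB/s ~ ${mb_per_s} MB/s"
```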

Configure Helm Values

Apply the StorageClasses to each component in your custom_values.yaml:

```yaml
pinot:
  server:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 500Gi
  controller:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 100Gi
  zookeeper:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 20Gi

kafka:
  persistence:
    storageClass: kfuse-pd-balanced
    size: 200Gi
  zookeeper:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 20Gi

kfuse-configdb:
  primary:
    persistence:
      storageClass: kfuse-pd-balanced
      size: 50Gi
```

Use WaitForFirstConsumer volume binding mode (set in the StorageClass) to ensure PVCs are provisioned in the same zone as the pod. This avoids cross-zone I/O latency.

Disk Sizing Guidelines

For pd-ssd, IOPS scale at 30 read / 30 write IOPS per GB. Use this to back-calculate the minimum disk size needed to hit your IOPS target:

required_GB = target_IOPS / 30

Example: to sustain 9,000 read IOPS on a Pinot server, provision at least a 300 GB pd-ssd volume.
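The back-calculation can be scripted with integer arithmetic, rounding up so the result never under-provisions. A minimal sketch using the 30 read IOPS per GB rate for pd-ssd:

```shell
# Minimum pd-ssd size (GB) needed to reach a target read IOPS,
# at GCP's documented 30 read IOPS per GB for pd-ssd.
target_iops=9000
iops_per_gb=30
required_gb=$(( (target_iops + iops_per_gb - 1) / iops_per_gb ))  # ceiling division
echo "target=${target_iops} IOPS -> minimum ${required_gb} GB pd-ssd"
```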

For Hyperdisk, size the disk for capacity only and set IOPS/throughput independently via the StorageClass parameters or via a VolumeAttributesClass if using dynamic provisioning.
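As a sketch of the VolumeAttributesClass route, assuming a cluster version where the VolumeAttributesClass feature is enabled; the iops and throughput parameter names follow the GKE PD CSI driver's Hyperdisk support and should be verified against your driver version (the class name here is illustrative):

```yaml
apiVersion: storage.k8s.io/v1beta1
kind: VolumeAttributesClass
metadata:
  name: kfuse-hyperdisk-tuned
driverName: pd.csi.storage.gke.io
parameters:
  iops: "10000"
  throughput: "500"
```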

Monitoring Disk Performance

Check actual disk IOPS and throughput from the GCP Console under Compute Engine → Disks, or query Cloud Monitoring with the following metrics:

  • compute.googleapis.com/instance/disk/read_ops_count

  • compute.googleapis.com/instance/disk/write_ops_count

  • compute.googleapis.com/instance/disk/read_bytes_count

  • compute.googleapis.com/instance/disk/write_bytes_count

You can also inspect Pinot server lag and segment replication metrics from the Pinot control plane dashboard.