# IOPS Tuning for PVCs on GCP
Kloudfuse relies on Kubernetes Persistent Volume Claims (PVCs) for several stateful components. On GCP, the backing storage is GCP Persistent Disk. The disk type you choose directly controls the IOPS and throughput available to each component. Under-provisioned storage is one of the most common causes of ingestion lag, query latency, and Pinot segment replication delays.
## IOPS-Sensitive Components
| Component | Workload Pattern | Recommendation |
|---|---|---|
| Pinot (server, controller) | High random read; sequential segment writes | pd-ssd (or hyperdisk-balanced for high-throughput clusters) |
| Kafka | Sequential write-heavy; sequential read for replay | pd-balanced |
| PostgreSQL (`kfuse-configdb`) | Random read/write, low volume | pd-balanced |
| ZooKeeper (Pinot, Kafka) | Small random read/write; latency-sensitive | pd-ssd |
## GCP Persistent Disk Types
| Disk Type | IOPS | Throughput | Use Case |
|---|---|---|---|
| pd-standard | ~0.75 read / 1.5 write IOPS per GB (HDD) | ~120 MB/s | Dev/test only. Not suitable for production Pinot or Kafka. |
| pd-balanced | 6 read / 6 write IOPS per GB (SSD) | ~240 MB/s | Suitable for Kafka and PostgreSQL in most deployments. |
| pd-ssd | 30 read / 30 write IOPS per GB (SSD) | ~480 MB/s | Recommended for Pinot servers and ZooKeeper. |
| hyperdisk-balanced | Provisioned (up to 160,000 IOPS) | Provisioned (up to 2,400 MB/s) | High-throughput Pinot clusters; allows decoupling IOPS from disk size. |
| hyperdisk-extreme | Provisioned (up to 350,000 IOPS) | Provisioned (up to 5,000 MB/s) | Extreme workloads only; higher cost. |

IOPS for pd-standard, pd-balanced, and pd-ssd scale with disk size; a 1 TB pd-ssd provides up to 30,000 read IOPS. Hyperdisk types let you set IOPS and throughput independently of disk size.
## Define StorageClasses
Create StorageClass manifests for each disk type you need, then reference them in your Helm values.
### pd-ssd StorageClass

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-pd-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
```
### pd-balanced StorageClass

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-pd-balanced
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
```
### hyperdisk-balanced StorageClass

Use this when you need to provision IOPS independently of disk size. Set `provisioned-iops-on-create` and `provisioned-throughput-on-create` to match your workload requirements.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfuse-hyperdisk-balanced
provisioner: pd.csi.storage.gke.io
parameters:
  type: hyperdisk-balanced
  provisioned-iops-on-create: "10000"
  provisioned-throughput-on-create: "500Mi"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
```
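Once the manifests are saved to files (file names below are illustrative), apply them and confirm the classes registered with the expected provisioner and binding mode:

```shell
# Apply the StorageClass manifests (file names are placeholders).
kubectl apply -f kfuse-pd-ssd.yaml
kubectl apply -f kfuse-pd-balanced.yaml
kubectl apply -f kfuse-hyperdisk-balanced.yaml

# Verify provisioner, volume binding mode, and expansion settings.
kubectl get storageclass kfuse-pd-ssd kfuse-pd-balanced kfuse-hyperdisk-balanced
```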
## Configure Helm Values
Apply the StorageClasses to each component in your `custom_values.yaml`:

```yaml
pinot:
  server:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 500Gi
  controller:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 100Gi
  zookeeper:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 20Gi
kafka:
  persistence:
    storageClass: kfuse-pd-balanced
    size: 200Gi
  zookeeper:
    persistence:
      storageClass: kfuse-pd-ssd
      size: 20Gi
kfuse-configdb:
  primary:
    persistence:
      storageClass: kfuse-pd-balanced
      size: 50Gi
```
Use the `WaitForFirstConsumer` volume binding mode (set in the StorageClass) to ensure PVCs are provisioned in the same zone as the pod. This avoids cross-zone I/O latency.
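After installation, you can confirm each PVC bound to the intended StorageClass and check which zone each PV landed in. A sketch (the `kfuse` namespace is an assumption; adjust to your deployment):

```shell
# List PVCs with their StorageClass, phase, and capacity.
kubectl get pvc -n kfuse \
  -o custom-columns='NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,STATUS:.status.phase,CAPACITY:.status.capacity.storage'

# CSI-provisioned PVs record their zone in node affinity; print PV name and zone.
kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]}{"\n"}{end}'
```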
## Disk Sizing Guidelines
For pd-ssd, IOPS scale at 30 read / 30 write IOPS per GB. Use this to back-calculate the minimum disk size needed to hit your IOPS target:

```
required_GB = target_IOPS / 30
```

Example: to sustain 9,000 read IOPS on a Pinot server, provision at least a 300 GB pd-ssd volume.
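The back-calculation is easy to script; a minimal shell sketch (variable names are illustrative), rounding up so the disk never falls short of the target:

```shell
# Minimum pd-ssd size for a target read-IOPS figure.
target_iops=9000
iops_per_gb=30   # pd-ssd: 30 read / 30 write IOPS per GB

# Ceiling division: round up to the next whole GB.
required_gb=$(( (target_iops + iops_per_gb - 1) / iops_per_gb ))
echo "${required_gb}"   # 300
```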
For Hyperdisk, size the disk for capacity only and set IOPS/throughput independently via the StorageClass parameters or via a VolumeAttributesClass if using dynamic provisioning.
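A hedged sketch of such a VolumeAttributesClass follows. The object name and the IOPS/throughput values are illustrative; the `v1beta1` apiVersion and the `iops`/`throughput` parameter keys reflect the Kubernetes VolumeAttributesClass API as supported by the GCE PD CSI driver on recent GKE versions, and should be checked against your cluster version:

```yaml
apiVersion: storage.k8s.io/v1beta1
kind: VolumeAttributesClass
metadata:
  name: kfuse-hyperdisk-tune   # illustrative name
driverName: pd.csi.storage.gke.io
parameters:
  iops: "20000"        # provisioned IOPS, independent of disk size
  throughput: "600"    # provisioned throughput in MiB/s
```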
## Monitoring Disk Performance
Check actual disk IOPS and throughput from the GCP Console under Compute Engine → Disks, or query Cloud Monitoring with the following metrics:

- `compute.googleapis.com/instance/disk/read_ops_count`
- `compute.googleapis.com/instance/disk/write_ops_count`
- `compute.googleapis.com/instance/disk/read_bytes_count`
- `compute.googleapis.com/instance/disk/write_bytes_count`
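In Cloud Monitoring's PromQL interface, these metric names map to a colon-separated form. A sketch of a per-instance read-IOPS rate query (verify the exact metric mapping in your project's Metrics Explorer before relying on it):

```promql
rate(compute_googleapis_com:instance_disk_read_ops_count[5m])
```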
You can also inspect Pinot server lag and segment replication metrics from the Pinot control plane dashboard.