Query Priorities Concepts
Kloudfuse uses query priorities and per-call timeouts to govern how much of the underlying analytics engine (Apache Pinot) any given query is allowed to consume. Together they let operators fast-path interactive dashboard traffic, ration the budget given to background machine traffic, and protect the cluster from runaway alert evaluation.
This page explains what query priorities are, what they affect, and how Kloudfuse decides which value to apply on each request. For step-by-step configuration, see:
- Configure Cluster-Wide Defaults: set cluster-wide defaults that apply to everyone.
- Override Priorities for a User, Team, or Service Account: override the defaults for specific users, teams, and service accounts.
What Query Priorities Affect
Every query that the Kloudfuse query services send to Pinot carries two options:
| Option | Effect |
|---|---|
| schedulerPriority | Tells the Pinot broker how to schedule the query relative to others. Higher-priority queries are admitted first when the broker is busy. Three priorities are available: High, Medium (the default), and Low. |
| timeoutMs | The wall-clock budget for that single Pinot call, in milliseconds. Queries that exceed the budget are aborted and return an error to the caller. |
Both values are resolved per request. The same user can issue a high-priority interactive dashboard query and a low-priority background export from the same browser session, and Kloudfuse will give each the appropriate slot.
Classes
Every request belongs to one of three classes based on how it reached Kloudfuse:
- Scheduled: Background work the platform itself triggers, such as Grafana alert rule evaluation, the in-process rule manager, and the periodic refresh of scheduled views. These requests never carry user identity, so they get their priority and timeout from cluster-wide defaults rather than from a user's policy.
- Interactive: Browser traffic. Requests authenticated by cookie, SSO, or Basic auth classify as interactive. This is the class your end users hit when they open a dashboard or run an ad-hoc query.
- Machine: Programmatic traffic. Bearer tokens, MCP-OAuth requests, and service account calls all classify as machine. Service accounts can only configure values for the machine class.
The class is decided by user-mgmt-service at authentication time and travels with the request. Service detail pages only expose the Machine class; user and team detail pages expose both Interactive and Machine. The Scheduled class is system-only — it is not configurable per-entity and only appears on the cluster-wide Global query priorities page.
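The classification rule above can be pictured as a small lookup from authentication method to class. The following is only an illustrative sketch: the function name, the enum, and the auth-method labels are hypothetical and are not the identifiers that user-mgmt-service actually uses.

```python
from enum import Enum
from typing import Optional


class QueryClass(Enum):
    SCHEDULED = "scheduled"
    INTERACTIVE = "interactive"
    MACHINE = "machine"


# Hypothetical labels for how a request authenticated; the real
# identifiers are not given on this page.
INTERACTIVE_AUTH = {"cookie", "sso", "basic"}
MACHINE_AUTH = {"bearer", "mcp_oauth", "service_account"}


def classify(auth_method: Optional[str]) -> QueryClass:
    """Map an authentication method to a request class.

    None stands for platform-triggered work that carries no user identity.
    """
    if auth_method is None:
        return QueryClass.SCHEDULED
    if auth_method in INTERACTIVE_AUTH:
        return QueryClass.INTERACTIVE
    if auth_method in MACHINE_AUTH:
        return QueryClass.MACHINE
    raise ValueError(f"unknown auth method: {auth_method!r}")
```

In this sketch, a dashboard opened through SSO maps to `QueryClass.INTERACTIVE`, while the same person calling the API with a bearer token maps to `QueryClass.MACHINE`.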
Streams
Each class has per-stream entries so different telemetry types can be tuned independently:
- APM (apm)
- Events (events)
- Logs (logs)
- Metrics (metrics)
- RUM (rum)
- All streams (the wildcard *)
Most operators configure only the wildcard row; the per-stream rows let you make specific streams faster or slower than the rest. For example, you can give Logs queries a longer timeout than Metrics queries while keeping a single shared priority.
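One way to picture the wildcard fallback is a two-step lookup: the stream's own row first, then the * row. The sketch below assumes the fallback applies to each option separately, which matches the Logs example above but is not confirmed in detail on this page. The dictionary layout, the `lookup` helper, and the numeric values are hypothetical.

```python
WILDCARD = "*"


def lookup(rows, stream, option):
    """Return the option from the stream's row, else from the wildcard row."""
    for key in (stream, WILDCARD):
        value = rows.get(key, {}).get(option)
        if value is not None:
            return value
    return None


# Illustrative values only: one shared priority, a longer timeout for Logs.
rows = {
    "*": {"schedulerPriority": "Medium", "timeoutMs": 30_000},
    "logs": {"timeoutMs": 120_000},
}

logs_timeout = lookup(rows, "logs", "timeoutMs")             # from the logs row
logs_priority = lookup(rows, "logs", "schedulerPriority")    # from the * row
metrics_timeout = lookup(rows, "metrics", "timeoutMs")       # from the * row
```

Here Logs queries pick up the longer timeout from their own row and inherit the shared priority from the wildcard row, while Metrics queries take both values from the wildcard row.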
Resolution Precedence
When a request arrives, Kloudfuse walks the following sources in order and picks the first value it finds for each of schedulerPriority and timeoutMs:
- If the request is in the Scheduled class: the cluster-wide Scheduled class default for the request's stream (with * wildcard fallback). If no Scheduled row is configured, the query gets High priority and the server-default timeout, so the platform fast-paths its own work.
- Otherwise, the entity's own per-entity row for that class and stream (with * wildcard fallback). For users, this is whatever you set on the User detail page.
- For users, the union of every team the user belongs to. When multiple teams supply values, Kloudfuse picks the highest priority and the lowest timeout: the most permissive priority and the strictest timeout win.
- The cluster-wide class default for the request's stream (with * wildcard fallback), set on the Global query priorities page.
The first match in this chain wins. The Source column on the Effective display in the UI tells you which step provided the resolved value (for example, class default (*), user override (apm), team "oncall" (logs)).
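The precedence chain can be sketched as a function that resolves one option at a time. This is a minimal illustration of the rules described above, not Kloudfuse's implementation: the function names, argument shapes, and the use of None to mean "no value configured" (or "server-default timeout") are all assumptions.

```python
WILDCARD = "*"
PRIORITY_RANK = {"Low": 0, "Medium": 1, "High": 2}


def lookup(rows, stream, option):
    """Return the option from the stream's row, else from the wildcard row."""
    for key in (stream, WILDCARD):
        value = rows.get(key, {}).get(option)
        if value is not None:
            return value
    return None


def merge_teams(team_rows, stream, option):
    """Combine values from all teams: highest priority, lowest timeout."""
    values = [lookup(rows, stream, option) for rows in team_rows]
    values = [v for v in values if v is not None]
    if not values:
        return None
    if option == "schedulerPriority":
        return max(values, key=PRIORITY_RANK.__getitem__)
    return min(values)


def resolve(option, request_class, stream, entity_rows, team_rows, class_defaults):
    """Resolve one option; class_defaults holds the rows for request_class."""
    if request_class == "scheduled":
        value = lookup(class_defaults, stream, option)
        if value is not None:
            return value
        # No Scheduled row: High priority; None stands for the server-default timeout.
        return "High" if option == "schedulerPriority" else None
    for candidate in (
        lookup(entity_rows, stream, option),     # the entity's own row
        merge_teams(team_rows, stream, option),  # union of the user's teams
        lookup(class_defaults, stream, option),  # cluster-wide class default
    ):
        if candidate is not None:
            return candidate
    return None  # nothing configured anywhere in the chain
```

Because each option is resolved independently in this sketch, a single request can take its timeout from a user override and its priority from a team, which is consistent with the per-value Source column described above.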
When to Configure What
| Goal | Where to configure |
|---|---|
| Set safe baselines for every user and service account in the cluster | Global query priorities: Interactive and Machine class defaults. |
| Cap how long alert / scheduled-view queries may run | Global query priorities: Scheduled class default. |
| Give a specific service account more headroom than the cluster default | Service Account detail page: Machine class. |
| Give an oncall team faster log queries than other teams | Team detail page: Interactive class, Logs per-stream override. |
| Throttle a single user who is overwhelming the cluster | User detail page: lower the priority or timeout for the relevant stream. |