Metrics rollup processes
The following diagram illustrates how Kloudfuse handles metrics, from live stream processing to queries from dashboards and alerts. It describes the stages of ingestion, processing, calculation, storage, and query processing.
Metrics ingestion, processing, and storage
Refer to the upper part of the Metrics ingestion, processing, calculation, storage, and queries diagram that illustrates the rollup workflow. The numbers in light blue circles correspond to these steps:
- Kloudfuse gets time series data from your environment, either through agents or from cloud sources.
- The Ingester Service pre-processes the data stream and routes it to Kafka as kf_metrics_topic.
- Kafka handles the same data stream in two parallel processes:
  - Raw metrics: Kafka forwards kf_metrics_topic directly to Pinot.
  - Rollup metrics: Kafka uses kf_metrics_topic to extract rollup metrics:
    - It sends kf_metrics_topic to the Metrics Transformer.
    - The Metrics Transformer creates kf_metrics_rollup_topic to calculate aggregations and markers for each configured rollup resolution (by default: 5 min, 10 min, 30 min, 1 hour, and 4 hours), and sends it back to Kafka.
    - Kafka forwards kf_metrics_rollup_topic to Pinot.
- Pinot handles the topics in the following manner:
  - Raw metrics: The Metrics Decoder receives kf_metrics_topic, performs necessary calculations, and writes it to the table kf_metrics. The table columns are: name (of metric), timestamp, labels, value, and le.
  - Rollup metrics: The Metrics Rollup Decoder receives kf_metrics_rollup_topic, performs necessary calculations and aggregations, and writes it to the table kf_metrics_rollup. The table columns are: name (of metric), timestamp, labels, sum, count, min, max, counter, first, first_ts, and le.
    Kloudfuse calculates the aggregations sum, count, min, and max over the raw values, that is, the values stored in the kf_metrics table.
    Kloudfuse uses counter (the last counter value, which accounts for resets within the rollup window), first (the first value encountered in the bucket), and first_ts (the timestamp of first) to ensure data integrity.
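The rollup columns described above can be illustrated with a minimal Python sketch that aggregates the samples of a single series into fixed-size buckets. This is not Kloudfuse's implementation: the function name `rollup`, the `RollupRow` class, and in particular the reset rule for `counter` (a drop in value is treated as a restart from zero) are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Default rollup resolutions named in the text, in seconds.
ROLLUP_RESOLUTIONS = [300, 600, 1800, 3600, 14400]


@dataclass
class RollupRow:
    """One row of a hypothetical rollup table for a single series."""
    bucket_ts: int   # start of the rollup window
    sum: float
    count: int
    min: float
    max: float
    counter: float   # last counter value, adjusted for resets in the window
    first: float     # first value encountered in the bucket
    first_ts: int    # timestamp of the first value


def rollup(samples, resolution):
    """Aggregate (timestamp, value) samples into one RollupRow per bucket."""
    rows = {}
    last_raw = {}  # last raw value seen in each bucket, used to detect resets
    for ts, value in sorted(samples):
        bucket = ts - ts % resolution
        row = rows.get(bucket)
        if row is None:
            rows[bucket] = RollupRow(
                bucket_ts=bucket, sum=value, count=1, min=value, max=value,
                counter=value, first=value, first_ts=ts,
            )
        else:
            row.sum += value
            row.count += 1
            row.min = min(row.min, value)
            row.max = max(row.max, value)
            # Assumed reset rule: a drop means the counter restarted from
            # zero, so the whole new value counts as an increase.
            prev = last_raw[bucket]
            row.counter += value if value < prev else value - prev
        last_raw[bucket] = value
    return [rows[b] for b in sorted(rows)]
```

For example, `rollup([(0, 10.0), (60, 15.0), (120, 3.0), (299, 5.0), (300, 7.0)], 300)` produces two rows. The first covers the window starting at 0, where the drop from 15 to 3 is treated as a counter reset. The second covers the window starting at 300.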
Metrics queries
Refer to the lower part of the diagram that illustrates the query workflow.
The numbers in the dark blue circles correspond to these steps:
- Kloudfuse gets a query request from a user interface. This may be triggered by starting the Metrics interface, loading dashboards, changing and reloading dashboards and reports, changing the time picker values, and so on.
- The Query Service selects the most appropriate rollup resolution based on the query’s step size and time range. For queries with a small step size or short time range, it reads from the raw table. For longer time ranges, it selects a coarser rollup resolution to reduce the amount of data scanned.
- The Query Service receives results from:
  - Raw metrics: Table kf_metrics
  - Rollup metrics: Table kf_metrics_rollup
  If the query spans a time range where only part of the data has rollup coverage, the Query Service automatically splits the query, using the rollup table for the newer portion and the raw table for the older portion.
- The Query Service combines the results and returns them to the requesting UI.
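The resolution selection and query splitting described in the steps above can be sketched in Python as follows. The actual selection rules of the Query Service are not documented here, so everything in this sketch is a hypothetical illustration: the function names `select_resolution` and `split_query`, the `short_range` threshold, the rule "coarsest resolution that does not exceed the step size", and the `rollup_from` parameter (the earliest timestamp with rollup coverage).

```python
# Default rollup resolutions named in the text, in seconds.
ROLLUP_RESOLUTIONS = [300, 600, 1800, 3600, 14400]

RAW_TABLE = "kf_metrics"
ROLLUP_TABLE = "kf_metrics_rollup"


def select_resolution(start, end, step, resolutions=ROLLUP_RESOLUTIONS,
                      short_range=3600):
    """Return a rollup resolution in seconds, or None to read raw data.

    Assumed heuristic: short time ranges and small steps use the raw
    table; otherwise pick the coarsest resolution that still fits
    within one query step.
    """
    if end - start <= short_range:
        return None
    candidates = [r for r in resolutions if r <= step]
    return max(candidates) if candidates else None


def split_query(start, end, rollup_from):
    """Split [start, end) into (table, start, end) sub-queries.

    The portion at or after rollup_from reads the rollup table; the
    older portion, which has no rollup coverage, reads the raw table.
    """
    if rollup_from <= start:
        return [(ROLLUP_TABLE, start, end)]
    if rollup_from >= end:
        return [(RAW_TABLE, start, end)]
    return [(RAW_TABLE, start, rollup_from), (ROLLUP_TABLE, rollup_from, end)]
```

Under these assumptions, a one-day query with a 15-minute step (`select_resolution(0, 86400, 900)`) selects the 10-minute rollup, while the same query with a 60-second step falls back to the raw table. The sub-query results would then be combined before being returned to the UI.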