Metrics rollup processes

The following diagram illustrates how Kloudfuse handles metrics, from live stream processing to queries from dashboards and alerts. It covers the stages of ingestion, processing, calculation, storage, and query processing.

Figure: Metrics ingestion, processing, calculation, storage, and queries (metrics rollup architecture diagram)

Metrics ingestion, processing, and storage

Refer to the upper part of the diagram, which illustrates the rollup workflow. The numbers in the light blue circles correspond to these steps:

  1. Kloudfuse gets time series data from your environment, either through agents or from cloud sources.

  2. The Ingester Service pre-processes the data stream and routes it to Kafka as the topic kf_metrics_topic.

  3. Kafka handles the same data stream in two parallel processes:

    Kafka forwards the kf_metrics_topic directly to Pinot.

  4. Pinot handles the topics in the following manner:

    The Metrics Decoder receives kf_metrics_topic, performs the necessary calculations, and writes the results to the kf_metrics table.

    The table columns are: name (the metric name), timestamp, labels, value, and le.
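As an illustration of the row shape described above, the following is a minimal sketch of a kf_metrics row in Python. Only the column names come from this page. The field types, the timestamp unit, the reading of le as a histogram bucket bound (as in Prometheus-style histograms), and the `decode` helper are assumptions made for illustration, not the actual Metrics Decoder.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MetricRow:
    """One row of kf_metrics, using the column names listed above."""
    name: str                                  # metric name
    timestamp: int                             # assumed unit: epoch milliseconds
    labels: Dict[str, str] = field(default_factory=dict)
    value: float = 0.0
    le: Optional[str] = None                   # assumed: histogram bucket bound, None otherwise


def decode(record: dict) -> MetricRow:
    """Hypothetical helper: map one decoded kf_metrics_topic record to a row."""
    return MetricRow(
        name=record["name"],
        timestamp=int(record["timestamp"]),
        labels=dict(record.get("labels", {})),
        value=float(record["value"]),
        le=record.get("le"),
    )


row = decode({
    "name": "http_requests_total",
    "timestamp": "1700000000000",
    "labels": {"job": "api"},
    "value": "3",
})
```

Keeping le as an optional field means that non-histogram metrics can share the same row shape without a placeholder bucket value.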

Metrics queries

Refer to the lower part of the diagram that illustrates the query workflow.

The numbers in the dark blue circles correspond to these steps:

  1. Kloudfuse gets a query request from a user interface.

    This may be triggered by starting the Metrics interface, loading dashboards, changing and reloading dashboards and reports, changing the time picker values, and so on.

  2. The Query Service selects the most appropriate rollup resolution based on the query’s step size and time range. For queries with a small step size or short time range, it reads from the raw table. For longer time ranges, it selects a coarser rollup resolution to reduce the amount of data scanned.

  3. The Query Service receives results from:

    Raw metrics: table kf_metrics

    Rollup metrics: table kf_metrics_rollup

    If the query spans a time range where only part of the data has rollup coverage, the Query Service automatically splits the query: it uses the rollup table for the newer portion and the raw table for the older portion.

  4. The Query Service combines the results and returns them to the requesting UI.
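
The resolution selection and query splitting in steps 2 and 3 can be sketched as follows. This is a simplified illustration under stated assumptions, not the Query Service implementation: the rollup resolutions, the short-range threshold, the selection heuristic, and the function names `pick_resolution` and `split_query` are all hypothetical. Only the table names and the overall behavior (raw for small steps or short ranges, rollup for the newer portion of a partially covered range) come from this page.

```python
from typing import List, Optional, Tuple

# Hypothetical rollup resolutions, in seconds; real values depend on the deployment.
ROLLUP_RESOLUTIONS = [300, 3600]
RAW_TABLE = "kf_metrics"
ROLLUP_TABLE = "kf_metrics_rollup"


def pick_resolution(step_s: int, range_s: int, short_range_s: int = 3600) -> Optional[int]:
    """Return a rollup resolution in seconds, or None to read the raw table.

    Assumed heuristic: short ranges always read raw data; otherwise use the
    coarsest rollup that is no coarser than the query step.
    """
    if range_s <= short_range_s:
        return None
    candidates = [r for r in ROLLUP_RESOLUTIONS if r <= step_s]
    return max(candidates) if candidates else None


def split_query(start_s: int, end_s: int,
                rollup_from_s: Optional[int]) -> List[Tuple[str, int, int]]:
    """Split [start_s, end_s) at the point where rollup coverage begins.

    The older portion reads the raw table; the newer portion reads the rollup table.
    """
    if rollup_from_s is None or rollup_from_s >= end_s:
        return [(RAW_TABLE, start_s, end_s)]       # no rollup coverage in range
    if rollup_from_s <= start_s:
        return [(ROLLUP_TABLE, start_s, end_s)]    # fully covered by rollups
    return [
        (RAW_TABLE, start_s, rollup_from_s),
        (ROLLUP_TABLE, rollup_from_s, end_s),
    ]


# A one-day query at a 10-minute step, with rollups available for the last 12 hours.
resolution = pick_resolution(step_s=600, range_s=86400)
parts = split_query(start_s=0, end_s=86400, rollup_from_s=43200)
```

In this sketch the two sub-queries returned by `split_query` would be run separately and their results concatenated in time order, which corresponds to the combining step described in step 4.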