Metrics roll up
Kloudfuse now supports roll up of metrics data, computed directly from the data stream.
To improve query performance and reduce loading times, Kloudfuse aggregates metrics data over regular intervals, directly from the data stream. Depending on the time span of the query, Kloudfuse calculates results either from raw data or from rolled up data. For shorter time spans, Kloudfuse continues to use raw metrics, because aggregation can smooth out the data and hide important signals, such as outliers.
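The smoothing effect mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical stream sampled every 15 seconds and a 5-minute roll up window; the `roll_up` helper and the choice of average and maximum as aggregates are illustrative assumptions, not a description of how Kloudfuse aggregates data.

```python
from statistics import mean

def roll_up(samples, window):
    """Aggregate consecutive raw samples into fixed-size windows.

    Hypothetical helper for illustration only; the aggregates chosen here
    (average and maximum) are assumptions, not Kloudfuse's implementation.
    """
    chunks = (samples[i:i + window] for i in range(0, len(samples), window))
    return [{"avg": mean(chunk), "max": max(chunk)} for chunk in chunks]

# 5 minutes of made-up raw samples at 15-second intervals (20 points).
raw = [10.0] * 20
raw[7] = 100.0  # a single outlier in the raw stream

rolled = roll_up(raw, window=20)

# The average barely moves, so an average-only roll up would hide the spike.
print(rolled[0]["avg"])  # 14.5
# Keeping the maximum alongside the average preserves the outlier.
print(rolled[0]["max"])  # 100.0
```

In this sketch, a chart drawn from the averaged value shows 14.5 where the raw data peaked at 100, which is why raw data remains preferable for short time spans.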
- To configure metric rollup, see Configure metrics rollup.
- To learn how Kloudfuse creates and uses rolled up metrics, see Metrics roll up processes.
- For an in-depth look at the various components of our implementation, see Elements of metrics roll up.
For a more general discussion of this feature, read the following sections:
Benefits
The primary benefit of this approach is reduced I/O cost, because Kloudfuse samples aggregated metrics instead of raw values. These pre-calculated aggregates improve query performance, and quicker calculation means that dashboards and graphs load faster. Additionally, it is relatively inexpensive to increase retention times for these aggregated metrics.
Consider the number of metrics that your system processes regularly. The following image is a plot of select monitored metrics as they appear in the Kloudfuse plane:
When the raw data stream has intervals of 15 or 30 seconds, compare the number of records that each query processes with the number of records it processes when using pre-aggregated metrics data at 5-minute and 10-minute intervals. With a raw interval of 15 or 30 seconds, using rolled up (pre-aggregated) metrics at 5 minutes reduces the volume of data to retrieve by a factor of 20 or 10, respectively. With a roll up interval of 10 minutes, the reduction is a factor of 40 or 20, respectively.
| Query duration | Number of stored records                                                                              |
|                | 1 metric                              | 200 metrics                                   |
|                | Raw data          | Rolled up data    | Raw data              | Rolled up data        |
|                | 15s     | 30s     | 5 min   | 10 min  | 15s       | 30s       | 5 min     | 10 min    |
|----------------|---------|---------|---------|---------|-----------|-----------|-----------|-----------|
| 6h             | 1,440   | 720     | 120     | 60      | 288 K     | 144 K     | 24 K      | 12 K      |
| 2d             | 11,520  | 5,760   | 960     | 480     | 2,304 K   | 1,152 K   | 192 K     | 96 K      |
| 7d             | 40,320  | 20,160  | 3,360   | 1,680   | 8,064 K   | 4,032 K   | 672 K     | 336 K     |
| 2w             | 80,640  | 40,320  | 6,720   | 3,360   | 16,128 K  | 8,064 K   | 1,344 K   | 672 K     |
| 1mo=30d        | 172,800 | 86,400  | 14,400  | 7,200   | 34,560 K  | 17,280 K  | 2,880 K   | 1,440 K   |
| 1y=365.25d     | 2,104 K | 1,052 K | 175 K   | 88 K    | 420,768 K | 210,384 K | 35,064 K  | 17,532 K  |
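The reduction factors of 20, 10, 40, and 20 quoted in the text are the ratio of the roll up interval to the raw sampling interval. The following sketch reproduces that arithmetic; the function name `reduction_factor` is a hypothetical helper introduced for illustration and is not part of any Kloudfuse API.

```python
def reduction_factor(raw_interval_s, rollup_interval_s):
    """Return how many raw samples one rolled up record replaces.

    Hypothetical helper: assumes the roll up interval is a whole
    multiple of the raw sampling interval.
    """
    if rollup_interval_s % raw_interval_s != 0:
        raise ValueError("roll up interval must be a multiple of the raw interval")
    return rollup_interval_s // raw_interval_s

# Raw intervals of 15 s and 30 s against roll up intervals of 5 and 10 minutes.
factors = {
    (raw_s, rollup_min): reduction_factor(raw_s, rollup_min * 60)
    for rollup_min in (5, 10)
    for raw_s in (15, 30)
}

for (raw_s, rollup_min), factor in factors.items():
    print(f"raw {raw_s}s, roll up {rollup_min} min: factor {factor}")
```

The same ratio applies regardless of the number of metrics or the query duration, because both the raw and the rolled up record counts scale linearly with each.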