APM Service Identification and Metrics
Kloudfuse provides a configurable Service Identity tool that logically separates services based on multiple configurable labels. For example, the platform distinguishes “Service A” that runs in the production environment from “Service A” that runs in the staging environment when the environment label is included in the Service Identity Labels configuration.
The default configuration identifies services through multiple labels: availability_zone, cloud_account_id, kube_cluster_name, kube_namespace, project, region, kf_platform, and service_name.
Plan the Service Identity configuration carefully, because it defines the granularity at which the Kloudfuse system tracks service-level metrics.
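The effect of the label set on granularity can be illustrated with a minimal sketch. This is not Kloudfuse's implementation: the `service_identity` helper, the label key `environment`, the sample label values, and the choice to treat a missing label as an empty string are all assumptions made for illustration.

```python
from typing import Mapping, Sequence, Tuple

# Default identity labels as listed in this section.
DEFAULT_IDENTITY_LABELS = (
    "availability_zone", "cloud_account_id", "kube_cluster_name",
    "kube_namespace", "project", "region", "kf_platform", "service_name",
)


def service_identity(
    span_labels: Mapping[str, str], identity_labels: Sequence[str]
) -> Tuple[Tuple[str, str], ...]:
    """Hypothetical helper: build a hashable identity from the configured labels.

    Labels outside `identity_labels` are ignored; a missing label is treated
    as an empty string (an assumption, not documented behavior).
    """
    return tuple((name, span_labels.get(name, "")) for name in identity_labels)


# Two deployments of the same service that differ only in environment.
prod = {"service_name": "service-a", "region": "us-west-2", "environment": "production"}
staging = {"service_name": "service-a", "region": "us-west-2", "environment": "staging"}

# With the default labels, both deployments collapse into one identity.
same_by_default = (
    service_identity(prod, DEFAULT_IDENTITY_LABELS)
    == service_identity(staging, DEFAULT_IDENTITY_LABELS)
)

# Adding the environment label separates them into two identities.
labels_with_env = DEFAULT_IDENTITY_LABELS + ("environment",)
distinct_with_env = (
    service_identity(prod, labels_with_env)
    != service_identity(staging, labels_with_env)
)
```

Every label added to the identity multiplies the number of distinct services that metrics are tracked for, which is why the label set is worth settling before data starts flowing.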
APM Service Metrics
The Kloudfuse APM solution offers features derived from span data. Some of these features, such as Service Map and Service List, rely on internally-generated metrics. The following illustration is an overview of the Kloudfuse system pipeline that generates the metrics based on span data.
These metrics are:
- Edge Latency Metrics (edge_latency_*)
These metrics capture RED (rate, errors, duration) data for the Parent → Child edges, based on span data. The edge metrics have a fixed set of labels. The Kloudfuse team determines what these labels are, and may change them in future releases.
edge_latency_count - Represents the cumulative count of edges received (including dangling edges where either the parent or the child is missing). This metric can be used to construct Request-per-second and related queries.
edge_latency_sum - Represents the cumulative sum of the duration/latency data received for the processed edges. This metric, combined with edge_latency_count, can be used to construct Average-Duration-per-second and related queries.
edge_latency_bucket - Used to calculate the P-* duration/latency, such as P99, from the histogram buckets for specified time periods.
edge_latency_min - Used to calculate the minimum value of the histogram bucket for specified time periods.
edge_latency_max - Used to calculate the maximum value of the histogram bucket for specified time periods.
- Node Latency Metrics (service_latency_*)
Similar to the edge latency metrics, the node latency metrics are derived from individual server spans. These metrics are emitted without established parent-child relationships in the span data. The node metrics are also emitted at the cardinality of services; Kloudfuse performs the aggregation at the level of the Service Identity, not at a more granular level such as span name.
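The way cumulative metrics such as edge_latency_count and edge_latency_sum turn into per-second and average values can be sketched as follows. This is an illustrative calculation only, not a Kloudfuse query or API: the `Sample` type, the function names, and the sample values are hypothetical, the duration unit is whatever the metric is emitted in, and counter resets are not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """One scrape of the cumulative metrics for a single edge (hypothetical type)."""
    timestamp_s: float  # sample time, in seconds
    count: float        # cumulative edge_latency_count
    total: float        # cumulative edge_latency_sum


def requests_per_second(older: Sample, newer: Sample) -> float:
    """Increase in the cumulative count divided by the elapsed time."""
    elapsed = newer.timestamp_s - older.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (newer.count - older.count) / elapsed


def average_duration(older: Sample, newer: Sample) -> Optional[float]:
    """Increase in the cumulative sum divided by the increase in the count.

    Returns None when no new edges arrived in the interval.
    """
    delta_count = newer.count - older.count
    if delta_count <= 0:
        return None
    return (newer.total - older.total) / delta_count


# Two samples taken 60 seconds apart (made-up values).
older = Sample(timestamp_s=0.0, count=1000.0, total=250.0)
newer = Sample(timestamp_s=60.0, count=1600.0, total=430.0)

rps = requests_per_second(older, newer)  # (1600 - 1000) / 60 = 10.0
avg = average_duration(older, newer)     # (430 - 250) / (1600 - 1000) = 0.3
```

Because both metrics are cumulative, a single sample says little on its own; the difference between two samples over a known interval is what yields a rate or an average.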