Continuous Profiling with Kloudfuse Profiler Server
Continuous Profiling is a powerful addition to the Kloudfuse observability platform. While traditional monitoring methods (metrics, logs, and tracing) provide valuable insights, they leave gaps in understanding application performance at a granular level. Continuous Profiling closes those gaps by offering in-depth, line-level insights into the application’s code, exposing the precise details of resource utilization.
This low-overhead feature gathers profiles from production systems and stores them for subsequent analysis. It provides a comprehensive view of the application and its behavior in production, including CPU usage, memory allocation, and disk I/O, helping you verify that every line of code operates efficiently.
Benefits of Continuous Profiling
- Granular Insights: Continuous Profiling offers a detailed view of application performance that goes beyond traditional observability tools, providing line-level insights into resource utilization.
- In-Depth Code Analysis: With a comprehensive understanding of code performance and system interactions, developers can easily identify how specific code segments use resources, facilitating thorough analysis and optimization.
- Optimization Opportunities: By pinpointing inefficient lines of code, Continuous Profiling helps address performance bottlenecks and improve resource utilization across all applications.
- Effective Capacity Planning: The profiling data supports informed capacity planning and scaling efforts, ensuring that your application can meet growing demands while maintaining optimal performance.
- Cost Reduction: By identifying spikes in CPU and memory usage, Continuous Profiling helps you optimize those areas to lower costs.
Configuration
Enable kfuse-profiling in the custom-values.yaml file:

```yaml
global:
  kfuse-profiling:
    enabled: true
```
By default, Kloudfuse saves profiling data in a PVC with a size of 50 GB.
Long-Term Retention
To retain profiling data for a longer time, change the storage configuration. Depending on your storage provider, configure one of the following options in the custom-values.yaml file: AWS S3 or a GCP bucket. Note that profiling data is stored in the Parquet format.
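For illustration only, an S3-backed setup in custom-values.yaml might look like the sketch below. The keys under kfuse-profiling (storage, type, bucket, region) are hypothetical placeholders, not the authoritative Kloudfuse schema; take the exact key names from the Kloudfuse values reference for your release.

```yaml
# Hypothetical sketch: key names below are placeholders, not the
# authoritative Kloudfuse schema.
global:
  kfuse-profiling:
    enabled: true
    storage:                  # assumed key
      type: s3                # assumed key; use gcs for a GCP bucket
      bucket: kfuse-profiles  # assumed key; holds the parquet files
      region: us-east-1       # assumed key
```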
Scrape the Profiling Data
Alloy queries the pprof endpoints of your Golang application, collects the profiles, and forwards them to the Kfuse Profiler server.
Prerequisites
- Ensure that your Golang application exposes pprof endpoints.
- In pull mode, the Alloy collector periodically retrieves profiles from Golang applications, specifically targeting the /debug/pprof/* endpoints.
- Set up Go profiling in pull mode to generate profiles. See Grafana documentation on Set up Go profiling in pull mode; a minimal sketch follows this list.
- Set up Java profiling to generate profiles. See Grafana documentation on Java.
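For orientation, here is a minimal sketch, following the pattern in the Grafana pull-mode documentation, of a Go application that exposes the endpoints Alloy scrapes. The port 6060 is an arbitrary choice, and the godeltaprof import is needed only if you enable the godeltaprof_* profile types configured later on this page:

```go
package main

import (
	"log"
	"net/http"

	// Registers the standard /debug/pprof/* handlers on http.DefaultServeMux.
	_ "net/http/pprof"

	// Registers /debug/pprof/delta_heap, delta_block, and delta_mutex,
	// which back the godeltaprof_* profile types. Requires the
	// github.com/grafana/pyroscope-go module in your go.mod.
	_ "github.com/grafana/pyroscope-go/godeltaprof/http/pprof"
)

func main() {
	// Expose the profiling endpoints for Alloy to scrape in pull mode.
	// Port 6060 is an arbitrary example; match it to the port you
	// advertise to the scraper.
	go func() {
		log.Println(http.ListenAndServe(":6060", nil))
	}()

	// Placeholder for the application's real work.
	select {}
}
```

Before wiring up Alloy, you can sanity-check the endpoints locally, for example with go tool pprof http://localhost:6060/debug/pprof/profile?seconds=5.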
Configure Scraping
- Configure the alloy scraper in a new file, alloy-values.yaml. Download a copy of the default alloy-values.yaml file from the Grafana repository, and customize the alloy configMap section.
- Configure the pyroscope.write block in the Alloy configuration file to define the endpoint where Alloy sends profiling data.

Define Endpoint for Sending Profiling Data

```alloy
pyroscope.write "write_job_name" { // (1)
  endpoint {
    url = "https://<KFUSE ENDPOINT/DNS NAME>/profile" // (2)
  }
}
```

1. Change write_job_name to an appropriate name, such as kfuse_profiler_write.
2. Change url to https://<KFUSE ENDPOINT/DNS NAME>/profile, substituting your Kloudfuse endpoint or DNS name.
. -
Configure
pyroscope.scrape
block in the Alloy Configuration file to define the scraping configuration for profiling data.Define the Scraping Configuration for Profiling Datapyroscope.scrape "scrape_job_name" { (1) targets = concat(discovery.relabel.kubernetes_pods.output) (2) forward_to = [pyroscope.write.write_job_name.receiver] (3) profiling_config { (4) profile.process_cpu { (5) enabled = true } profile.godeltaprof_memory { (6) enabled = true } profile.memory { // disable memory, use godeltaprof_memory instead enabled = false } profile.godeltaprof_mutex { (7) enabled = true } profile.mutex { // disable mutex, use godeltaprof_mutex instead enabled = false } profile.godeltaprof_block { (8) enabled = true } profile.block { // disable block, use godeltaprof_block instead enabled = false } profile.goroutine { enabled = true (9) } } }
1 Change scrape_job_name
to an appropriate name, likekfuse_profiler_scrape
.2 Use discovery.relabel.kubernetes_pods.output
as a target forpyroscope.scrape
block to discover Kubernetes targets. Follow Grafana documentation to learn how to set up specific regex rules Discover Kubernetes targets.3 forward_to
: Connects the scrape job to the write job.4 profiling_config
: Enables or disables specific profiles.5 profile.process_cpu
: Enables CPU profiling.6 profile.godeltaprof_memory
: Enables delta memory profiling.7 profile.godeltaprof_mutex
: Enables delta mutex profiling.8 profile.godeltaprof_block
: Enablesdelta block
profiling.9 profile.goroutine
: Enablesgoroutine
profiling. -
- Configure the rest of the fields in the Alloy configuration file. Note that the discovery.relabel rules below keep only pods that opt in through pyroscope.io annotations; an example pod spec follows the callout list.

Alloy Configuration

```yaml
alloy:
  configMap:
    create: true # (1)
    content: |- # (2)
      logging { // (3)
        level  = "info"
        format = "logfmt"
      }

      discovery.kubernetes "pyroscope_kubernetes" {
        role = "pod"
      }

      discovery.relabel "kubernetes_pods" {
        targets = concat(discovery.kubernetes.pyroscope_kubernetes.targets)

        rule {
          action        = "drop"
          source_labels = ["__meta_kubernetes_pod_phase"]
          regex         = "Pending|Succeeded|Failed|Completed"
        }

        rule {
          action = "labelmap"
          regex  = "__meta_kubernetes_pod_label_(.+)"
        }

        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_namespace"]
          target_label  = "kubernetes_namespace"
        }

        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_name"]
          target_label  = "kubernetes_pod_name"
        }

        rule {
          action        = "keep"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_scrape"]
          regex         = "true"
        }

        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_application_name"]
          target_label  = "service_name"
        }

        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_spy_name"]
          target_label  = "__spy_name__"
        }

        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_scheme"]
          regex         = "(https?)"
          target_label  = "__scheme__"
        }

        rule {
          action        = "replace"
          source_labels = ["__address__", "__meta_kubernetes_pod_annotation_pyroscope_io_port"]
          regex         = "(.+?)(?::\\d+)?;(\\d+)"
          replacement   = "$1:$2"
          target_label  = "__address__"
        }

        rule {
          action      = "labelmap"
          regex       = "__meta_kubernetes_pod_annotation_pyroscope_io_profile_(.+)"
          replacement = "__profile_$1"
        }
      }

      pyroscope.scrape "pyroscope_scrape" { // (4)
        clustering {
          enabled = true
        }

        targets    = concat(discovery.relabel.kubernetes_pods.output)
        forward_to = [pyroscope.write.pyroscope_write.receiver]

        profiling_config {
          profile.memory {
            enabled = true
          }
          profile.process_cpu {
            enabled = true
          }
          profile.goroutine {
            enabled = true
          }
          profile.block {
            enabled = false
          }
          profile.mutex {
            enabled = false
          }
          profile.fgprof {
            enabled = false
          }
        }
      }

      pyroscope.write "pyroscope_write" {
        endpoint {
          url = "https://<KFUSE ENDPOINT/DNS NAME>/profile"
        }
      }
    name: null # (5)
    key: null # (6)
  clustering:
    enabled: false # (7)
    name: "" # (8)
    portName: http # (9)
  stabilityLevel: "generally-available" # (10)
  storagePath: /tmp/alloy # (11)
  listenAddr: 0.0.0.0 # (12)
  listenPort: 12345 # (13)
  listenScheme: HTTP # (14)
  uiPathPrefix: / # (15)
  enableReporting: true # (16)
  extraEnv: [] # (17)
  envFrom: [] # (18)
  extraArgs: [] # (19)
  extraPorts: [] # (20)
  # - name: "faro"
  #   port: 12347
  #   targetPort: 12347
  #   protocol: "TCP"
  #   appProtocol: "h2c"
  mounts:
    varlog: false # (21)
    dockercontainers: false
    extra: [] # (22)
  securityContext: {} # (23)
  resources: {} # (24)
```

1. create: Creates a new ConfigMap for the config file.
2. content: Content to assign to the new ConfigMap. This is passed into tpl, allowing for templating from values.
3. Start of the Alloy configuration content.
4. pyroscope.scrape: Specifies the target sources for scraping profiling data. See the pyroscope.scrape block described in the previous step.
5. name: Name of an existing ConfigMap to use. Used when create is false.
6. key: Key in the ConfigMap to get the config from.
7. clustering.enabled: Deploys Alloy in a cluster, to allow for load distribution.
8. clustering.name: Name for the Alloy cluster. Used for differentiating between clusters.
9. portName: Name of the port for clustering; useful if running inside an Istio mesh.
10. stabilityLevel: Minimum stability level of components and behavior to enable. Must be one of experimental, public-preview, or generally-available.
11. storagePath: Path where Grafana Alloy stores data, such as the Write-Ahead Log. By default, data is lost between reboots.
12. listenAddr: Address to listen for traffic on. A value of 0.0.0.0 exposes the UI to other containers.
13. listenPort: Port to listen for traffic on.
14. listenScheme: Scheme for readiness probes. If enabling tls in your configs, set this to "HTTPS".
15. uiPathPrefix: Base path where the UI is exposed.
16. enableReporting: Enables sending Grafana Labs anonymous usage statistics to help improve Grafana Alloy.
17. extraEnv: Extra environment variables to pass to the Alloy container.
18. envFrom: Maps all keys on a ConfigMap or Secret as environment variables. See the Kubernetes documentation for EnvFromSource v1 core.
19. extraArgs: Extra arguments to pass to alloy run. See the Grafana documentation for The run command.
20. extraPorts: Extra ports to expose on the Alloy container.
21. varlog: Mounts /var/log from the host into the container to collect logs.
22. extra: Extra volume mounts to add to the Grafana Alloy container. Does not affect the watch container.
23. securityContext: Security context to apply to the Grafana Alloy container.
24. resources: Resource requests and limits to apply to the Grafana Alloy container.
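Given the relabel rules above, Alloy scrapes only pods that opt in through pyroscope.io annotations. Below is a minimal sketch of pod metadata that satisfies those rules; the pod name, application name, image, and port are hypothetical examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-go-service                             # hypothetical pod name
  annotations:
    pyroscope.io/scrape: "true"                   # required by the "keep" rule above
    pyroscope.io/application_name: my-go-service  # becomes the service_name label
    pyroscope.io/port: "6060"                     # port where the pprof endpoints listen
spec:
  containers:
    - name: app                                   # hypothetical container name
      image: registry.example.com/my-go-service:latest  # hypothetical image
      ports:
        - containerPort: 6060
```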
- Refer to the Grafana documentation on Deploy Grafana Alloy on Kubernetes to correctly install alloy in the namespace where you want to scrape the data; a sketch of the usual commands follows this step.
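For reference, a minimal sketch of the standard installation flow with the Grafana Helm chart; the release name alloy and the <namespace> placeholder match the upgrade command in the next step:

```sh
# Add the Grafana chart repository (once), then install Alloy into the
# namespace whose pods you want to profile.
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install alloy grafana/alloy --namespace <namespace>
```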
- Update alloy using the alloy-values.yaml file you set up. Use the same namespace as in the previous step, where you installed alloy. You can then verify the rollout as shown after this step.

```sh
helm upgrade --namespace <namespace> alloy grafana/alloy -f <path/to/alloy-values.yaml>
```