Continuous Profiling with Kloudfuse Profiler Server
Continuous Profiling is a powerful addition to the Kloudfuse observability platform. While traditional monitoring methods, such as metrics, logs, and tracing, provide valuable insights, they leave gaps in understanding application performance at a granular level. Continuous Profiling closes these gaps by offering in-depth, line-level insight into your application's code, exposing the precise details of resource utilization.
This low-overhead feature gathers profiles from production systems and stores them for subsequent analysis. It provides a comprehensive view of the application and its behavior in production, including CPU usage, memory allocation, and disk I/O, helping you verify that every line of code operates efficiently.
Benefits of Continuous Profiling
- Granular Insights: Continuous Profiling offers a detailed view of application performance that goes beyond traditional observability tools, providing line-level insights into resource utilization.
- In-Depth Code Analysis: With a comprehensive understanding of code performance and system interactions, developers can easily identify how specific code segments use resources, facilitating thorough analysis and optimization.
- Optimization Opportunities: By pinpointing inefficient lines of code, Continuous Profiling helps address performance bottlenecks and improve resource utilization across all applications.
- Effective Capacity Planning: The profiling data supports informed capacity planning and scaling efforts, ensuring that your application can meet growing demand while maintaining optimal performance.
- Cost Reduction: By identifying spikes in CPU and memory usage, Continuous Profiling aids in optimizing these areas to lower costs.
Configuration
Enable kfuse-profiling in the custom-values.yaml file.
global:
  kfuse-profiling:
    enabled: true
By default, Kloudfuse saves the profiling data in a PersistentVolumeClaim (PVC) with a size of 50 GB.
Long-Term Retention
To retain profiling data for a longer period, change the configuration settings. Depending on your storage provider, configure one of the following options in the custom-values.yaml file: AWS S3 or GCP bucket. Note that profiling data is stored in the Parquet format.
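As an illustration only, an S3-backed configuration in custom-values.yaml might look like the following sketch. The storage keys under kfuse-profiling are hypothetical placeholders, not the authoritative Kloudfuse schema; use the option names from the Kloudfuse storage documentation for your provider.

  global:
    kfuse-profiling:
      enabled: true
      # Hypothetical storage block for illustration; the actual key names
      # may differ in your Kloudfuse chart version.
      storage:
        backend: s3                # or a GCP bucket backend for GCS
        s3:
          bucket: kfuse-profiles   # destination bucket for Parquet profile data
          region: us-east-1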
Scrape the Profiling Data
Alloy queries the pprof endpoints of your Golang application, collects the profiles, and forwards them to the Kfuse Profiler server.
Prerequisites
- Ensure your Golang application exposes pprof endpoints (a minimal example appears after this list).
- In pull mode, the Alloy collector periodically retrieves profiles from Golang applications, specifically targeting the /debug/pprof/* endpoints.
- Set up Go profiling in pull mode to generate profiles. See the Grafana documentation on Set up Go profiling in pull mode.
- Set up Java profiling to generate profiles. See the Grafana documentation on Java.
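For a Go service, exposing these endpoints can be as simple as the following sketch. The net/http/pprof import comes from the standard library; the delta-profile import path (github.com/grafana/pyroscope-go/godeltaprof/http/pprof) is Grafana's godeltaprof package, which backs the godeltaprof_* profiles enabled in the scrape configuration below. The :6060 listen address is a placeholder; adapt it to your service.

  Expose pprof Endpoints in a Go Application

  package main

  import (
  	"net/http"
  	"runtime"

  	// Importing net/http/pprof registers the /debug/pprof/* handlers
  	// on http.DefaultServeMux.
  	_ "net/http/pprof"

  	// Importing godeltaprof/http/pprof registers the delta profile
  	// endpoints (delta_heap, delta_block, delta_mutex) that the
  	// godeltaprof_* profiles in the scrape configuration use.
  	_ "github.com/grafana/pyroscope-go/godeltaprof/http/pprof"
  )

  func main() {
  	// Enable mutex and block profiling; without these calls the
  	// corresponding profiles contain no samples.
  	runtime.SetMutexProfileFraction(5)
  	runtime.SetBlockProfileRate(5)

  	// Serve the profiling endpoints on a dedicated port so Alloy can
  	// pull from /debug/pprof/*, keeping profiling traffic separate
  	// from application traffic.
  	if err := http.ListenAndServe(":6060", nil); err != nil {
  		panic(err)
  	}
  }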
Configure Scraping
- Configure the Alloy scraper in a new file, alloy-values.yaml. Download a copy of the default alloy-values.yaml file from the Grafana repository, and customize the alloy configMap section.
- Configure the pyroscope.write block in the Alloy configuration file to define the endpoint where Alloy sends profiling data.

  Define the Endpoint for Sending Profiling Data

  pyroscope.write "write_job_name" { (1)
    endpoint {
      url = "https://<KFUSE ENDPOINT/DNS NAME>/profile" (2)
    }
  }

  1. Change write_job_name to an appropriate name, like kfuse_profiler_write.
  2. Change url to https://<KFUSE ENDPOINT/DNS NAME>/profile, substituting your Kloudfuse endpoint or DNS name.
- Configure the pyroscope.scrape block in the Alloy configuration file to define the scraping configuration for profiling data.

  Define the Scraping Configuration for Profiling Data

  pyroscope.scrape "scrape_job_name" { (1)
    targets    = concat(discovery.relabel.kubernetes_pods.output) (2)
    forward_to = [pyroscope.write.write_job_name.receiver] (3)
    profiling_config { (4)
      profile.process_cpu { (5)
        enabled = true
      }
      profile.godeltaprof_memory { (6)
        enabled = true
      }
      profile.memory {
        // disable memory, use godeltaprof_memory instead
        enabled = false
      }
      profile.godeltaprof_mutex { (7)
        enabled = true
      }
      profile.mutex {
        // disable mutex, use godeltaprof_mutex instead
        enabled = false
      }
      profile.godeltaprof_block { (8)
        enabled = true
      }
      profile.block {
        // disable block, use godeltaprof_block instead
        enabled = false
      }
      profile.goroutine {
        enabled = true (9)
      }
    }
  }

  1. Change scrape_job_name to an appropriate name, like kfuse_profiler_scrape.
  2. Use discovery.relabel.kubernetes_pods.output as the target for the pyroscope.scrape block to discover Kubernetes targets. See the Grafana documentation on Discover Kubernetes targets to learn how to set up specific regex rules.
  3. forward_to: Connects the scrape job to the write job.
  4. profiling_config: Enables or disables specific profiles.
  5. profile.process_cpu: Enables CPU profiling.
  6. profile.godeltaprof_memory: Enables delta memory profiling.
  7. profile.godeltaprof_mutex: Enables delta mutex profiling.
  8. profile.godeltaprof_block: Enables delta block profiling.
  9. profile.goroutine: Enables goroutine profiling.
- Configure the rest of the fields in the alloy-values.yaml file. Note that the relabel rules below only keep pods annotated for scraping; an annotation sketch follows these steps.

  Alloy Configuration

  alloy:
    configMap:
      create: true (1)
      content: |- (2)
        logging { (3)
          level  = "info"
          format = "logfmt"
        }
        discovery.kubernetes "pyroscope_kubernetes" {
          role = "pod"
        }
        discovery.relabel "kubernetes_pods" {
          targets = concat(discovery.kubernetes.pyroscope_kubernetes.targets)
          rule {
            action        = "drop"
            source_labels = ["__meta_kubernetes_pod_phase"]
            regex         = "Pending|Succeeded|Failed|Completed"
          }
          rule {
            action = "labelmap"
            regex  = "__meta_kubernetes_pod_label_(.+)"
          }
          rule {
            action        = "replace"
            source_labels = ["__meta_kubernetes_namespace"]
            target_label  = "kubernetes_namespace"
          }
          rule {
            action        = "replace"
            source_labels = ["__meta_kubernetes_pod_name"]
            target_label  = "kubernetes_pod_name"
          }
          rule {
            action        = "keep"
            source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_scrape"]
            regex         = "true"
          }
          rule {
            action        = "replace"
            source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_application_name"]
            target_label  = "service_name"
          }
          rule {
            action        = "replace"
            source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_spy_name"]
            target_label  = "__spy_name__"
          }
          rule {
            action        = "replace"
            source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_scheme"]
            regex         = "(https?)"
            target_label  = "__scheme__"
          }
          rule {
            action        = "replace"
            source_labels = ["__address__", "__meta_kubernetes_pod_annotation_pyroscope_io_port"]
            regex         = "(.+?)(?::\\d+)?;(\\d+)"
            replacement   = "$1:$2"
            target_label  = "__address__"
          }
          rule {
            action      = "labelmap"
            regex       = "__meta_kubernetes_pod_annotation_pyroscope_io_profile_(.+)"
            replacement = "__profile_$1"
          }
        }
        pyroscope.scrape "pyroscope_scrape" { (4)
          clustering {
            enabled = true
          }
          targets    = concat(discovery.relabel.kubernetes_pods.output)
          forward_to = [pyroscope.write.pyroscope_write.receiver]
          profiling_config {
            profile.memory {
              enabled = true
            }
            profile.process_cpu {
              enabled = true
            }
            profile.goroutine {
              enabled = true
            }
            profile.block {
              enabled = false
            }
            profile.mutex {
              enabled = false
            }
            profile.fgprof {
              enabled = false
            }
          }
        }
        pyroscope.write "pyroscope_write" {
          endpoint {
            url = "https://<KFUSE ENDPOINT/DNS NAME>/profile"
          }
        }
      name: null (5)
      key: null (6)
    clustering:
      enabled: false (7)
      name: "" (8)
      portName: http (9)
    stabilityLevel: "generally-available" (10)
    storagePath: /tmp/alloy (11)
    listenAddr: 0.0.0.0 (12)
    listenPort: 12345 (13)
    listenScheme: HTTP (14)
    uiPathPrefix: / (15)
    enableReporting: true (16)
    extraEnv: [] (17)
    envFrom: [] (18)
    extraArgs: [] (19)
    extraPorts: [] (20)
    # - name: "faro"
    #   port: 12347
    #   targetPort: 12347
    #   protocol: "TCP"
    #   appProtocol: "h2c"
    mounts:
      varlog: false (21)
      dockercontainers: false
      extra: [] (22)
    securityContext: {} (23)
    resources: {} (24)

  1. create: true: Creates a new ConfigMap for the config file.
  2. content: Content to assign to the new ConfigMap. This is passed into tpl, allowing for templating from values.
  3. Start of the Alloy configuration.
  4. pyroscope.scrape: Specifies the target sources for scraping profiling data.
  5. name: Name of an existing ConfigMap to use when create is false.
  6. key: Key in the ConfigMap to get the config from.
  7. clustering.enabled: Deploys Alloy in a cluster to allow for load distribution.
  8. clustering.name: Name for the Alloy cluster, used to differentiate between clusters.
  9. portName: Name of the port for clustering; useful if running inside an Istio mesh.
  10. stabilityLevel: Minimum stability level of components and behavior to enable. Must be one of experimental, public-preview, or generally-available.
  11. storagePath: Path where Grafana Alloy stores data, such as the Write-Ahead Log. By default, data is lost between reboots.
  12. listenAddr: Address to listen for traffic on. The value 0.0.0.0 exposes the UI to other containers.
  13. listenPort: Port to listen for traffic on.
  14. listenScheme: Scheme for readiness probes. If enabling tls in your configs, set to "HTTPS".
  15. uiPathPrefix: Base path that exposes the UI.
  16. enableReporting: Enables sending Grafana Labs anonymous usage stats to help improve Grafana Alloy.
  17. extraEnv: Extra environment variables to pass to the Alloy container.
  18. envFrom: Maps all keys on a ConfigMap or Secret as environment variables. See the Kubernetes documentation for EnvFromSource v1 core.
  19. extraArgs: Extra arguments to pass to alloy run. See the Grafana documentation for The run command.
  20. extraPorts: Extra ports to expose on the Alloy container.
  21. varlog: Mounts /var/log from the host into the container to collect logs.
  22. extra: Extra volume mounts to add to the Grafana Alloy container. Does not affect the watch container.
  23. securityContext: Security context to apply to the Grafana Alloy container.
  24. resources: Resource requests and limits to apply to the Grafana Alloy container.
- Refer to the Grafana documentation on Deploy Grafana Alloy on Kubernetes to install alloy in the namespace where you want to scrape the data.
- Update Alloy using the alloy-values.yaml file you set up. Use the same namespace as in the previous step, where you installed Alloy.

  helm upgrade --namespace <namespace> alloy grafana/alloy -f <path/to/alloy-values.yaml>
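The relabel rules in the Alloy configuration step only keep pods that carry pyroscope.io annotations. The following pod-spec fragment is a sketch of annotating an application for scraping; the annotation keys follow directly from the __meta_kubernetes_pod_annotation_pyroscope_io_* rules above, while the name, image, and port values are placeholders for your own application.

  Annotate an Application Pod for Scraping

  apiVersion: v1
  kind: Pod
  metadata:
    name: my-app
    annotations:
      pyroscope.io/scrape: "true"              # matched by the "keep" relabel rule
      pyroscope.io/application_name: "my-app"  # mapped to the service_name label
      pyroscope.io/port: "6060"                # port serving /debug/pprof/*
  spec:
    containers:
      - name: my-app
        image: my-app:latest
        ports:
          - containerPort: 6060

In a Deployment, place the same annotations on the pod template (spec.template.metadata.annotations) so that every replica is discovered.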