OpenTelemetry Collector on Kubernetes Environment
The OpenTelemetry Collector supports receiving and exporting metrics, logs, and traces in Kubernetes environments.
Basic OpenTelemetry Deployment on Kubernetes Environments
When deploying the OpenTelemetry Collector in a Kubernetes environment, refer to Basic OpenTelemetry Integration on Kubernetes, a sample helm-values.yaml file that configures OpenTelemetry to export logs, metrics, and traces to Kloudfuse.
Ensure that the endpoint scheme (https or http) matches whether TLS is enabled or disabled on the Kloudfuse ingest endpoint.
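For example, the two valid combinations of scheme and TLS setting look like this (a sketch; the address is a placeholder you replace with your own):

```yaml
# TLS enabled: https scheme, no insecure flag
otlphttp:
  metrics_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics

# TLS disabled: http scheme together with insecure transport
otlphttp:
  tls:
    insecure: true
  metrics_endpoint: http://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics
```

Mixing the two (an https endpoint with `insecure: true`, or an http endpoint without it) causes export failures.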
```yaml
image:
  repository: "otel/opentelemetry-collector-contrib"
config:
  exporters:
    logging:
      verbosity: basic
    otlphttp:
      tls:
        insecure: true # add only if you're using insecure communication
      metrics_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics
  extensions:
    health_check: {}
    memory_ballast:
      size_in_percentage: 40
  processors:
    batch:
      timeout: 10s
    resourcedetection:
      detectors:
        - env
        - eks
        - ec2
        - gcp
        - aks
        - azure
      override: false
      timeout: 2s
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
        http:
          cors:
            allowed_origins:
              - http://*
              - https://*
          endpoint: 0.0.0.0:4318
  service:
    extensions:
      - health_check
    pipelines:
      metrics:
        exporters:
          - otlphttp
        processors:
          - batch
          - resourcedetection
        receivers:
          - otlp
    telemetry:
      metrics:
        address: ${MY_POD_IP}:8888
mode: daemonset
service:
  enabled: true
nameOverride: otelcol
ports:
  metrics:
    enabled: true
  otlp:
    enabled: true
  otlp-http:
    enabled: true
presets:
  kubernetesAttributes:
    enabled: true
resources: {}
```
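The sample pipeline above exports only metrics. To also ship logs and traces to Kloudfuse, a sketch of the additional otlphttp endpoints and pipelines might look like the following; the logs and traces paths here are assumptions that follow the pattern of the metrics endpoint, so confirm them against your Kloudfuse installation:

```yaml
config:
  exporters:
    otlphttp:
      metrics_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics
      logs_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/logs
      traces_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/traces
  service:
    pipelines:
      logs:
        receivers: [otlp]
        processors: [batch, resourcedetection]
        exporters: [otlphttp]
      traces:
        receivers: [otlp]
        processors: [batch, resourcedetection]
        exporters: [otlphttp]
```

The otlphttp exporter supports per-signal endpoints, so a single exporter instance can serve all three pipelines.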
Install in Default Namespace
Add the open-telemetry Helm repository, then install (or upgrade) the collector chart in the default namespace using your values.yaml file.
```shell
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm upgrade --install opentelemetry-collector open-telemetry/opentelemetry-collector -f values.yaml
```
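Once the release is running, one way to smoke-test ingestion end to end is to POST a minimal OTLP/HTTP JSON metric to the collector from inside the cluster. The sketch below uses only the Python standard library; the service URL `opentelemetry-collector.default.svc` is an assumption derived from the release name and namespace above, so adjust it to match your deployment:

```python
import json
import time
import urllib.request

# Assumed in-cluster service address; replace with your collector's DNS name.
COLLECTOR_URL = "http://opentelemetry-collector.default.svc:4318/v1/metrics"


def build_test_payload(metric_name="kf.smoke_test", value=1.0):
    """Build a minimal OTLP/HTTP JSON metrics payload with one gauge datapoint."""
    now_ns = str(time.time_ns())
    return {
        "resourceMetrics": [{
            "resource": {"attributes": [{
                "key": "service.name",
                "value": {"stringValue": "otel-smoke-test"},
            }]},
            "scopeMetrics": [{
                "metrics": [{
                    "name": metric_name,
                    "gauge": {"dataPoints": [{
                        "timeUnixNano": now_ns,
                        "asDouble": value,
                    }]},
                }],
            }],
        }],
    }


def send(payload, url=COLLECTOR_URL):
    """POST the payload to the collector's OTLP/HTTP metrics endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status  # 200 means the collector accepted the request
```

Calling `send(build_test_payload())` from a pod in the cluster should return 200, after which the metric should appear in Kloudfuse via the otlphttp exporter.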
Kubernetes and System-Level Metrics
Kloudfuse APM services integrate with Kubernetes and system metrics to show related infrastructure metrics. See Service Detail Infrastructure.
You can export these metrics using either the Datadog agent or the OpenTelemetry Collector.
When using the OpenTelemetry Collector, you must update it with additional configuration. Specifically, enable the hostmetrics, kubeletstats, and k8s_cluster receivers, and add the kf_metrics_agent resource attribute. See Enable Additional Attributes for Kubernetes and System-Level Metrics. Note that the k8s_cluster receiver uses the k8s_leader_elector extension to prevent duplicate cluster metrics from being sent; this extension is available in opentelemetry-collector-contrib version 0.130.0 and later. Also ensure that the transform/k8spodphase processor is added and included in the metrics pipeline; it adds the phase as a label on the k8s.pod.phase metric emitted by the k8s_cluster receiver.
```yaml
...
command:
  extraArgs: [--feature-gates=receiver.kubeletstats.enableCPUUsageMetrics]
...
config:
  ...
  extensions:
    k8s_leader_elector/k8scluster:
      auth_type: serviceAccount
      lease_name: k8sclusterlease
      lease_namespace: default # Replace with the namespace where the lease resource will be added.
  processors:
    batch:
      timeout: 10s
    resource:
      attributes:
        - key: kf_metrics_agent
          value: "otlp"
          action: upsert
    resourcedetection:
      detectors:
        - env
        - eks
        - ec2
        - gcp
        - aks
        - azure
      override: false
      timeout: 2s
    transform/k8spodphase:
      metric_statements:
        - context: datapoint
          statements:
            # Convert numeric phase value to string label
            - 'set(attributes["phase"], "Pending") where metric.name == "k8s.pod.phase" and value_int == 1'
            - 'set(attributes["phase"], "Running") where metric.name == "k8s.pod.phase" and value_int == 2'
            - 'set(attributes["phase"], "Succeeded") where metric.name == "k8s.pod.phase" and value_int == 3'
            - 'set(attributes["phase"], "Failed") where metric.name == "k8s.pod.phase" and value_int == 4'
            - 'set(attributes["phase"], "Unknown") where metric.name == "k8s.pod.phase" and value_int == 5'
  receivers:
    hostmetrics:
      collection_interval: 30s
      scrapers:
        cpu:
          metrics:
            system.cpu.utilization:
              enabled: true
        memory:
          metrics:
            system.memory.utilization:
              enabled: true
        disk:
        filesystem:
          metrics:
            system.filesystem.utilization:
              enabled: true
    kubeletstats:
      collection_interval: 30s
      auth_type: "serviceAccount"
      endpoint: "https://${env:K8S_NODE_NAME}:10250"
      insecure_skip_verify: true
      node: '${env:K8S_NODE_NAME}'
      collect_all_network_interfaces:
        pod: true
        node: true
      extra_metadata_labels:
        - k8s.volume.type
      metric_groups:
        - node
        - pod
        - container
        - volume
      metrics:
        k8s.container.cpu.node.utilization:
          enabled: true
        k8s.pod.cpu.node.utilization:
          enabled: true
        k8s.container.memory.node.utilization:
          enabled: true
        k8s.pod.memory.node.utilization:
          enabled: true
    k8s_cluster:
      k8s_leader_elector: k8s_leader_elector/k8scluster
      collection_interval: 30s
      allocatable_types_to_report:
        - cpu
        - memory
        - pods
      auth_type: serviceAccount
      metrics:
        k8s.node.condition:
          enabled: true
      node_conditions_to_report:
        - Ready
        - MemoryPressure
      resource_attributes:
        container.id:
          enabled: false
        k8s.container.status.last_terminated_reason:
          enabled: true
  ...
  service:
    pipelines:
      ...
      metrics:
        exporters: [otlphttp]
        processors: [resource, resourcedetection, transform/k8spodphase, batch]
        receivers: [otlp, kubeletstats, hostmetrics, k8s_cluster]
```
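To make the effect of the transform/k8spodphase processor concrete, the following is an illustrative re-implementation in plain Python of what its OTTL statements do: map the numeric k8s.pod.phase datapoint value to a human-readable phase label. This is a sketch for understanding only, not code you deploy; the actual work is done by the collector's transform processor.

```python
# Mapping used by the OTTL statements in transform/k8spodphase.
PHASES = {1: "Pending", 2: "Running", 3: "Succeeded", 4: "Failed", 5: "Unknown"}


def label_pod_phase(metric_name: str, value_int: int, attributes: dict) -> dict:
    """Return a copy of the datapoint attributes with a "phase" label added
    when the metric is k8s.pod.phase, mirroring the config's
    `set(attributes["phase"], ...) where metric.name == "k8s.pod.phase" ...`
    statements. Other metrics, and unmapped values, pass through unchanged."""
    attrs = dict(attributes)
    if metric_name == "k8s.pod.phase" and value_int in PHASES:
        attrs["phase"] = PHASES[value_int]
    return attrs
```

For instance, a k8s.pod.phase datapoint with value 2 gains the label phase="Running", which is what Kloudfuse expects on the resulting k8s_pod_phase metric.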
Specifically, see the instructions for integrating these data streams: