OpenTelemetry Collector on Kubernetes Environment
The OpenTelemetry Collector supports receiving and exporting metrics, logs, and traces in Kubernetes environments.
Basic OpenTelemetry Deployment on Kubernetes Environments
When deploying the OpenTelemetry Collector in a Kubernetes environment, refer to Basic OpenTelemetry Integration on Kubernetes, a sample helm-values.yaml file that configures OpenTelemetry to export logs, metrics, and traces to Kloudfuse.
Ensure that the endpoint scheme (https or http) matches whether TLS is enabled or disabled on the Kloudfuse ingest endpoint.
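For example, the two valid combinations of scheme and TLS setting look like this (a sketch; the address is a placeholder you replace with your own):

```yaml
# TLS enabled: https scheme, no insecure flag
otlphttp:
  metrics_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics

# TLS disabled: http scheme together with insecure transport
otlphttp:
  tls:
    insecure: true
  metrics_endpoint: http://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics
```

Mixing the two (an https endpoint with `insecure: true`, or an http endpoint without it) causes export failures.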
```yaml
image:
  repository: "otel/opentelemetry-collector-contrib"
config:
  exporters:
    logging:
      verbosity: basic
    otlphttp:
      tls:
        insecure: true # add only if you're using insecure communication
      metrics_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics
  extensions:
    health_check: {}
    memory_ballast:
      size_in_percentage: 40
  processors:
    batch:
      timeout: 10s
    resourcedetection:
      detectors:
        - env
        - eks
        - ec2
        - gcp
        - aks
        - azure
      override: false
      timeout: 2s
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
        http:
          cors:
            allowed_origins:
              - http://*
              - https://*
          endpoint: 0.0.0.0:4318
  service:
    extensions:
      - health_check
    pipelines:
      metrics:
        exporters:
          - otlphttp
        processors:
          - batch
          - resourcedetection
        receivers:
          - otlp
    telemetry:
      metrics:
        address: ${MY_POD_IP}:8888
mode: daemonset
service:
  enabled: true
nameOverride: otelcol
ports:
  metrics:
    enabled: true
  otlp:
    enabled: true
  otlp-http:
    enabled: true
presets:
  kubernetesAttributes:
    enabled: true
resources: {}
```
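The sample pipeline above exports only metrics. To also ship logs and traces to Kloudfuse, a sketch of the additional otlphttp endpoints and pipelines might look like the following; the logs and traces paths here are assumptions that follow the pattern of the metrics endpoint, so confirm them against your Kloudfuse installation:

```yaml
config:
  exporters:
    otlphttp:
      metrics_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/metrics
      logs_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/logs
      traces_endpoint: https://<REPLACE KFUSE ADDRESS>/ingester/otlp/traces
  service:
    pipelines:
      logs:
        receivers: [otlp]
        processors: [batch, resourcedetection]
        exporters: [otlphttp]
      traces:
        receivers: [otlp]
        processors: [batch, resourcedetection]
        exporters: [otlphttp]
```

The otlphttp exporter supports per-signal endpoints, so a single exporter instance can serve all three pipelines.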
Install in Default Namespace
Add the open-telemetry Helm repository, then install (or upgrade) the collector chart in the default namespace using your values.yaml file.
```shell
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm upgrade --install opentelemetry-collector open-telemetry/opentelemetry-collector -f values.yaml
```
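Once the release is running, one way to smoke-test ingestion end to end is to POST a minimal OTLP/HTTP JSON metric to the collector from inside the cluster. The sketch below uses only the Python standard library; the service URL `opentelemetry-collector.default.svc` is an assumption derived from the release name and namespace above, so adjust it to match your deployment:

```python
import json
import time
import urllib.request

# Assumed in-cluster service address; replace with your collector's DNS name.
COLLECTOR_URL = "http://opentelemetry-collector.default.svc:4318/v1/metrics"


def build_test_payload(metric_name="kf.smoke_test", value=1.0):
    """Build a minimal OTLP/HTTP JSON metrics payload with one gauge datapoint."""
    now_ns = str(time.time_ns())
    return {
        "resourceMetrics": [{
            "resource": {"attributes": [{
                "key": "service.name",
                "value": {"stringValue": "otel-smoke-test"},
            }]},
            "scopeMetrics": [{
                "metrics": [{
                    "name": metric_name,
                    "gauge": {"dataPoints": [{
                        "timeUnixNano": now_ns,
                        "asDouble": value,
                    }]},
                }],
            }],
        }],
    }


def send(payload, url=COLLECTOR_URL):
    """POST the payload to the collector's OTLP/HTTP metrics endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status  # 200 means the collector accepted the request
```

Calling `send(build_test_payload())` from a pod in the cluster should return 200, after which the metric should appear in Kloudfuse via the otlphttp exporter.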
Kubernetes and System-Level Metrics
Kloudfuse APM services integrate with Kubernetes and system metrics to show related infrastructure metrics. See Service Detail Infrastructure.
You can export these metrics using either the Datadog agent or the OpenTelemetry Collector.
When using the OpenTelemetry Collector, you must update it with additional configuration. Specifically, enable the hostmetrics, kubeletstats, and k8s_cluster receivers, and add the kf_metrics_agent resource attribute. See Enable Additional Attributes for Kubernetes and System-Level Metrics. Note that the k8s_cluster receiver uses the k8s_leader_elector extension to prevent duplicate cluster metrics from being sent; this extension is available in opentelemetry-collector-contrib version 0.130.0 and later. Also ensure that the transform/k8spodphase processor is added and included in the metrics pipeline; it adds the phase as a label on the k8s.pod.phase metric emitted by the k8s_cluster receiver.
```yaml
...
command:
  extraArgs: [--feature-gates=receiver.kubeletstats.enableCPUUsageMetrics]
...
config:
  ...
  extensions:
    k8s_leader_elector/k8scluster:
      auth_type: serviceAccount
      lease_name: k8sclusterlease
      lease_namespace: default # Replace with the namespace where the lease resource will be added.
  processors:
    batch:
      timeout: 10s
    resource:
      attributes:
        - key: kf_metrics_agent
          value: "otlp"
          action: upsert
    resourcedetection:
      detectors:
        - env
        - eks
        - ec2
        - gcp
        - aks
        - azure
      override: false
      timeout: 2s
    transform/k8spodphase:
      metric_statements:
        - context: datapoint
          statements:
            # Convert numeric phase value to string label
            - 'set(attributes["phase"], "Pending") where metric.name == "k8s.pod.phase" and value_int == 1'
            - 'set(attributes["phase"], "Running") where metric.name == "k8s.pod.phase" and value_int == 2'
            - 'set(attributes["phase"], "Succeeded") where metric.name == "k8s.pod.phase" and value_int == 3'
            - 'set(attributes["phase"], "Failed") where metric.name == "k8s.pod.phase" and value_int == 4'
            - 'set(attributes["phase"], "Unknown") where metric.name == "k8s.pod.phase" and value_int == 5'
  receivers:
    hostmetrics:
      collection_interval: 30s
      scrapers:
        cpu:
          metrics:
            system.cpu.utilization:
              enabled: true
        memory:
          metrics:
            system.memory.utilization:
              enabled: true
        disk:
        filesystem:
          metrics:
            system.filesystem.utilization:
              enabled: true
    kubeletstats:
      collection_interval: 30s
      auth_type: "serviceAccount"
      endpoint: "https://${env:K8S_NODE_NAME}:10250"
      insecure_skip_verify: true
      node: '${env:K8S_NODE_NAME}'
      collect_all_network_interfaces:
        pod: true
        node: true
      extra_metadata_labels:
        - k8s.volume.type
      metric_groups:
        - node
        - pod
        - container
        - volume
      metrics:
        k8s.container.cpu.node.utilization:
          enabled: true
        k8s.pod.cpu.node.utilization:
          enabled: true
        k8s.container.memory.node.utilization:
          enabled: true
        k8s.pod.memory.node.utilization:
          enabled: true
    k8s_cluster:
      k8s_leader_elector: k8s_leader_elector/k8scluster
      collection_interval: 30s
      allocatable_types_to_report:
        - cpu
        - memory
        - pods
      auth_type: serviceAccount
      metrics:
        k8s.node.condition:
          enabled: true
      node_conditions_to_report:
        - Ready
        - MemoryPressure
      resource_attributes:
        container.id:
          enabled: false
        k8s.container.status.last_terminated_reason:
          enabled: true
  ...
  service:
    pipelines:
      ...
      metrics:
        exporters: [otlphttp]
        processors: [resource, resourcedetection, transform/k8spodphase, batch]
        receivers: [otlp, kubeletstats, hostmetrics, k8s_cluster]
```
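To make the effect of the transform/k8spodphase processor concrete, the following is an illustrative re-implementation in plain Python of what its OTTL statements do: map the numeric k8s.pod.phase datapoint value to a human-readable phase label. This is a sketch for understanding only, not code you deploy; the actual work is done by the collector's transform processor.

```python
# Mapping used by the OTTL statements in transform/k8spodphase.
PHASES = {1: "Pending", 2: "Running", 3: "Succeeded", 4: "Failed", 5: "Unknown"}


def label_pod_phase(metric_name: str, value_int: int, attributes: dict) -> dict:
    """Return a copy of the datapoint attributes with a "phase" label added
    when the metric is k8s.pod.phase, mirroring the config's
    `set(attributes["phase"], ...) where metric.name == "k8s.pod.phase" ...`
    statements. Other metrics, and unmapped values, pass through unchanged."""
    attrs = dict(attributes)
    if metric_name == "k8s.pod.phase" and value_int in PHASES:
        attrs["phase"] = PHASES[value_int]
    return attrs
```

For instance, a k8s.pod.phase datapoint with value 2 gains the label phase="Running", which is what Kloudfuse expects on the resulting k8s_pod_phase metric.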
Specifically, see the instructions for integrating these data streams: