OpenTelemetry Collector Configuration


The OpenTelemetry Collector (OTel Collector) is a vendor-agnostic service that helps you collect, process, and export telemetry data (metrics, logs, and traces) from your applications. It is a core component of the OpenTelemetry ecosystem and plays a critical role in observability pipelines.

Why Use the OpenTelemetry Collector?

  1. Centralized Data Pipeline: Receives telemetry from multiple sources (apps, agents, SDKs).

  2. Processing & Transformation: Filters, enriches, batches, or converts telemetry before exporting.

  3. Vendor Flexibility: Send data to multiple observability backends (e.g., Prometheus, Jaeger, Grafana, Datadog, etc.).

  4. Reduces SDK Load: Moves complex export logic out of your application code.


Core Components of OTel Collector

The Collector is made up of pipelines with 4 main types of components:

Component | Role
Receiver  | Ingests data into the Collector (e.g., OTLP, Jaeger, Prometheus).
Processor | Modifies/enriches data (e.g., batching, filtering, transforming).
Exporter  | Sends data to observability tools or storage (e.g., OTLP, Prometheus, Jaeger, Loki).
Extension | Adds auxiliary features (e.g., health checks, authentication, zPages).
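
The example configs later in this post show receivers, processors, and exporters in action; extensions are the one component type not shown there, because they sit outside the pipelines. As a rough sketch (the health_check and zpages extensions and the ports shown are common defaults, not taken from this post's configs), enabling them looks like this:

extensions:
  health_check:               # exposes an HTTP liveness/readiness endpoint
    endpoint: "0.0.0.0:13133"
  zpages:                     # in-process debug pages for troubleshooting the Collector
    endpoint: "0.0.0.0:55679"

service:
  extensions: [health_check, zpages]   # enabled here, separate from the pipelines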

Example Use Case (Node.js App with Prometheus + Grafana + OTEL Collector)

  1. Your app sends telemetry to the OTEL Collector (e.g., via OTLP).

  2. Collector Receivers accept the data.

  3. Processors enrich or batch the data.

  4. Exporters send:

    • Metrics → Prometheus

    • Traces → Jaeger/Grafana Tempo

    • Logs → Loki


Example Collector Config (YAML)

receivers:
  otlp:
    protocols:
      http:
      grpc:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"

  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus, debug]
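
The snippet above only wires up a metrics pipeline. To match the use case described earlier (traces to Jaeger/Tempo, logs to Loki), the same file could grow two more pipelines along these lines; the tempo and loki hostnames and ports are assumptions about your environment, not part of the original example:

exporters:
  otlp/tempo:   # traces to Grafana Tempo over OTLP/gRPC (assumed host and port)
    endpoint: "tempo:4317"
    tls:
      insecure: true
  loki:         # logs to Loki's push API (assumed host and port)
    endpoint: "http://loki:3100/loki/api/v1/push"

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
    logs:
      receivers: [otlp]
      exporters: [loki]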

Collector Deployment Options

  • Standalone binary

  • Docker container

  • Kubernetes (via Helm or manifest)

  • Sidecar or DaemonSet
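
For the Docker container option above, a minimal docker-compose service might look like the sketch below; the image tag, config file name, and published ports are assumptions you would adapt to your setup:

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest   # pin a specific version in practice
    command: ["--config=/etc/otelcol-contrib/config.yaml"]
    volumes:
      - ./otelcol-config.yml:/etc/otelcol-contrib/config.yaml   # mount your Collector config
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
      - "8889:8889"   # Prometheus exporter endpoint from the earlier example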


Variants

  • Core Collector: Minimal, basic functionality.

  • Contrib Collector: Includes community-contributed receivers/exporters like Kafka, Redis, AWS X-Ray, etc.

OpenTelemetry Collector Configuration (OpenTelemetry Demo files)

# Copyright The OpenTelemetry Authors
# SPDX-License-Identifier: Apache-2.0

receivers:  # Defines all the sources from which telemetry data will be received
  otlp:  # OTLP receiver to accept traces, metrics, and logs via HTTP and gRPC
    protocols:
      grpc:  # Enable gRPC protocol (used by most SDKs) for OTLP input
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_GRPC}
      http:  # Enable HTTP protocol for OTLP input, useful for browser apps
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_HTTP}
        cors:  # Enable CORS so browser-based apps can send telemetry data
          allowed_origins:
            - "http://*"
            - "https://*"
  httpcheck/frontend-proxy:  # Periodically checks the health of the frontend-proxy service
    targets:
      - endpoint: http://frontend-proxy:${env:ENVOY_PORT}
  docker_stats:  # Collects metrics directly from Docker engine (e.g., container stats)
    endpoint: unix:///var/run/docker.sock
  redis:  # Collects Redis metrics from valkey-cart container every 10s
    endpoint: "valkey-cart:6379"
    username: "valkey"
    collection_interval: 10s
  # Host metrics
  hostmetrics:  # Collects system-level metrics (CPU, memory, disk, etc.) from host
    root_path: /hostfs
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      disk:
      load:
      filesystem:
        exclude_mount_points:
          mount_points:
            - /dev/*
            - /proc/*
            - /sys/*
            - /run/k3s/containerd/*
            - /var/lib/docker/*
            - /var/lib/kubelet/*
            - /snap/*
          match_type: regexp
        exclude_fs_types:
          fs_types:
            - autofs
            - binfmt_misc
            - bpf
            - cgroup2
            - configfs
            - debugfs
            - devpts
            - devtmpfs
            - fusectl
            - hugetlbfs
            - iso9660
            - mqueue
            - nsfs
            - overlay
            - proc
            - procfs
            - pstore
            - rpc_pipefs
            - securityfs
            - selinuxfs
            - squashfs
            - sysfs
            - tracefs
          match_type: strict
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
      network:
      paging:
      processes:
      process:
        mute_process_exe_error: true
        mute_process_io_error: true
        mute_process_user_error: true

exporters:  # Defines where to send processed telemetry data (e.g., Jaeger, Prometheus)
  debug:  # Prints telemetry to standard output for testing
  otlp:  # Export traces to Jaeger using gRPC (OTLP)
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  otlphttp/prometheus:  # Export metrics to Prometheus using OTLP over HTTP
    endpoint: "http://prometheus:9090/api/v1/otlp"
    tls:
      insecure: true
  opensearch:  # Export logs to OpenSearch for log indexing
    logs_index: otel
    http:
      endpoint: "http://opensearch:9200"
      tls:
        insecure: true

processors:  # Processors are used to modify, batch, or limit telemetry data before export
  batch:  # Efficiently batch telemetry before sending
  memory_limiter:  # Prevents the collector from using too much memory
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
  transform:  # Applies transformations to trace spans (e.g., clean URL patterns)
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # could be removed when https://github.com/vercel/next.js/pull/64852 is fixed upstream
          - replace_pattern(name, "\\?.*", "")  # Removes query strings from URLs
          - replace_match(name, "GET /api/products/*", "GET /api/products/{productId}")  # Generalize span names

connectors:  # Internal component that turns spans into Prometheus-compatible metrics
  spanmetrics:

service:  # Service defines telemetry pipelines (traces, metrics, logs)
  pipelines:
    traces:  # Tracing pipeline: receives OTLP → processes → exports to Jaeger & Prometheus
      receivers: [otlp]
      processors: [memory_limiter, transform, batch]
      exporters: [otlp, debug, spanmetrics]
    metrics:  # Metrics pipeline: collects from host/docker/redis → exports to Prometheus
      receivers: [hostmetrics, docker_stats, httpcheck/frontend-proxy, otlp, redis, spanmetrics]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/prometheus, debug]
    logs:  # Logs pipeline: receives OTLP logs → exports to OpenSearch
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [opensearch, debug]
  telemetry:  # Enables observability for the collector itself
    metrics:
      level: detailed
      readers:
        - periodic:  # Periodically exports collector’s own metrics
            interval: 10000
            timeout: 5000
            exporter:
              otlp:
                protocol: http/protobuf
                endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_HTTP}

What is This File?

This YAML file is the brain of your observability pipeline.
It tells the OpenTelemetry Collector how to:

  1. Receive telemetry data (from services, Docker, Redis, etc.)

  2. Process it (normalize, limit memory, batch it)

  3. Send it to the right places (Prometheus for metrics, Jaeger for traces, OpenSearch for logs)


1. Receivers – Who’s sending data?

receivers:
  otlp:
    protocols:
      grpc:
      http:

  • This allows your microservices to send traces, metrics, and logs using the OTLP protocol (the OpenTelemetry standard).

  • Supports both gRPC and HTTP.

  • Also configures CORS, so even browser apps can send data.


More Receivers

httpcheck/frontend-proxy

  httpcheck/frontend-proxy:
    targets:
      - endpoint: http://frontend-proxy:${env:ENVOY_PORT}

Think of it as a health monitor.

  • It sends pings to your frontend (proxy) to check if it's alive.

docker_stats

  docker_stats:
    endpoint: unix:///var/run/docker.sock

  • It talks to the Docker Engine directly over its Unix socket.

  • It grabs container-level stats: CPU, memory, IO, etc.

redis

  redis:
    endpoint: "valkey-cart:6379"
    username: "valkey"
    collection_interval: 10s
  • Pulls Redis performance data every 10 seconds.

hostmetrics

  hostmetrics:
    root_path: /hostfs
    scrapers:
      cpu:
      memory:

Gets low-level system metrics (CPU, memory, disk, etc.) from the host machine.

  • Filters out unnecessary mount points and filesystem types to avoid noise.

2. Processors – What to do with the data?

memory_limiter

  memory_limiter:
    check_interval: 5s
    limit_percentage: 80

  • Ensures the Collector doesn’t consume too much RAM.

batch

  • Groups telemetry data before sending it out — efficient and safe.
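
The demo config enables batch with its defaults. If you need to tune it, the processor accepts a handful of settings; the values below are illustrative, not taken from the demo:

processors:
  batch:
    timeout: 5s                 # flush a batch at least every 5 seconds
    send_batch_size: 8192       # preferred number of items per batch
    send_batch_max_size: 16384  # hard upper bound on batch size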

transform

  transform:
    trace_statements:
      - context: span
        statements:
          - replace_match(name, "GET /api/products/*", "GET /api/products/{productId}")

  • Cleans up span names. Instead of showing 100 different product IDs in traces, it generalizes the path.

3. Exporters – Where to send data?

exporters:
  otlp:                 # Jaeger (traces)
  otlphttp/prometheus:  # Prometheus (metrics)
  opensearch:           # OpenSearch (logs)
  debug:                # console output

You’re sending:

  • Traces → Jaeger

  • Metrics → Prometheus

  • Logs → OpenSearch

  • Also printing everything with debug for testing


4. Connectors – Convert one type to another

connectors:
  spanmetrics:

Converts trace spans into Prometheus-compatible metrics.

  • Useful for visualizing latency and error rate trends in dashboards.
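
Here spanmetrics runs with its defaults. If you want to shape the generated metrics, the connector accepts options along these lines (the bucket boundaries and extra dimension below are illustrative, not from the demo config):

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [2ms, 10ms, 50ms, 250ms, 1s]   # latency histogram boundaries
    dimensions:
      - name: http.method   # promote a span attribute to a metric label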

5. Service Pipelines – The flowchart

Each telemetry type (traces, metrics, logs) has its own processing path:

🔸 Traces

receivers: [otlp]
processors: [memory_limiter, transform, batch]
exporters: [otlp, debug, spanmetrics]

🔸 Metrics

receivers: [hostmetrics, docker_stats, httpcheck/frontend-proxy, otlp, redis, spanmetrics]
processors: [memory_limiter, batch]
exporters: [otlphttp/prometheus, debug]

🔸 Logs

receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [opensearch, debug]

6. Telemetry for the Collector Itself

telemetry:
  metrics:
    level: detailed

This is meta-observability — the collector is monitoring itself and exporting its own health data.
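
In the demo it pushes its own metrics over OTLP on a 10-second interval (the readers block in the full config above). A common alternative, sketched here as an assumption rather than part of the demo, is to expose those metrics for Prometheus to scrape instead:

service:
  telemetry:
    metrics:
      level: detailed
      readers:
        - pull:
            exporter:
              prometheus:
                host: "0.0.0.0"
                port: 8888   # scrape the Collector's own metrics from here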


Final Mental Model

Think of the Collector as a data router:

  • It listens to microservices, Docker, Redis, and the host.

  • It transforms and batches data to reduce noise and overhead.

  • It sends this cleaned data to your chosen destinations for visualization and storage.

STEP 2: OpenTelemetry Collector Configuration

File: src/otel-collector/otelcol-config.yml
This is the main OTEL Collector configuration.

Key Points:

  • Receivers: Supports otlp (gRPC and HTTP), httpcheck, docker_stats, hostmetrics, redis

  • Exporters:

    • otlp → sends traces to Jaeger

    • otlphttp/prometheus → sends metrics to Prometheus

    • opensearch → exports logs

  • Processors: batch, memory_limiter, transform

  • Pipelines:

    • traces: receives OTLP, processes and exports to Jaeger + debug

    • metrics: exports to Prometheus

    • logs: exports to OpenSearch

It integrates well with Docker, Redis, Envoy, and Kubernetes environments.


STEP 3: Node.js SDK Initialization

File: src/payment/opentelemetry.js
This initializes OpenTelemetry SDK for the Node.js payment service.

Key Code Features:

  • Uses @opentelemetry/sdk-node with:

    • OTLPTraceExporter for exporting traces

    • OTLPMetricExporter + PeriodicExportingMetricReader for metrics

  • Includes multiple auto-instrumentations (e.g., filesystem, runtime, AWS, GCP, Docker, process, etc.)

  • Calls sdk.start() to initialize tracing + metric collection

This file is equivalent to your otel.js setup in a TODO app.


STEP 4: Prometheus Configuration

File: src/prometheus/prometheus-config.yaml

Highlights:

global:
  scrape_interval: 5s
  scrape_timeout: 3s

otlp:
  promote_resource_attributes:
    - service.name
    - service.instance.id
    ...

While this config focuses on OTLP attributes and TSDB behavior, you would add scrape targets here manually if needed, e.g.:

scrape_configs:
  - job_name: 'todo-app'
    static_configs:
      - targets: ['localhost:9464']

Summary Mapping

Step | Purpose | File Path
2    | OTel Collector setup                  | src/otel-collector/otelcol-config.yml
3    | Node.js OTel SDK initialization       | src/payment/opentelemetry.js
4    | Prometheus config for metric scraping | src/prometheus/prometheus-config.yaml

Written by

Md Nur Mohammad

I am pursuing a Master's in Communication Systems and Networks at the Cologne University of Applied Sciences, Germany.