OpenTelemetry Collector Configuration


The OpenTelemetry Collector (OTel Collector) is a vendor-agnostic service that helps you collect, process, and export telemetry data (metrics, logs, and traces) from your applications. It is a core component of the OpenTelemetry ecosystem and plays a critical role in observability pipelines.

Why Use the OpenTelemetry Collector?

  1. Centralized Data Pipeline: Receives telemetry from multiple sources (apps, agents, SDKs).

  2. Processing & Transformation: Filters, enriches, batches, or converts telemetry before exporting.

  3. Vendor Flexibility: Send data to multiple observability backends (e.g., Prometheus, Jaeger, Grafana, Datadog, etc.).

  4. Reduces SDK Load: Moves complex export logic out of your application code.


Core Components of OTel Collector

The Collector is made up of pipelines with 4 main types of components:

Component | Role
Receiver  | Ingests data into the Collector (e.g., OTLP, Jaeger, Prometheus).
Processor | Modifies/enriches data (e.g., batching, filtering, transforming).
Exporter  | Sends data to observability tools or storage (e.g., OTLP, Prometheus, Jaeger, Loki).
Extension | Adds auxiliary features (e.g., health checks, authentication, zPages).
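
The example configs later in this post show receivers, processors, and exporters in action; extensions are the one component type not shown there, because they sit outside the pipelines. As a rough sketch (the health_check and zpages extensions and the ports shown are common defaults, not taken from this post's configs), enabling them looks like this:

extensions:
  health_check:               # exposes an HTTP liveness/readiness endpoint
    endpoint: "0.0.0.0:13133"
  zpages:                     # in-process debug pages for troubleshooting the Collector
    endpoint: "0.0.0.0:55679"

service:
  extensions: [health_check, zpages]   # enabled here, separate from the pipelines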

Example Use Case (Node.js App with Prometheus + Grafana + OTEL Collector)

  1. Your app sends telemetry to the OTEL Collector (e.g., via OTLP).

  2. Collector Receivers accept the data.

  3. Processors enrich or batch the data.

  4. Exporters send:

    • Metrics → Prometheus

    • Traces → Jaeger/Grafana Tempo

    • Logs → Loki


Example Collector Config (YAML)

receivers:
  otlp:
    protocols:
      http:
      grpc:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"

  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus, debug]
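
The snippet above only wires up a metrics pipeline. To match the use case described earlier (traces to Jaeger/Tempo, logs to Loki), the same file could grow two more pipelines along these lines; the tempo and loki hostnames and ports are assumptions about your environment, not part of the original example:

exporters:
  otlp/tempo:   # traces to Grafana Tempo over OTLP/gRPC (assumed host and port)
    endpoint: "tempo:4317"
    tls:
      insecure: true
  loki:         # logs to Loki's push API (assumed host and port)
    endpoint: "http://loki:3100/loki/api/v1/push"

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
    logs:
      receivers: [otlp]
      exporters: [loki]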

Collector Deployment Options

  • Standalone binary

  • Docker container

  • Kubernetes (via Helm or manifest)

  • Sidecar or DaemonSet
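
For the Docker container option above, a minimal docker-compose service might look like the sketch below; the image tag, config file name, and published ports are assumptions you would adapt to your setup:

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest   # pin a specific version in practice
    command: ["--config=/etc/otelcol-contrib/config.yaml"]
    volumes:
      - ./otelcol-config.yml:/etc/otelcol-contrib/config.yaml   # mount your Collector config
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
      - "8889:8889"   # Prometheus exporter endpoint from the earlier example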


Variants

  • Core Collector: Minimal, basic functionality.

  • Contrib Collector: Includes community-contributed receivers/exporters like Kafka, Redis, AWS X-Ray, etc.

OpenTelemetry Collector Configuration (OpenTelemetry Demo files)

# Copyright The OpenTelemetry Authors
# SPDX-License-Identifier: Apache-2.0

receivers:  # Defines all the sources from which telemetry data will be received
  otlp:  # OTLP receiver to accept traces, metrics, and logs via HTTP and gRPC
    protocols:
      grpc:  # Enable gRPC protocol (used by most SDKs) for OTLP input
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_GRPC}
      http:  # Enable HTTP protocol for OTLP input, useful for browser apps
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_HTTP}
        cors:  # Enable CORS so browser-based apps can send telemetry data
          allowed_origins:
            - "http://*"
            - "https://*"
  httpcheck/frontend-proxy:  # Periodically checks the health of the frontend-proxy service
    targets:
      - endpoint: http://frontend-proxy:${env:ENVOY_PORT}
  docker_stats:  # Collects metrics directly from Docker engine (e.g., container stats)
    endpoint: unix:///var/run/docker.sock
  redis:  # Collects Redis metrics from valkey-cart container every 10s
    endpoint: "valkey-cart:6379"
    username: "valkey"
    collection_interval: 10s
  # Host metrics
  hostmetrics:  # Collects system-level metrics (CPU, memory, disk, etc.) from host
    root_path: /hostfs
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      disk:
      load:
      filesystem:
        exclude_mount_points:
          mount_points:
            - /dev/*
            - /proc/*
            - /sys/*
            - /run/k3s/containerd/*
            - /var/lib/docker/*
            - /var/lib/kubelet/*
            - /snap/*
          match_type: regexp
        exclude_fs_types:
          fs_types:
            - autofs
            - binfmt_misc
            - bpf
            - cgroup2
            - configfs
            - debugfs
            - devpts
            - devtmpfs
            - fusectl
            - hugetlbfs
            - iso9660
            - mqueue
            - nsfs
            - overlay
            - proc
            - procfs
            - pstore
            - rpc_pipefs
            - securityfs
            - selinuxfs
            - squashfs
            - sysfs
            - tracefs
          match_type: strict
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
      network:
      paging:
      processes:
      process:
        mute_process_exe_error: true
        mute_process_io_error: true
        mute_process_user_error: true

exporters:  # Defines where to send processed telemetry data (e.g., Jaeger, Prometheus)
  debug:  # Prints telemetry to standard output for testing
  otlp:  # Export traces to Jaeger using gRPC (OTLP)
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  otlphttp/prometheus:  # Export metrics to Prometheus using OTLP over HTTP
    endpoint: "http://prometheus:9090/api/v1/otlp"
    tls:
      insecure: true
  opensearch:  # Export logs to OpenSearch for log indexing
    logs_index: otel
    http:
      endpoint: "http://opensearch:9200"
      tls:
        insecure: true

processors:  # Processors are used to modify, batch, or limit telemetry data before export
  batch:  # Efficiently batch telemetry before sending
  memory_limiter:  # Prevents the collector from using too much memory
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
  transform:  # Applies transformations to trace spans (e.g., clean URL patterns)
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # could be removed when https://github.com/vercel/next.js/pull/64852 is fixed upstream
          - replace_pattern(name, "\\?.*", "")  # Removes query strings from URLs
          - replace_match(name, "GET /api/products/*", "GET /api/products/{productId}")  # Generalize span names

connectors:  # Internal component that turns spans into Prometheus-compatible metrics
  spanmetrics:

service:  # Service defines telemetry pipelines (traces, metrics, logs)
  pipelines:
    traces:  # Tracing pipeline: receives OTLP → processes → exports to Jaeger & Prometheus
      receivers: [otlp]
      processors: [memory_limiter, transform, batch]
      exporters: [otlp, debug, spanmetrics]
    metrics:  # Metrics pipeline: collects from host/docker/redis → exports to Prometheus
      receivers: [hostmetrics, docker_stats, httpcheck/frontend-proxy, otlp, redis, spanmetrics]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/prometheus, debug]
    logs:  # Logs pipeline: receives OTLP logs → exports to OpenSearch
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [opensearch, debug]
  telemetry:  # Enables observability for the collector itself
    metrics:
      level: detailed
      readers:
        - periodic:  # Periodically exports collector’s own metrics
            interval: 10000
            timeout: 5000
            exporter:
              otlp:
                protocol: http/protobuf
                endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_HTTP}

What is This File?

This YAML file is the brain of your observability pipeline.
It tells the OpenTelemetry Collector how to:

  1. Receive telemetry data (from services, Docker, Redis, etc.)

  2. Process it (normalize, limit memory, batch it)

  3. Send it to the right places (Prometheus for metrics, Jaeger for traces, OpenSearch for logs)


1. Receivers – Who’s sending data?

receivers:
  otlp:
    protocols:
      grpc:
      http:

  • This allows your microservices to send traces, metrics, and logs using the OTLP protocol (the OpenTelemetry standard).

  • Supports both gRPC and HTTP.

  • Also configures CORS, so even browser apps can send data.


More Receivers

httpcheck/frontend-proxy

  httpcheck/frontend-proxy:
    targets:
      - endpoint: http://frontend-proxy:${env:ENVOY_PORT}

Think of it as a health monitor.

  • It sends pings to your frontend (proxy) to check if it's alive.

docker_stats

  docker_stats:
    endpoint: unix:///var/run/docker.sock

  • It talks to the Docker Engine directly over its Unix socket.

  • It grabs container-level stats: CPU, memory, IO, etc.

redis

  redis:
    endpoint: "valkey-cart:6379"
    username: "valkey"
    collection_interval: 10s
  • Pulls Redis performance data every 10 seconds.

hostmetrics

  hostmetrics:
    root_path: /hostfs
    scrapers:
      cpu:
      memory:

Gets low-level system metrics (CPU, memory, disk, etc.) from the host machine.

  • Filters out unnecessary mount points and filesystem types to avoid noise.

2. Processors – What to do with the data?

memory_limiter

  memory_limiter:
    check_interval: 5s
    limit_percentage: 80

  • Ensures the Collector doesn’t consume too much RAM.

batch

  • Groups telemetry data before sending it out — efficient and safe.
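
The demo config enables batch with its defaults. If you need to tune it, the processor accepts a handful of settings; the values below are illustrative, not taken from the demo:

processors:
  batch:
    timeout: 5s                 # flush a batch at least every 5 seconds
    send_batch_size: 8192       # preferred number of items per batch
    send_batch_max_size: 16384  # hard upper bound on batch size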

transform

  transform:
    trace_statements:
      - context: span
        statements:
          - replace_match(name, "GET /api/products/*", "GET /api/products/{productId}")

  • Cleans up span names. Instead of showing 100 different product IDs in traces, it generalizes the path.

3. Exporters – Where to send data?

exporters:
  otlp:                 # Jaeger (traces)
  otlphttp/prometheus:  # Prometheus (metrics)
  opensearch:           # OpenSearch (logs)
  debug:                # console output

You’re sending:

  • Traces → Jaeger

  • Metrics → Prometheus

  • Logs → OpenSearch

  • Also printing everything with debug for testing


4. Connectors – Convert one type to another

connectors:
  spanmetrics:

Converts trace spans into Prometheus-compatible metrics.

  • Useful for visualizing latency and error rate trends in dashboards.
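
Here spanmetrics runs with its defaults. If you want to shape the generated metrics, the connector accepts options along these lines (the bucket boundaries and extra dimension below are illustrative, not from the demo config):

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [2ms, 10ms, 50ms, 250ms, 1s]   # latency histogram boundaries
    dimensions:
      - name: http.method   # promote a span attribute to a metric label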

5. Service Pipelines – The flowchart

Each telemetry type (traces, metrics, logs) has its own processing path:

🔸 Traces

receivers: [otlp]
processors: [memory_limiter, transform, batch]
exporters: [otlp, debug, spanmetrics]

🔸 Metrics

receivers: [hostmetrics, docker_stats, httpcheck/frontend-proxy, otlp, redis, spanmetrics]
processors: [memory_limiter, batch]
exporters: [otlphttp/prometheus, debug]

🔸 Logs

receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [opensearch, debug]

6. Telemetry for the Collector Itself

telemetry:
  metrics:
    level: detailed

This is meta-observability — the collector is monitoring itself and exporting its own health data.
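
In the demo it pushes its own metrics over OTLP on a 10-second interval (the readers block in the full config above). A common alternative, sketched here as an assumption rather than part of the demo, is to expose those metrics for Prometheus to scrape instead:

service:
  telemetry:
    metrics:
      level: detailed
      readers:
        - pull:
            exporter:
              prometheus:
                host: "0.0.0.0"
                port: 8888   # scrape the Collector's own metrics from here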


Final Mental Model

Think of the Collector as a data router:

  • It listens to microservices, Docker, Redis, and the host.

  • It transforms and batches data to reduce noise and overhead.

  • It sends this cleaned data to your chosen destinations for visualization and storage.

STEP 2: OpenTelemetry Collector Configuration

File: src/otel-collector/otelcol-config.yml
This is the main OTEL Collector configuration.

Key Points:

  • Receivers: Supports otlp (gRPC and HTTP), httpcheck, docker_stats, hostmetrics, redis

  • Exporters:

    • otlp → sends traces to Jaeger

    • otlphttp/prometheus → sends metrics to Prometheus

    • opensearch → exports logs

  • Processors: batch, memory_limiter, transform

  • Pipelines:

    • traces: receives OTLP, processes and exports to Jaeger + debug

    • metrics: exports to Prometheus

    • logs: exports to OpenSearch

It integrates well with Docker, Redis, Envoy, and Kubernetes environments.


STEP 3: Node.js SDK Initialization

File: src/payment/opentelemetry.js
This initializes OpenTelemetry SDK for the Node.js payment service.

Key Code Features:

  • Uses @opentelemetry/sdk-node with:

    • OTLPTraceExporter for exporting traces

    • OTLPMetricExporter + PeriodicExportingMetricReader for metrics

  • Includes multiple auto-instrumentations (e.g., filesystem, runtime, AWS, GCP, Docker, process, etc.)

  • Calls sdk.start() to initialize tracing + metric collection

This file is equivalent to your otel.js setup in a TODO app.


STEP 4: Prometheus Configuration

File: src/prometheus/prometheus-config.yaml

Highlights:

global:
  scrape_interval: 5s
  scrape_timeout: 3s

otlp:
  promote_resource_attributes:
    - service.name
    - service.instance.id
    ...

While this config focuses on OTLP attributes and TSDB behavior, you would add scrape targets here manually if needed, e.g.:

scrape_configs:
  - job_name: 'todo-app'
    static_configs:
      - targets: ['localhost:9464']

Summary Mapping

Step | Purpose | File Path
2    | OTel Collector setup                  | src/otel-collector/otelcol-config.yml
3    | Node.js OTel SDK initialization       | src/payment/opentelemetry.js
4    | Prometheus config for metric scraping | src/prometheus/prometheus-config.yaml

Written by

Md Nur Mohammad

I am pursuing a Master's in Communication Systems and Networks at the Cologne University of Applied Sciences, Germany.