OpenTelemetry in 2026: The Observability Standard That Finally Stuck
The Observability Fragmentation Problem (Past Tense)
Three years ago, the observability landscape was a mess. You had Jaeger for tracing, Prometheus for metrics, and Elasticsearch for logs — each with its own SDK, its own instrumentation approach, and its own data model. Switching vendors meant rewriting instrumentation. Running multiple backends meant maintaining multiple agents.
OpenTelemetry solved this. And by 2026, it has solved it so thoroughly that the question is no longer “should I use OTel?” but “how do I use it well?”
What OpenTelemetry Actually Is
OTel is three things:
- A specification — defining what traces, metrics, and logs look like
- A set of SDKs — one per language, all following the same API
- The Collector — a vendor-agnostic pipeline agent for receiving, processing, and exporting telemetry
The key insight is separation of instrumentation from export. You instrument your code once using OTel’s API. Where that data goes — Datadog, Honeycomb, Grafana Cloud, Jaeger, your own Prometheus — is a deployment configuration, not a code change.
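One way to see export as deployment configuration is the set of standard `OTEL_EXPORTER_OTLP_*` environment variables, which OTLP/HTTP exporters read to decide where data goes. The sketch below mimics that resolution logic in plain TypeScript. The function name `resolveOtlpHttpUrl` and the `Env` type are made up for illustration, and real SDKs handle more cases (headers, protocols, timeouts) than this does.

```typescript
// Hypothetical helper: mirrors how OTLP/HTTP endpoints are typically resolved
// from the standard environment variables. Not part of any OTel package.
type Env = Record<string, string | undefined>;
type Signal = 'traces' | 'metrics' | 'logs';

function resolveOtlpHttpUrl(signal: Signal, env: Env): string {
  // A signal-specific variable (e.g. OTEL_EXPORTER_OTLP_TRACES_ENDPOINT)
  // is used as-is and takes precedence.
  const specific = env[`OTEL_EXPORTER_OTLP_${signal.toUpperCase()}_ENDPOINT`];
  if (specific) return specific;

  // Otherwise the base endpoint gets the per-signal path appended.
  const base = env['OTEL_EXPORTER_OTLP_ENDPOINT'] ?? 'http://localhost:4318';
  return `${base.replace(/\/+$/, '')}/v1/${signal}`;
}

// Same code, different destinations, chosen purely by configuration.
const local = resolveOtlpHttpUrl('traces', {});
const viaCollector = resolveOtlpHttpUrl('traces', {
  OTEL_EXPORTER_OTLP_ENDPOINT: 'http://otel-collector:4318',
});
console.log(local, viaCollector);
```

Switching backends then means changing an environment variable in the deployment manifest rather than touching the instrumented code.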
The Three Signals: Status in 2026
| Signal | API/SDK Status | Notes |
|---|---|---|
| Traces | Stable | Production-ready since 2021 |
| Metrics | Stable | Graduated stable in 2023 |
| Logs | Stable | Finally GA in 2024 |
| Profiles | Beta | Continuous profiling, expected GA Q4 2026 |
| Events | Experimental | Structured log events with semantic conventions |
All three core signals (traces, metrics, and logs) are stable. Profiles (continuous profiling data) is the exciting frontier: it will let you correlate performance profiles directly with traces, answering “why was this trace slow?” with actual flame graphs.
Instrumenting a Node.js Service
Here’s the 2026 way to instrument a Node.js Express app with auto-instrumentation. The setup lives in one bootstrap file, and the application code itself stays untouched:
// otel.js - load BEFORE anything else
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

const sdk = new NodeSDK({
  serviceName: 'user-service',
  traceExporter: new OTLPTraceExporter({
    url: 'http://otel-collector:4318/v1/traces',
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://otel-collector:4318/v1/metrics',
    }),
    exportIntervalMillis: 30000,
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': { enabled: true },
      '@opentelemetry/instrumentation-express': { enabled: true },
      '@opentelemetry/instrumentation-pg': { enabled: true },
      '@opentelemetry/instrumentation-redis': { enabled: true },
    }),
  ],
});

sdk.start();

// index.js
import './otel.js'; // Must be first
import express from 'express';
// ... your normal app code, no OTel changes needed
Auto-instrumentation creates spans for HTTP requests, database queries, and Redis calls. No manual startSpan() calls are needed for common operations.
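Part of what HTTP auto-instrumentation does behind the scenes is carry trace context between services, by default in the W3C `traceparent` header. To make that less magical, here is a minimal parser sketch for the header. `parseTraceparent` and `TraceContext` are illustrative names, not OTel APIs, and the sketch only handles the four-field layout used by version `00`.

```typescript
// Illustrative only: the SDK's propagator does this for you.
interface TraceContext {
  traceId: string;       // 32 lowercase hex characters
  parentSpanId: string;  // 16 lowercase hex characters
  sampled: boolean;      // lowest bit of the trace-flags byte
}

// version - trace-id - parent-id - trace-flags
const TRACEPARENT = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT.exec(header.trim());
  if (!match) return null;
  const [, version = '', traceId = '', parentSpanId = '', flags = ''] = match;

  // Version "ff" and all-zero IDs are invalid per the W3C Trace Context spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) {
    return null;
  }
  return { traceId, parentSpanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

const incoming = parseTraceparent(
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01',
);
console.log(incoming);
```

When a downstream service receives this header, its spans join the same trace ID, which is what lets a backend stitch one request across many services.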
Adding Custom Instrumentation
Auto-instrumentation gets you 80% of the way. For business-specific operations, add manual spans and metrics:
import { trace, metrics, SpanStatusCode } from '@opentelemetry/api';

// Placeholders for this example: `OrderItem` and `db` stand in for your own
// domain types and data-access layer.
interface OrderItem { price: number }
declare const db: {
  orders: { create(input: { userId: string; items: OrderItem[] }): Promise<unknown> };
};

const tracer = trace.getTracer('user-service');
const meter = metrics.getMeter('user-service');

// Custom metrics
const orderCounter = meter.createCounter('orders.created', {
  description: 'Number of orders created',
});
const orderValueHistogram = meter.createHistogram('orders.value', {
  description: 'Order value in USD',
  unit: 'USD',
});

// userTier is passed in explicitly so it can be attached as a metric attribute
async function createOrder(userId: string, userTier: string, items: OrderItem[]) {
  return tracer.startActiveSpan('order.create', async (span) => {
    span.setAttributes({
      'user.id': userId,
      'order.item_count': items.length,
    });
    try {
      const order = await db.orders.create({ userId, items });
      const value = items.reduce((sum, i) => sum + i.price, 0);
      // Record metrics
      orderCounter.add(1, { 'order.status': 'success' });
      orderValueHistogram.record(value, { 'user.tier': userTier });
      span.setStatus({ code: SpanStatusCode.OK });
      return order;
    } catch (error) {
      // `error` is `unknown` in TypeScript, so narrow it before recording
      span.recordException(error instanceof Error ? error : String(error));
      span.setStatus({ code: SpanStatusCode.ERROR });
      orderCounter.add(1, { 'order.status': 'error' });
      throw error;
    } finally {
      span.end();
    }
  });
}
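The try/catch/finally shape above tends to repeat for every manual span, so it can be worth factoring into a helper. The following is a sketch under assumptions: `withSpan`, `SpanLike`, and `TracerLike` are hypothetical names, and the two interfaces are small stand-ins for the real `@opentelemetry/api` types so the snippet compiles without any dependencies. The numeric status codes match the `SpanStatusCode` enum values (OK = 1, ERROR = 2).

```typescript
// Stand-ins for the subset of the OTel API this helper touches.
interface SpanLike {
  recordException(error: Error): void;
  setStatus(status: { code: number; message?: string }): void;
  end(): void;
}
interface TracerLike {
  startActiveSpan<T>(name: string, fn: (span: SpanLike) => T): T;
}

const STATUS_OK = 1;    // SpanStatusCode.OK
const STATUS_ERROR = 2; // SpanStatusCode.ERROR

// Runs `work` inside a span, records the outcome, and always ends the span.
async function withSpan<T>(
  tracer: TracerLike,
  name: string,
  work: (span: SpanLike) => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      const result = await work(span);
      span.setStatus({ code: STATUS_OK });
      return result;
    } catch (error) {
      const err = error instanceof Error ? error : new Error(String(error));
      span.recordException(err);
      span.setStatus({ code: STATUS_ERROR, message: err.message });
      throw error; // rethrow so callers still see the failure
    } finally {
      span.end();
    }
  });
}
```

With a helper like this, a business function shrinks to something like `withSpan(tracer, 'order.create', async (span) => { ... })`, and forgetting `span.end()` stops being a possible bug.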
The OTel Collector: Your Observability Pipeline
The Collector is where the real power lives. It’s a standalone service that receives telemetry, applies transformations, and routes to multiple backends.
# collector-config.yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
      grpc:
        endpoint: 0.0.0.0:4317
  # Also scrape Prometheus endpoints
  prometheus:
    config:
      scrape_configs:
        - job_name: 'kubernetes-pods'
          kubernetes_sd_configs:
            - role: pod

processors:
  # Drop noisy health-check spans
  filter:
    traces:
      span:
        - 'attributes["http.route"] == "/health"'
  # Add resource attributes to everything
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert
  # Tail-based sampling: a trace is kept if any policy matches
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors-policy
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow-traces-policy
        type: latency
        latency: { threshold_ms: 1000 }
      - name: sample-10pct
        type: probabilistic
        probabilistic: { sampling_percentage: 10 }
  batch:
    timeout: 5s
    send_batch_size: 1000

exporters:
  # Primary: Grafana Cloud (its OTLP gateway accepts OTLP over HTTP)
  otlphttp/grafana:
    endpoint: https://otlp-gateway-prod-us-central-0.grafana.net/otlp
    headers:
      authorization: "Basic ${env:GRAFANA_TOKEN}"
  # Secondary: Keep traces in-house with Tempo (gRPC, plaintext inside the cluster)
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
  # Expose metrics for Prometheus to scrape
  prometheus:
    endpoint: 0.0.0.0:9090

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter, resource, tail_sampling, batch]
      exporters: [otlphttp/grafana, otlp/tempo]
    metrics:
      receivers: [otlp, prometheus]
      processors: [resource, batch]
      exporters: [otlphttp/grafana, prometheus]
    logs:
      receivers: [otlp]
      processors: [resource, batch]
      exporters: [otlphttp/grafana]
This config: filters health check noise, adds environment context, applies intelligent sampling (keeping all errors and slow traces, 10% sampling otherwise), and fans out to multiple backends.
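To make the sampling behaviour concrete, here is a rough TypeScript sketch of the decision the three policies add up to: keep a trace if it contains an error, if it is slow, or if it falls in the random 10%. This is not the Collector's implementation. `shouldKeepTrace`, `SpanSummary`, and the trace-ID bucketing are simplifications invented for illustration, and the Collector's exact threshold comparison and hashing may differ.

```typescript
// Simplified view of a finished span; field names are hypothetical.
interface SpanSummary {
  traceId: string;   // 32 hex characters
  startMs: number;
  endMs: number;
  isError: boolean;  // true when the span status is ERROR
}

interface SamplingOptions {
  latencyThresholdMs: number;
  samplingPercentage: number; // 0-100
}

function shouldKeepTrace(spans: SpanSummary[], opts: SamplingOptions): boolean {
  const [first] = spans;
  if (first === undefined) return false;

  // errors-policy: any span with ERROR status keeps the whole trace.
  if (spans.some((s) => s.isError)) return true;

  // slow-traces-policy: duration runs from earliest start to latest end.
  const start = Math.min(...spans.map((s) => s.startMs));
  const end = Math.max(...spans.map((s) => s.endMs));
  if (end - start >= opts.latencyThresholdMs) return true;

  // sample-10pct: derive a stable 0-99 bucket from the trace ID so the same
  // trace always gets the same answer (illustrative, not the real hash).
  const bucket = parseInt(first.traceId.slice(-8), 16) % 100;
  return bucket < opts.samplingPercentage;
}
```

The decision needs every span of a trace, which is why the config waits (`decision_wait: 10s`) before deciding and why tail sampling has to run in the Collector rather than in each service.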
Semantic Conventions: The Hidden Secret
OTel’s real value beyond the SDK is semantic conventions — standardized attribute names for common operations. Instead of everyone calling the same thing http.url, request_url, url, and req.url, there’s one standard:
http.request.method: GET
url.full: https://api.example.com/users/123
http.response.status_code: 200
server.address: api.example.com
The OTel semantic conventions spec now covers HTTP, databases, messaging, RPC, cloud providers, Kubernetes, and more. When your spans follow conventions, your observability vendor can build dashboards and alerts without you doing any configuration.
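If you have older instrumentation emitting inconsistent names, a small normalization step shows what converging on the conventions looks like. The sketch below maps a few legacy OTel HTTP attribute names, plus the ad-hoc names mentioned above, onto the standard ones. `normalizeHttpAttributes` and the mapping table are illustrative. In practice this kind of renaming would more likely live in a Collector processor than in application code.

```typescript
type Attributes = Record<string, string | number | boolean>;

// Legacy or ad-hoc name -> standard semantic-convention name.
const LEGACY_TO_STANDARD = new Map<string, string>([
  // Older OTel HTTP convention names
  ['http.method', 'http.request.method'],
  ['http.url', 'url.full'],
  ['http.status_code', 'http.response.status_code'],
  // Ad-hoc names of the kind teams invent on their own (hypothetical)
  ['request_url', 'url.full'],
  ['req.url', 'url.full'],
  ['url', 'url.full'],
]);

function normalizeHttpAttributes(attrs: Attributes): Attributes {
  const out: Attributes = {};
  for (const [key, value] of Object.entries(attrs)) {
    const standard = LEGACY_TO_STANDARD.get(key) ?? key;
    // A value already stored under the standard name wins over a renamed one,
    // and an attribute that uses the standard name always overwrites.
    if (standard === key || !(standard in out)) {
      out[standard] = value;
    }
  }
  return out;
}

const normalized = normalizeHttpAttributes({
  'http.method': 'GET',
  'http.url': 'https://api.example.com/users/123',
  'http.status_code': 200,
});
console.log(normalized);
```

Once every service emits `http.request.method` and `http.response.status_code`, a single dashboard query works across all of them.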
OTel and AI Observability
A fast-emerging area in 2026: observability for LLM applications. The OTel working group has drafted semantic conventions for GenAI:
span.setAttributes({
  'gen_ai.system': 'openai',
  'gen_ai.request.model': 'gpt-4o',
  'gen_ai.request.max_tokens': 1000,
  'gen_ai.response.model': 'gpt-4o-2024-11-20',
  'gen_ai.usage.input_tokens': 256,
  'gen_ai.usage.output_tokens': 512,
  // Convenience sum of input + output; check the current spec before relying on this name
  'gen_ai.usage.total_tokens': 768,
});
Tools like Langfuse, Arize Phoenix, and Traceloop build on top of OTel to provide LLM-specific observability (prompt tracing, token cost tracking, and evaluation pipelines), all feeding into your existing OTel infrastructure.
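Token cost tracking, for instance, falls out almost directly from the usage attributes. A minimal sketch, assuming per-million-token pricing: the `ModelPricing` shape, the function name, and the prices in the usage example are placeholders invented for illustration, not real rates.

```typescript
type SpanAttributes = Record<string, string | number | boolean | undefined>;

// Hypothetical pricing shape; plug in your provider's actual rates.
interface ModelPricing {
  inputUsdPerMillion: number;
  outputUsdPerMillion: number;
}

function estimateCostUsd(attrs: SpanAttributes, pricing: ModelPricing): number {
  // Missing usage attributes count as zero tokens.
  const input = Number(attrs['gen_ai.usage.input_tokens'] ?? 0);
  const output = Number(attrs['gen_ai.usage.output_tokens'] ?? 0);
  return (
    (input * pricing.inputUsdPerMillion + output * pricing.outputUsdPerMillion) / 1e6
  );
}

// Placeholder prices, chosen only to make the arithmetic easy to follow.
const cost = estimateCostUsd(
  { 'gen_ai.usage.input_tokens': 256, 'gen_ai.usage.output_tokens': 512 },
  { inputUsdPerMillion: 2.5, outputUsdPerMillion: 10 },
);
console.log(cost.toFixed(5));
```

Because the token counts travel on the span, the same calculation can run in a backend query instead, grouped by `gen_ai.request.model` or any other attribute.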
Getting Started in One Hour
- Add SDK to your service (pick your language from opentelemetry.io)
- Run the Collector locally with Docker:
  docker run -p 4318:4318 otel/opentelemetry-collector-contrib
- Point it at Jaeger for visualization (also runs in Docker)
- Watch your first trace appear — it’ll take your breath away the first time
The ecosystem has matured to the point where you don’t need to be an observability expert to get started. Start simple, add signals as you understand your needs.
If you’re still running separate instrumentation for each backend, it’s time to consolidate. OTel isn’t just a nice-to-have in 2026 — it’s the foundation every modern observability stack is built on.
