Understanding the OpenTelemetry Collector configuration of openobserve-collector
tl;dr: the values.yaml of openobserve-collector is over-complicated. A simpler setup can be achieved with the upstream OpenTelemetry Collector chart.
I am reviewing the code of the openobserve-collector and would like to ask some questions about how it works.
Currently I'm running a Kubernetes cluster with OpenObserve deployed in the monitoring namespace. Instead of using the openobserve-collector chart, I am using the upstream OpenTelemetry Collector chart with presets enabled. The setup fits in a relatively concise helmfile:
repositories:
  - name: open-telemetry
    url: https://open-telemetry.github.io/opentelemetry-helm-charts

releases:
  - name: collector-agent
    namespace: monitoring
    chart: open-telemetry/opentelemetry-collector
    version: 0.111.2
    values:
      - image:
          repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
        mode: daemonset
        presets:
          logsCollection:
            enabled: true
          hostMetrics:
            enabled: true
          kubernetesAttributes:
            enabled: true
            extractAllPodLabels: true
            extractAllPodAnnotations: false
          kubeletMetrics:
            enabled: true
        config: &CONFIG
          receivers:
            kubeletstats:
              insecure_skip_verify: true
          exporters:
            otlp/openobserve:
              endpoint: http://openobserve.monitoring.svc:5081
              headers:
                Authorization: {{
                  printf "%s:%s"
                    (fetchSecretValue "ref+k8s://v1/Secret/monitoring/openobserve-root-user/ZO_ROOT_USER_EMAIL")
                    (fetchSecretValue "ref+k8s://v1/Secret/monitoring/openobserve-root-user/ZO_ROOT_USER_PASSWORD")
                  | b64enc | print "Basic " | quote
                }}
                organization: default
                stream-name: default
              tls:
                insecure: true
          service:
            pipelines:
              logs:
                exporters:
                  - otlp/openobserve
              metrics:
                exporters:
                  - otlp/openobserve
              traces:
                exporters:
                  - otlp/openobserve
        resources: {} # -- snip --
  - name: collector-cluster
    namespace: monitoring
    chart: open-telemetry/opentelemetry-collector
    version: 0.111.2
    values:
      - image:
          repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
        mode: deployment
        replicaCount: 1
        presets:
          clusterMetrics:
            enabled: true
          kubernetesEvents:
            enabled: true
        config: *CONFIG
        resources: {} # -- snip --
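The Authorization value built by the template above is plain HTTP Basic auth: the root user's email and password joined by a colon and base64-encoded. A quick shell sketch of the same encoding, with placeholder credentials (the real values come from the openobserve-root-user Secret via fetchSecretValue):

```shell
# Placeholder credentials for illustration; the helmfile template fetches
# the real ones from the openobserve-root-user Secret.
email='root@example.com'
password='Complexpass#123'

# Join with ":", base64-encode, and prefix "Basic " -- exactly what the
# printf | b64enc | print "Basic " pipeline in the template produces.
printf 'Basic %s' "$(printf '%s:%s' "$email" "$password" | base64)"
```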
The helmfile.yaml defines two releases: collector-agent, a daemonset that handles log ingestion and per-node metrics, and collector-cluster, a deployment for cluster-level metrics and events. The generated collector config for collector-agent is obtained with the command:
kubectl get -n monitoring configmap collector-agent-opentelemetry-collector-agent -o jsonpath='{.data.relay}'
Generated configuration from the upstream OpenTelemetry Collector chart:
exporters:
  debug: {}
  otlp/openobserve:
    endpoint: http://openobserve.monitoring.svc:5081
    headers:
      Authorization: Basic ZGV2QGJhYnltcmkub3JnOmNocmlzMTIzNA==
      organization: default
      stream-name: otel-chart
    tls:
      insecure: true
extensions:
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
processors:
  batch: {}
  k8sattributes:
    extract:
      labels:
        - from: pod
          key_regex: (.*)
          tag_name: $$1
      metadata:
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.cronjob.name
        - k8s.job.name
        - k8s.node.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
    filter:
      node_from_env_var: K8S_NODE_NAME
    passthrough: false
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
receivers:
  filelog:
    exclude:
      - /var/log/pods/monitoring_collector-agent-opentelemetry-collector*_*/opentelemetry-collector/*.log
    include:
      - /var/log/pods/*/*/*.log
    include_file_name: false
    include_file_path: true
    operators:
      - id: container-parser
        max_log_size: 102400
        type: container
    retry_on_failure:
      enabled: true
    start_at: end
  hostmetrics:
    collection_interval: 10s
    root_path: /hostfs
    scrapers:
      cpu: null
      disk: null
      filesystem:
        exclude_fs_types:
          fs_types:
            - autofs
            - binfmt_misc
            - bpf
            - cgroup2
            - configfs
            - debugfs
            - devpts
            - devtmpfs
            - fusectl
            - hugetlbfs
            - iso9660
            - mqueue
            - nsfs
            - overlay
            - proc
            - procfs
            - pstore
            - rpc_pipefs
            - securityfs
            - selinuxfs
            - squashfs
            - sysfs
            - tracefs
          match_type: strict
        exclude_mount_points:
          match_type: regexp
          mount_points:
            - /dev/*
            - /proc/*
            - /sys/*
            - /run/k3s/containerd/*
            - /var/lib/docker/*
            - /var/lib/kubelet/*
            - /snap/*
      load: null
      memory: null
      network: null
  jaeger:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:14250
      thrift_compact:
        endpoint: ${env:MY_POD_IP}:6831
      thrift_http:
        endpoint: ${env:MY_POD_IP}:14268
  kubeletstats:
    auth_type: serviceAccount
    collection_interval: 20s
    endpoint: ${env:K8S_NODE_IP}:10250
    insecure_skip_verify: true
  otlp:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:4317
      http:
        endpoint: ${env:MY_POD_IP}:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-collector
          scrape_interval: 10s
          static_configs:
            - targets:
                - ${env:MY_POD_IP}:8888
  zipkin:
    endpoint: ${env:MY_POD_IP}:9411
service:
  extensions:
    - health_check
  pipelines:
    logs:
      exporters:
        - otlp/openobserve
      processors:
        - k8sattributes
        - memory_limiter
        - batch
      receivers:
        - otlp
        - filelog
    metrics:
      exporters:
        - otlp/openobserve
      processors:
        - k8sattributes
        - memory_limiter
        - batch
      receivers:
        - otlp
        - prometheus
        - hostmetrics
        - kubeletstats
    traces:
      exporters:
        - otlp/openobserve
      processors:
        - k8sattributes
        - memory_limiter
        - batch
      receivers:
        - otlp
        - jaeger
        - zipkin
  telemetry:
    metrics:
      address: ${env:MY_POD_IP}:8888
Here is an example log entry in OpenObserve produced by the above upstream OpenTelemetry Collector chart:
{
  "_timestamp": 1737473264323746,
  "app": "openobserve",
  "apps_kubernetes_io_pod_index": "0",
  "body": "2025-01-21T15:27:44.323513488+00:00 INFO actix_web::middleware::logger: 172.18.0.4 \"GET /api/default/otel_chart/_values?fields=k8s_container_name&size=10&start_time=1737472364215000&end_time=1737473264215000&sql=U0VMRUNUICogRlJPTSAib3RlbF9jaGFydCIg&type=logs HTTP/1.1\" 200 250 \"-\" \"http://localhost:32020/web/logs?stream_type=logs&stream=otel_chart&period=15m&refresh=0&sql_mode=false&query=YXBwX2t1YmVybmV0ZXNfaW9fbmFtZSA9ICdjaHJpcy13b3JrZXItbWFpbnMn&type=stream_explorer&defined_schemas=user_defined_schema&org_identifier=default&quick_mode=false&show_histogram=true\" \"Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0\" 0.099962",
  "controller_revision_hash": "openobserve-69f6d688f6",
  "dropped_attributes_count": 0,
  "k8s_container_name": "openobserve",
  "k8s_container_restart_count": "1",
  "k8s_namespace_name": "monitoring",
  "k8s_node_name": "khris-worker",
  "k8s_pod_name": "openobserve-0",
  "k8s_pod_start_time": "2025-01-20T22:14:56Z",
  "k8s_pod_uid": "1c857c0a-066e-40ba-8676-6c874631f1ca",
  "k8s_statefulset_name": "openobserve",
  "log_file_path": "/var/log/pods/monitoring_openobserve-0_1c857c0a-066e-40ba-8676-6c874631f1ca/openobserve/1.log",
  "log_iostream": "stdout",
  "logtag": "F",
  "name": "openobserve",
  "severity": 0,
  "statefulset_kubernetes_io_pod_name": "openobserve-0"
}
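Note how attribute keys appear flattened in OpenObserve: keys such as k8s.pod.name and statefulset.kubernetes.io/pod-name show up as k8s_pod_name and statefulset_kubernetes_io_pod_name. The mapping looks like "replace every non-alphanumeric character with an underscore"; a one-liner illustrating this apparent rule (my inference from the output above, not a documented OpenObserve contract):

```shell
# Apparent flattening rule inferred from the log entry above:
# non-alphanumeric characters in attribute keys become underscores.
echo 'statefulset.kubernetes.io/pod-name' | sed -E 's/[^A-Za-z0-9]/_/g'
```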
Meanwhile, openobserve-collector's default values.yaml specifies complex routing and regular expression named capture groups to extract metadata from log file names:
https://github.com/openobserve/openobserve-helm-chart/blob/b146f802fbd5305c00eba85dc8fa8683680ae3dc/charts/openobserve-collector/values.yaml#L130-L170
Seeing that the upstream's config can produce logs with the metadata k8s_pod_name, k8s_namespace_name, etc. (via the k8sattributes processor) with a simpler config, why does openobserve-collector's values.yaml have these regexes?
> why does openobserve-collector's values.yaml have these regexes?
There is no reason why this could not be updated. The collector helm chart replaced this configuration once the filelog receiver started making use of the new container-parser stanza operator, introduced in the v0.102.0 collector release (this chart defaults to v0.113.0).
I've also been in and out of the helm chart recently, as it's out of date in places; I would appreciate some movement on this.
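For reference, the container operator mentioned in the reply collapses all of that regex routing into a single stanza. A minimal filelog receiver sketch (illustrative only, not a drop-in replacement for the chart's full config):

```yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    include_file_path: true
    operators:
      # The container parser auto-detects docker/containerd/cri-o log formats
      # and derives k8s.pod.name, k8s.namespace.name, k8s.container.name, etc.
      # from the log file path, replacing the old named-capture-group regexes.
      - type: container
```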