`docker_stats` errors
Bug Report
9b5478b307a44da1bd6a97154b878aed92d5e717
Symptom
When building and starting the demo, I'm getting the following errors in the collector logs:
2024-07-22T09:26:55.491Z error [email protected]/docker.go:194 Could not parse docker containerStats for container id {"kind": "receiver", "name": "docker_stats", "data_type": "metrics", "id": "fa060dee854f4a1fc383b84ae1fc694080be8b0d207450f887ce78d8d44f723e", "error": "context canceled"}
github.com/open-telemetry/opentelemetry-collector-contrib/internal/docker.(*Client).toStatsJSON
github.com/open-telemetry/opentelemetry-collector-contrib/internal/[email protected]/docker.go:194
github.com/open-telemetry/opentelemetry-collector-contrib/internal/docker.(*Client).FetchContainerStatsAsJSON
github.com/open-telemetry/opentelemetry-collector-contrib/internal/[email protected]/docker.go:144
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/dockerstatsreceiver.(*metricsReceiver).scrapeV2.func1
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/[email protected]/receiver.go:92
2024-07-22T09:26:55.502Z error scraperhelper/scrapercontroller.go:197 Error scraping metrics {"kind": "receiver", "name": "docker_stats", "data_type": "metrics", "error": "context canceled", "scraper": "docker_stats"}
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).scrapeMetricsAndReport
go.opentelemetry.io/collector/[email protected]/scraperhelper/scrapercontroller.go:197
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).startScraping.func1
go.opentelemetry.io/collector/[email protected]/scraperhelper/scrapercontroller.go:173
Reproduce
Pull the latest changes and run:
docker compose build
docker compose up -d
docker logs otel-col -f
@rogercoll do you see this issue on your end as well?
@julianocosta89 Not on my end. I reckon the issue might be that the dockerstats receiver is failing to retrieve the stats of your container with id fa060dee854f4a1fc383b84ae1fc694080be8b0d207450f887ce78d8d44f723e. What is the state of this container? (docker inspect fa060dee854f4a1fc383b84ae1fc694080be8b0d207450f887ce78d8d44f723e) Is it an opentelemetry-demo container, or is it from another environment?
It is from the demo, yes. I've re-run the demo and checked the errors: they reference a bunch of different IDs, I've just pasted one.
E.g.:
❯ docker inspect 85775
[
{
"Id": "857754a7442f5c1483e8e81b0ec363bc2cd0db9ada6ccc165b29b1f094e8dd96",
"Created": "2024-07-24T08:51:48.892932637Z",
"Path": "/app/shippingservice",
"Args": [],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 10350,
"ExitCode": 0,
"Error": "",
"StartedAt": "2024-07-24T08:51:50.169065304Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "sha256:a16932aa56554f09a6abade21f418093021c0886dd3a4e04798a44d48f89058c",
"ResolvConfPath": "/var/lib/docker/containers/857754a7442f5c1483e8e81b0ec363bc2cd0db9ada6ccc165b29b1f094e8dd96/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/857754a7442f5c1483e8e81b0ec363bc2cd0db9ada6ccc165b29b1f094e8dd96/hostname",
"HostsPath": "/var/lib/docker/containers/857754a7442f5c1483e8e81b0ec363bc2cd0db9ada6ccc165b29b1f094e8dd96/hosts",
"LogPath": "/var/lib/docker/containers/857754a7442f5c1483e8e81b0ec363bc2cd0db9ada6ccc165b29b1f094e8dd96/857754a7442f5c1483e8e81b0ec363bc2cd0db9ada6ccc165b29b1f094e8dd96-json.log",
"Name": "/shipping-service",
"RestartCount": 0,
"Driver": "overlay2",
"Platform": "linux",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "",
"ExecIDs": null,
"HostConfig": {
"Binds": null,
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {
"max-file": "2",
"max-size": "5m",
"tag": "{{.Name}}"
}
},
"NetworkMode": "opentelemetry-demo",
"PortBindings": {
"50050/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "0"
}
]
},
"RestartPolicy": {
"Name": "unless-stopped",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"ConsoleSize": [
0,
0
],
"CapAdd": null,
"CapDrop": null,
"CgroupnsMode": "private",
"Dns": null,
"DnsOptions": null,
"DnsSearch": null,
"ExtraHosts": [],
"GroupAdd": null,
"IpcMode": "private",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"Isolation": "",
"CpuShares": 0,
"Memory": 20971520,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": null,
"DeviceCgroupRules": null,
"DeviceRequests": null,
"MemoryReservation": 0,
"MemorySwap": 41943040,
"MemorySwappiness": null,
"OomKillDisable": null,
"PidsLimit": null,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"MaskedPaths": [
"/proc/asound",
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware",
"/sys/devices/virtual/powercap"
],
"ReadonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
},
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay2/f97a4a48255cc4d685e8111a8df66a295061997d5ef3d8f6b7d6fc5a28c05145-init/diff:/var/lib/docker/overlay2/vkoudxzjfg26swicotvfpwz07/diff:/var/lib/docker/overlay2/ld0z3iz6tax5ery1jv9dxrpp0/diff:/var/lib/docker/overlay2/4b4lfbidxgsoafuntz7lobpdy/diff:/var/lib/docker/overlay2/b4983cc9499d7fb98a4cfd14a33a2afbc0f58d12f14ae8c5e267d7b4b2cb0bf5/diff",
"MergedDir": "/var/lib/docker/overlay2/f97a4a48255cc4d685e8111a8df66a295061997d5ef3d8f6b7d6fc5a28c05145/merged",
"UpperDir": "/var/lib/docker/overlay2/f97a4a48255cc4d685e8111a8df66a295061997d5ef3d8f6b7d6fc5a28c05145/diff",
"WorkDir": "/var/lib/docker/overlay2/f97a4a48255cc4d685e8111a8df66a295061997d5ef3d8f6b7d6fc5a28c05145/work"
},
"Name": "overlay2"
},
"Mounts": [],
"Config": {
"Hostname": "857754a7442f",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": true,
"AttachStderr": true,
"ExposedPorts": {
"50050/tcp": {}
},
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"QUOTE_SERVICE_ADDR=http://quoteservice:8090",
"OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4317",
"OTEL_RESOURCE_ATTRIBUTES=service.namespace=opentelemetry-demo,service.version=1.11.0",
"OTEL_SERVICE_NAME=shippingservice",
"SHIPPING_SERVICE_PORT=50050",
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
],
"Cmd": null,
"Image": "ghcr.io/open-telemetry/demo:latest-shippingservice",
"Volumes": null,
"WorkingDir": "/app",
"Entrypoint": [
"/app/shippingservice"
],
"OnBuild": null,
"Labels": {
"com.docker.compose.config-hash": "cc6d75eae8d68c2311f4f9e92debff17e0007ef15828f3f23fea72777d56f5d2",
"com.docker.compose.container-number": "1",
"com.docker.compose.depends_on": "otelcol:service_started:false",
"com.docker.compose.image": "sha256:a16932aa56554f09a6abade21f418093021c0886dd3a4e04798a44d48f89058c",
"com.docker.compose.oneoff": "False",
"com.docker.compose.project": "opentelemetry-demo",
"com.docker.compose.project.config_files": "/Users/juliano.costa/workspace/opentelemetry-demo/docker-compose.yml",
"com.docker.compose.project.environment_file": "/Users/juliano.costa/workspace/opentelemetry-demo/.env,/Users/juliano.costa/workspace/opentelemetry-demo/.env.override",
"com.docker.compose.project.working_dir": "/Users/juliano.costa/workspace/opentelemetry-demo",
"com.docker.compose.service": "shippingservice",
"com.docker.compose.version": "2.23.3"
}
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "66c1260db5448d30f85f4f7b16aff69a4aafc4442e0f76e635f71c385b6f6179",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {
"50050/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "50511"
}
]
},
"SandboxKey": "/var/run/docker/netns/66c1260db544",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"opentelemetry-demo": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"shipping-service",
"shippingservice",
"857754a7442f"
],
"MacAddress": "02:42:ac:12:00:0c",
"NetworkID": "263aa056fe48f70a72d21a193ec506c59d8d16d15a6d3dab3a11e894aef225bd",
"EndpointID": "05ad3f3671447b16cb1b605d8bc29cd223407a56983030b7f928c986e5eb3c80",
"Gateway": "172.18.0.1",
"IPAddress": "172.18.0.12",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"DriverOpts": null
}
}
}
}
]
But I've asked a friend who uses Ubuntu to try it out, and it is actually working fine there. It seems this is an issue with Docker on Mac 😞
I agree. Actually, it seems that Darwin is not supported by the receiver: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/dockerstatsreceiver#docker-stats-receiver
Many people develop on macOS. It's important that we have a way to disable the docker_stats receiver for them.
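One way to do that today (a minimal sketch, not the demo's exact config: in the demo the collector config lives under src/otelcollector/, and every name here other than docker_stats is illustrative) is to take docker_stats out of the metrics pipeline and restart the collector:

# Sketch -- only the docker_stats removal matters; the other receiver/processor/
# exporter names are placeholders for whatever the demo config already lists.
service:
  pipelines:
    metrics:
      receivers: [otlp, spanmetrics]   # docker_stats removed from this list
      processors: [batch]
      exporters: [otlphttp/prometheus]

Removing it from the pipeline should be enough: a receiver that is defined but not referenced by any pipeline is not started.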
@puckpuck I mentioned in the SIG meeting that the most fascinating thing is that I do get the Docker metrics. I get the errors, but surprisingly, it works 🙃
Same for me. I'm running the OTel Collector as a container on macOS and getting errors like this:
2025-01-21T09:20:23.199Z error [email protected]/obs_metrics.go:61 Error scraping metrics {"kind": "receiver", "name": "docker_stats", "data_type": "metrics", "scraper": "docker_stats", "error": "context canceled; context canceled; context canceled", "errorCauses": [{"error": "context canceled"}, {"error": "context canceled"}, {"error": "context canceled"}]}
go.opentelemetry.io/collector/scraper/scraperhelper.newObsMetrics.func1
go.opentelemetry.io/collector/scraper/[email protected]/obs_metrics.go:61
go.opentelemetry.io/collector/scraper.ScrapeMetricsFunc.ScrapeMetrics
go.opentelemetry.io/collector/[email protected]/metrics.go:24
go.opentelemetry.io/collector/scraper/scraperhelper.(*controller).scrapeMetricsAndReport
go.opentelemetry.io/collector/scraper/[email protected]/scrapercontroller.go:195
go.opentelemetry.io/collector/scraper/scraperhelper.(*controller).startScraping.func1
go.opentelemetry.io/collector/scraper/[email protected]/scrapercontroller.go:178
But metrics are processed somehow.
I wish there was at least a mute_errors flag for this case.
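There is no such flag; the closest heavy-handed workaround I can think of (a sketch, assuming the collector's standard service::telemetry::logs setting) is to raise the collector's own log level above error, which hides these messages along with every other error:

service:
  telemetry:
    logs:
      # "fatal" is above "error", so the recurring scraper errors are no longer
      # logged -- but neither is any other error, so use with care
      level: fatal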
I wonder if we should keep this issue open here or track it in https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34194.
Currently there are no active code owners for the dockerstats receiver. I'm guessing the component will be deprecated soon.
My observation on macOS is a bit different: it works partially. A few containers do not show up in the metrics, and they are:
- shipping
- quote
- image-provider
I do not see them at all in container metrics; they are all missing.
Another observation, not sure whether it is related: if I enable resource detectors for tracing, e.g. OTEL_RESOURCE_DETECTORS=os,container, these services do not fetch the resources and the corresponding trace resource attributes are not available.
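For reference, enabling the detectors for a single service can be sketched as a compose override (the service name is taken from the inspect output above; any other demo service would follow the same pattern):

# docker-compose.override.yml -- hypothetical override file, merged automatically
# by docker compose on top of the demo's docker-compose.yml
services:
  shippingservice:
    environment:
      - OTEL_RESOURCE_DETECTORS=os,container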