docker stats CPU above 100%

Open cdalexndr opened this issue 4 years ago • 13 comments

Description: docker stats CPU shows values above 100%.

Steps to reproduce the issue:

  1. run docker stats while a container is using high cpu

Describe the results you received: CPU column shows values above 100% (110%, 250%...)

Describe the results you expected: CPU column values should be normalized to 100%. Conceptually, header CPU % means max 100%.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        6a30dfc
 Built:             Thu Aug 29 05:26:49 2019
 OS/Arch:           windows/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:32:21 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 11
  Running: 11
  Paused: 0
  Stopped: 0
 Images: 251
 Server Version: 19.03.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.9.184-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.837GiB
 Name: docker-desktop
 ID: PL7Z:37ZA:FGN5:EBPE:KYFT:HSFI:YXJP:2MOK:ESB3:MML3:6G22:7ZIK
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 143
  Goroutines: 152
  System Time: 2019-10-11T20:46:04.5961211Z
  EventsListeners: 4
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Additional environment details (AWS, VirtualBox, physical, etc.): Using 4 core CPU.

cdalexndr avatar Oct 11 '19 20:10 cdalexndr

This is expected behavior on a multi-core host.

AkihiroSuda avatar Oct 14 '19 07:10 AkihiroSuda

I'm having a similar issue with geoserver 2.14.3, but I can't figure it out.

ibrahimawadhamid avatar Nov 25 '19 08:11 ibrahimawadhamid

This is not an issue. If you have N CPU cores, the CPU usage can be up to N * 100%.
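To relate the two conventions, a reading on the docker stats scale can be divided by the core count to get a 0 to 100% share of the whole host. A minimal Python sketch; the helper name `normalize_cpu_percent` is hypothetical and not part of Docker:

```python
def normalize_cpu_percent(reported_percent: float, n_cores: int) -> float:
    """Rescale a reading where 100% means one full core (max n_cores * 100)
    to a share of the whole host (max 100)."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return reported_percent / n_cores


# The 4-core host from the report: 250% is 2.5 cores' worth of work,
# i.e. 62.5% of the machine's total CPU capacity.
print(normalize_cpu_percent(250.0, 4))  # 62.5
print(normalize_cpu_percent(110.0, 4))  # 27.5
```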

AkihiroSuda avatar Nov 25 '19 08:11 AkihiroSuda

I don't think there is a system monitor tool (linux or windows) that shows CPU usage above 100%. Conceptually, 100 percent means maximum value.

cdalexndr avatar Nov 25 '19 09:11 cdalexndr

The main issue is that geoserver is consuming all of these resources when nothing is being processed at all, and nobody is even accessing it through the web. Shouldn't geoserver in its idle state just be consuming memory and not a lot of CPU?!

ibrahimawadhamid avatar Nov 25 '19 09:11 ibrahimawadhamid

"I don't think there is a system monitor tool (linux or windows) that shows CPU usage above 100%."

systemd-cgtop does.

"Shouldn't geoserver in its idle state just be consuming memory and not a lot of CPU?!"

That seems to be an issue with geoserver.

AkihiroSuda avatar Nov 25 '19 09:11 AkihiroSuda

CPU % is calculated using deltas of total_usage, as per https://github.com/docker/cli/blob/6c12a82f330675d4e2cfff4f8b89a353bcb1fecd/cli/command/container/stats_helpers.go#L180

Here the ratio is multiplied by the number of CPUs. However, cpuDelta already includes usage across all CPUs (see below). Adding all values in percpu_usage gives total_usage (which is used to derive cpuDelta).

        "cpu_stats": {
            "cpu_usage": {
                "percpu_usage": [
                    826860687,
                    830807540,
                    823365887,
                    844077056
                ],
                "total_usage": 3325111170,
                "usage_in_kernelmode": 1620000000,
                "usage_in_usermode": 1600000000
            },
            "online_cpus": 4,
            "system_cpu_usage": 35595977360000000,
            "throttling_data": {
                "periods": 0,
                "throttled_periods": 0,
                "throttled_time": 0
            }
        },

So what is the reason to multiply by the number of CPUs? It seems that is not required, because total_usage already accounts for them.
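One possible reading of the calculation, sketched in Python rather than the CLI's Go. It assumes that system_cpu_usage is host CPU time summed over all cores, in the same units as total_usage. Under that assumption cpu_delta / system_delta is the container's share of the whole host, and multiplying by online_cpus rescales it so that 100% means one full core. The function name `cpu_percent` and the second sample are hypothetical; only the first sample's numbers come from the JSON above.

```python
def cpu_percent(prev: dict, cur: dict) -> float:
    """CPU % between two cpu_stats samples, on a scale where 100 = one full core."""
    cpu_delta = cur["cpu_usage"]["total_usage"] - prev["cpu_usage"]["total_usage"]
    system_delta = cur["system_cpu_usage"] - prev["system_cpu_usage"]
    # Fall back to the length of percpu_usage when online_cpus is missing or 0.
    online_cpus = cur.get("online_cpus") or len(cur["cpu_usage"].get("percpu_usage") or [])
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    return cpu_delta / system_delta * online_cpus * 100.0


prev = {
    "cpu_usage": {"total_usage": 3325111170},
    "system_cpu_usage": 35595977360000000,
    "online_cpus": 4,
}
# Hypothetical next sample, one second later on a 4-core host:
# the host accumulates 4 s of CPU time, the container 2.5 s of it.
cur = {
    "cpu_usage": {"total_usage": 3325111170 + 2_500_000_000},
    "system_cpu_usage": 35595977360000000 + 4_000_000_000,
    "online_cpus": 4,
}

print(cpu_percent(prev, cur))  # 250.0
```

Without the multiplication the same samples would give 62.5, the share of total host capacity, so the factor selects the scale rather than counting the cores twice (again, assuming system_cpu_usage is aggregated over all cores).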

raags avatar Nov 26 '19 10:11 raags

I found and resolved the issue. I had my health check in my docker-compose.yml configured to try every 2 seconds. I changed it to 3 minutes and now everything is fine.

ibrahimawadhamid avatar Nov 26 '19 11:11 ibrahimawadhamid

Ok, but the logic above doesn't seem right. @AkihiroSuda could you please review? I couldn't find anything in git blame.

Just adding that this may not be related to the original issue - I can open a new issue if required.

raags avatar Nov 27 '19 04:11 raags

This looks strange to me: I had to google why my containers go above 100%, and then do the math with the core count myself.

Maybe it would be better to show two values: overall CPU usage with a maximum of 100%, and a separate figure for the load on the cores.

Hronom avatar Sep 30 '20 19:09 Hronom

We're trying to figure out if our container is using CPUs in a healthy way. Can someone clarify how we can do this on a multicore machine? For instance, if I'm understanding correctly, if you have 4 cores and 100% CPU usage, then that's either the 4 cores running at 25% each OR 1 core running at 100%? The former seems "healthy" while the latter is potentially a problem (at least in our use case).

frankandrobot avatar May 10 '22 17:05 frankandrobot

@frankandrobot you can connect to the API endpoint to get the raw information: https://docs.docker.com/engine/api/v1.41/#operation/ContainerStats
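As a rough illustration of what could be done with that raw data, the Python sketch below turns two cpu_stats samples into a per-core percentage, which separates "one core at 100%" from "four cores at 25%". The function names and the sample deltas are hypothetical. It assumes that system_cpu_usage is summed over all cores, so system_delta divided by the core count approximates the time window available on each core, and that percpu_usage is present in the response (it may be missing on some hosts, for example under cgroup v2).

```python
from typing import List


def per_core_percent(prev: dict, cur: dict) -> List[float]:
    """Per-core utilisation (0-100 each) between two cpu_stats samples."""
    prev_cores = prev["cpu_usage"].get("percpu_usage") or []
    cur_cores = cur["cpu_usage"].get("percpu_usage") or []
    system_delta = cur["system_cpu_usage"] - prev["system_cpu_usage"]
    if system_delta <= 0 or not cur_cores:
        return []
    # Assumed per-core time window: host CPU time spread evenly over the cores.
    window = system_delta / len(cur_cores)
    return [(c - p) / window * 100.0 for p, c in zip(prev_cores, cur_cores)]


def make_sample(percpu: List[int], system: int) -> dict:
    """Build a minimal cpu_stats-shaped dict for the example."""
    return {"cpu_usage": {"percpu_usage": percpu}, "system_cpu_usage": system}


base_cores = [826860687, 830807540, 823365887, 844077056]
base_system = 35595977360000000
prev = make_sample(base_cores, base_system)

# Hypothetical samples one second later on a 4-core host.
one_busy_core = make_sample(
    [base_cores[0] + 1_000_000_000] + base_cores[1:], base_system + 4_000_000_000
)
evenly_spread = make_sample(
    [c + 250_000_000 for c in base_cores], base_system + 4_000_000_000
)

print(per_core_percent(prev, one_busy_core))  # [100.0, 0.0, 0.0, 0.0]
print(per_core_percent(prev, evenly_spread))  # [25.0, 25.0, 25.0, 25.0]
```

Both hypothetical cases would show up as roughly 100% in the CPU column, which is why the per-core breakdown is needed to tell them apart.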

thaJeztah avatar May 10 '22 23:05 thaJeztah