Memory Leak

Open knuclechan opened this issue 1 year ago • 46 comments

Description

Docker-related processes eventually use up the computer's memory after running for a day or two. Using RAMMap, I found that this is related to the page table: a huge number (over a thousand) of docker, conhost, com.docker.cli and cmd entries remain in the page table, and each of them takes up 36 KB of memory.

I tried closing all the containers, Docker Desktop, WSL and all other applications, but the memory was not released at all.

I suspect there is a memory leak somewhere related to Docker Desktop.

I am using Windows 10 Pro.

Below is the memory usage:

[image] [image] All my memory is used up, but according to the Processes tab I am not using more than 2 GB of memory.

[image] An unreasonable amount of memory is used by the page table.

[image] This is the first page of processes sorted by PID. It shows that there are a lot of docker, conhost, com.docker.cli and cmd entries.

[image] There are far more when I sort them by name. Each of them takes up 36 KB of page table memory. I have not listed the other processes one by one.
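To put the per-entry figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the "36K" reported by RAMMap means 36 KiB per leftover entry and that every entry costs the same; the entry counts in the loop are illustrative inputs, not measurements.

```python
# Rough estimate of page-table memory held by leftover process entries.
# Assumption: each entry costs 36 KiB, the figure reported in this issue.

KIB_PER_ENTRY = 36


def page_table_mib(entry_count: int, kib_per_entry: int = KIB_PER_ENTRY) -> float:
    """Return the total page-table footprint in MiB for entry_count entries."""
    return entry_count * kib_per_entry / 1024


if __name__ == "__main__":
    # Illustrative counts only; substitute the number RAMMap shows you.
    for count in (1_000, 10_000, 100_000):
        print(f"{count:>7} entries -> {page_table_mib(count):8.1f} MiB")
```

At this rate a thousand entries account for only about 35 MiB, so a multi-gigabyte page table would imply a far larger number of entries, or entries bigger than 36 KiB.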

Reproduce

Just run Docker Desktop for a few days.

Expected behavior

No response

docker version

Client:
 Cloud integration: v1.0.35+desktop.10
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        4debf41
 Built:             Tue Feb  6 21:13:02 2024
 OS/Arch:           windows/amd64
 Context:           default

Server: Docker Desktop 4.27.2 (137060)
 Engine:
  Version:          25.0.3
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       f417435
  Built:            Tue Feb  6 21:14:25 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.28
  GitCommit:        ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    25.0.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1-desktop.4
    Path:     C:\Program Files\Docker\cli-plugins\docker-buildx.exe
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.5-desktop.1
    Path:     C:\Program Files\Docker\cli-plugins\docker-compose.exe
  debug: Get a shell into any image or container. (Docker Inc.)
    Version:  0.0.24
    Path:     C:\Program Files\Docker\cli-plugins\docker-debug.exe
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-dev.exe
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.21
    Path:     C:\Program Files\Docker\cli-plugins\docker-extension.exe
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.4
    Path:     C:\Program Files\Docker\cli-plugins\docker-feedback.exe
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.0.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-init.exe
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-sbom.exe
  scout: Docker Scout (Docker Inc.)
    Version:  v1.4.1
    Path:     C:\Program Files\Docker\cli-plugins\docker-scout.exe

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 3
 Server Version: 25.0.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
 Kernel Version: 5.10.102.1-microsoft-standard-WSL2
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 214.6MiB
 Name: docker-desktop
 ID: 95e362d5-9aff-4023-9a84-15cf757cbb95
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support
WARNING: daemon is not using the default seccomp profile

Diagnostics ID

50A47C85-7750-446A-9865-A69828201F7D/20240223100956

Additional Info

No response

knuclechan avatar Feb 23 '24 10:02 knuclechan

I have had the same issue for 2 months… I'm just using nginx proxy manager in Docker. Win 11 Pro

Please let me know if you find a solution :)

Vakium avatar Mar 07 '24 12:03 Vakium

we fixed a memory leak with 4.27.2 - can you confirm if the leak is present after restarting Docker Desktop?

bsousaa avatar Mar 07 '24 12:03 bsousaa

we fixed a memory leak with 4.27.2 - can you confirm if the leak is present after restarting Docker Desktop?

my current version is 4.27.2 :/

Vakium avatar Mar 07 '24 12:03 Vakium

I've updated to v4.28. The leak continues to happen, but it seems that after I close the containers and Docker Desktop, the leak cleans itself up: the page table's size dropped from 4 GB to 400 MB. I am not 100% sure yet and will need a few days to observe.

Previously, the leak remained after closing Docker Desktop, and I needed to restart the computer every day.

knuclechan avatar Mar 12 '24 11:03 knuclechan

v4.28.0 has the same problem image

image

A huge number of com.docker.cli and docker.exe page table entries after a PC uptime of a single day and half a day of running containers. Stopping the engine doesn't help.

Zekfad avatar Mar 21 '24 17:03 Zekfad

Unsure if this is caused by the same issue or not, but I've had this issue with docker for some time, but it's so intermittent that by the time you try and catch it, the system crashes due to lack of memory. Normally I'm not interacting with, or using Docker at the time. I wasn't interacting with Docker this morning.

Docker Desktop (windows) v4.28.0, Ryzen 5 3600, 32gb RAM. WSL2.

First sign I noticed something was wrong, was Visual Studio complained about not having enough RAM to debug - quickly opened Task Manager and noticed (before it too, locked up) that I had 483 background processes. My baseline is 129. There was around 98% RAM usage.

I managed to terminate Chrome and Visual Studio, which gave me enough memory back that Task Manager would load properly (TIL it has a "low memory" mode!). I then managed to quit Docker Desktop by using the tray icon - and after what seemed like an age - it quit, and memory usage dropped to around 30%.

Apologies for the rubbish images, the screenshot utility wasn't working and the computer was on the verge of crashing! I believe I have a copy of the logs directory if you can tell me what you need.

Edit: To add, I'm pretty sure I had no containers running at the time.

Screenshot 2024-04-12 060150 Screenshot 2024-04-12 060205

sparxooo avatar Apr 12 '24 05:04 sparxooo

Hi folks, this is Cesar Talledo, I am a developer at Docker. Thanks for finding the issue, and apologies for the inconvenience it has caused and for the belated response.

From the problem description above, it's clear there's a bug somewhere that is causing Docker Desktop (or some other program) to unnecessarily spawn lots of com.docker.cli processes, which over time are eating up host resources (e.g., memory).

However, I tried reproducing on a Windows host with Docker Desktop 4.27.2 and 4.28 (WSL), but haven't been able to do so. So there must be something different between your environment and mine that's triggering the bug.

@sparxooo: if possible, could you capture and upload a Docker Desktop diagnostics bundle so I can get more info about your environment? It would be best if you can capture the bundle as you start seeing the number of com.docker.cli processes growing without reason, but before your machine runs out of resources.
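One way to notice the growth early is to tally live processes by image name. The following is a minimal sketch, not an official Docker tool: it assumes a Windows host where `tasklist /FO CSV /NH` is available, and the set of watched names is taken from the reports in this thread (the `.exe` suffix is stripped before comparing). `tasklist` only sees processes that are still running, so its numbers may not match what RAMMap lists.

```python
import csv
import io
import subprocess
from collections import Counter

# Image names reported in this thread, without the ".exe" suffix (assumption).
WATCHED = {"com.docker.cli", "docker", "conhost", "cmd"}


def count_images(tasklist_csv: str) -> Counter:
    """Count processes per image name from `tasklist /FO CSV /NH` output."""
    counts = Counter()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if not row:  # skip blank lines
            continue
        name = row[0].lower()
        if name.endswith(".exe"):
            name = name[:-4]
        counts[name] += 1
    return counts


def snapshot() -> Counter:
    """Run tasklist (Windows only) and return per-image process counts."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_images(out)


if __name__ == "__main__":
    counts = snapshot()
    for name in sorted(WATCHED):
        print(f"{name}: {counts.get(name, 0)}")
```

Running this every few minutes and comparing the numbers would show whether the watched processes keep accumulating, which is a reasonable moment to capture the diagnostics bundle.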

@knuclechan: thanks for capturing the diagnostics bundle for Docker Desktop. Unfortunately because of my belated response to this, the bundle is no longer available :(

Also, any other info you can provide regarding Docker Desktop config would be helpful; e.g.,:

  1. Does the problem reproduce consistently?

  2. Do you have any Docker Desktop extensions installed?

  3. Is Docker Desktop's resource saver on when the problem occurs?

  4. Is Docker Desktop integrated with your WSL distro (Settings->Resources->WSL integration)?

  5. Do you have VS-Code + Docker extension installed? If so, what happens if you close VS-Code?

Thanks!

ctalledo avatar Apr 15 '24 22:04 ctalledo

In case the problem is related to the interaction of docker stats --no-stream with Docker Desktop's resource saver mode, here's an ** unofficial ** Docker Desktop 4.30 pre-release build that has a fix:

https://desktop-stage.docker.com/win/main/amd64/146335/Docker%20Desktop%20Installer.exe

If possible, please check if the problem reproduces with this build or not.

Thanks!

ctalledo avatar Apr 15 '24 23:04 ctalledo

Stumbled upon this prerelease by chance and it fixes most of the memory issues I had the past few weeks. 👍🏻

luastoned avatar Apr 16 '24 09:04 luastoned

Stumbled upon this prerelease by chance and it fixes most of the memory issues I had the past few weeks. 👍🏻

Thanks @luastoned for trying it out. Do you know if the memory issues you had were related to lots of com.docker.cli processes eating up host resources? (see comments above).

ctalledo avatar Apr 16 '24 15:04 ctalledo

@ctalledo RAMMap had my page table at ~5GB before the fix, now it's closer to ~1GB. com.docker.cli, conhost.exe and docker.exe still take up 90% of the list in the Processes tab though.

I did notice that vmmem steadily consumes memory up to the maximum allowed via .wslconfig; freeing it is only possible via Empty Working Sets in RAMMap.

luastoned avatar Apr 16 '24 15:04 luastoned

Thanks @luastoned.

com.docker.cli, conhost.exe and docker.exe still take up 90% of the list in the Processes tab though.

Mmm ... so you are still seeing lots of com.docker.cli processes with the DD 4.30 pre-release build I posted above?

I did notice that vmmem steadily consumes memory up to the maximum allowed via .wslconfig; freeing it is only possible via Empty Working Sets in RAMMap.

That could be because the Linux kernel inside the WSL VM steadily fills up its kernel buffer cache, without releasing the associated memory back to the host (because it does not know it's running in a VM). Have you tried enabling the autoMemoryReclaim feature in WSL?

autoMemoryReclaim (type: string, default: disabled): Automatically releases cached memory after detecting idle CPU usage. Set to gradual for slow release, and dropcache for instant release of cached memory.

If not, try setting it to gradual and see if that helps.
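For reference, a minimal sketch of how that setting might look in `%UserProfile%\.wslconfig`. Placing the key under `[experimental]` follows Microsoft's WSL documentation as of the time of this thread and is an assumption here; check the documentation for your WSL version.

```
[experimental]
autoMemoryReclaim=gradual
```

WSL needs to be restarted (for example with `wsl --shutdown`) before a change to this file takes effect.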

ctalledo avatar Apr 16 '24 18:04 ctalledo

My current DD version is 4.30.0 (146335)

After starting my machine this morning I dumped the process list in RAMMap (~5 minutes uptime): 45k process list entries, 42k are docker.exe, com.docker.cli and conhost.exe

I do have autoMemoryReclaim set to dropcache but I'll try gradual now.

luastoned avatar Apr 17 '24 08:04 luastoned

Thanks @luastoned .

45k process list entries, 42k are docker.exe, com.docker.cli

Mmm ... let me see if I can repro with RAMMap.

I do have autoMemoryReclaim set to dropcache but I'll try gradual now.

I don't think that will make a difference unfortunately.

ctalledo avatar Apr 17 '24 17:04 ctalledo

Hi @luastoned,

45k process list entries, 42k are docker.exe, com.docker.cli

I tried reproducing by starting DD 4.30.0 (146335) on my Windows hosts and leaving it idle for 5 minutes, then checked RAMMap. However I am not able to repro. I only see a handful of Docker Desktop and com.docker.* processes.

So there's something different in your environment that must be spawning all those com.docker.cli processes.

Could you upload a Docker Desktop diagnostics bundle please?

Also, what Docker Desktop extensions (if any) do you have installed? And do you have any VS-Code extensions that could be invoking Docker commands?

Thanks again!

ctalledo avatar Apr 17 '24 17:04 ctalledo

Hi @ctalledo

I did more digging into that issue, and it seems a recent update modified the Windows power plans, which turned fast boot on again (i.e., not clearing the page table, etc.). After fixing this there are no more Docker-related entries after a reboot, so I will monitor the growth/usage today and update this message later on.

Well, after a day of work the page table process count is ~52k and docker related things add up to 45k :(

luastoned avatar Apr 18 '24 09:04 luastoned

Hi @ctalledo, apologies for not responding earlier. I will try and get a diagnostics bundle sorted if it does it again - but it's very intermittent.

Do you wish me to update to the v4.30 version or remain on the version i'm on, just in case I can repro it?

Cheers,

sparxooo avatar Apr 18 '24 18:04 sparxooo

Thanks @sparxooo for the response.

Do you wish me to update to the v4.30 version or remain on the version i'm on, just in case I can repro it?

If you get a chance to reproduce the "lots of com.docker.cli processes" and capture the Docker Desktop diagnostics bundle before the machine runs out of memory, that would be amazing. And if that's the case, try the v4.30 version I added and check if that resolves the issue please.

Thanks for all the help!

ctalledo avatar Apr 18 '24 19:04 ctalledo

Hello @ctalledo ,

I am facing a similar issue with Docker Desktop 4.29.0 (145265).

As you pointed to extensions, I disabled the Docker extension in VS Code. The calls to the docker CLI were reduced by nearly half. Then I closed VS Code 2 hours ago. The calls to the docker CLI every few seconds have stopped. Zero calls to docker and conhost too.

I will try opening VS Code and monitor again. If I can capture diagnostics as suggested in the previous post, I will do so and post them here.

jd4u avatar Apr 19 '24 08:04 jd4u

@ctalledo,

Local diagnostic run says "[FAIL] DD0004: is the Docker engine running? unable to create docker client: protocol not available"

But 3 containers are already running.

Screenshot 2024-04-19 141628

jd4u avatar Apr 19 '24 08:04 jd4u

@ctalledo Here are the diagnostic ids

1] 7BE15776-926F-4950-8DA6-9CF88046EC6C/20240419085152, created approx. 5 hours ago [RAM used was around 8 GB]

2] 7BE15776-926F-4950-8DA6-9CF88046EC6C/20240419135801, created just now [RAM used is 14.3 GB]

Here is the RamMap saved for your review.

jd4u avatar Apr 19 '24 14:04 jd4u

Hi @jd4u,

Thank you very much for the additional info.

As you pointed to extensions, I disabled the Docker extension in VS Code. The calls to the docker CLI were reduced by nearly half. Then I closed VS Code 2 hours ago. The calls to the docker CLI every few seconds have stopped. Zero calls to docker and conhost too.

OK so that means the problem is likely caused by an interaction between VSCode and Docker Desktop.

Would you mind trying with the 4.30 pre-release build I posted above?

Local diagnostic run says "[FAIL] DD0004: is the Docker engine running? unable to create docker client: protocol not available"

Thanks for capturing the diagnostics, and please ignore that error (it's a bug in the diagnostics gathering that will be fixed in the next Docker Desktop release).

ctalledo avatar Apr 19 '24 17:04 ctalledo

@ctalledo ,

There is no difference even with 4.30.0. The machine was restarted and the initial RAM usage was 3.6 GB. Now 10+ GB is visible in the screenshot. VS Code, Docker Desktop, Chrome and Explorer are the only programs running.

In 40 minutes there is a huge list of com.docker.cli, conhost, cmd, docker and relatively smaller list of wsl in RamMap

Screenshot 2024-04-20 124023

jd4u avatar Apr 20 '24 07:04 jd4u

Hi @jd4u,

There is no difference even with 4.30.0. The machine was restarted and the initial RAM usage was 3.6 GB. Now 10+ GB is visible in the screenshot. VS Code, Docker Desktop, Chrome and Explorer are the only programs running.

In 40 minutes there is a huge list of com.docker.cli, conhost, cmd, docker and relatively smaller list of wsl in RamMap

Thanks for trying that experiment, I appreciate it.

So this means the problem is triggered by the VS-Code Docker Desktop Extension, and likely not related to Docker Desktop's resource saver mode. Let me try harder to reproduce locally so I can further debug.

Thanks again for all the help!

ctalledo avatar Apr 22 '24 16:04 ctalledo

While using 4.30.0 with VS Code, the VS Code Docker Extension is disabled. The Docker extension is from Microsoft with version 1.29.0 .

(To investigate further, today I stopped keeping open the terminal login session that started all the containers.)

image

jd4u avatar Apr 22 '24 17:04 jd4u

Sorry for the delay getting back to you.

Today at approx 1630-1635 (GMT+1), Docker was in "Resource Saver" mode, and the engine was paused. I spotted the influx of "docker" --stats processes with the increase in memory and sluggishness of the computer and also Task Manager. Generating a diagnostic bundle appeared to kill all of the "docker" --stats processes... and resumed the engine.

VS Code was not running at the time, only Chrome and Word.

As said previously, I'm still on v4.28 so I will now update to v4.30 and see if the problem reoccurs.

Diag id: 4E2EBDAD-12B0-4893-8641-E1727F59DB93/20240424153541

sparxooo avatar Apr 24 '24 15:04 sparxooo

Even with "Resource Saver" mode disabled, the issue persists! (Docker Desktop v4.30.0, VS Code, VS Code Docker Extension disabled, Terminal WSL2 Linux session on)

jd4u avatar Apr 25 '24 17:04 jd4u

Thanks @sparxooo and @jd4u for the additional info, very helpful. I am still debugging it but I agree it's not related to Docker Desktop's resource saver feature.

@jd4u : can you capture and upload a Docker Desktop diagnostics bundle please? That will really help us root-cause since I can't reproduce locally yet.

Also, when the problem occurs:

  1. What happens if you run docker run --all --no-trunc --no-stream from a terminal? Does it work correctly?

  2. What happens if you stop all containers? e.g., docker stop -t0 $(docker ps -aq) followed by docker rm $(docker ps -aq)
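The `$(docker ps -aq)` substitution in step 2 depends on the shell (it is not understood by `cmd.exe`, for instance). As a shell-independent alternative, the same sequence could be sketched as follows. This is a hypothetical helper, not part of Docker; it assumes the `docker` CLI is on `PATH`, and `runner` is an injection point added here so the logic can be exercised without Docker.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def stop_and_remove_all(runner: Runner = run) -> List[str]:
    """Stop (with a 0 s timeout) and remove every container; return their IDs."""
    ids = runner(["docker", "ps", "-aq"]).split()
    if ids:  # docker stop / docker rm fail when given no container IDs
        runner(["docker", "stop", "-t", "0", *ids])
        runner(["docker", "rm", *ids])
    return ids


if __name__ == "__main__":
    handled = stop_and_remove_all()
    print(f"Stopped and removed {len(handled)} container(s)")
```

Note that this removes all containers, including stopped ones, so it is only appropriate on a machine where losing them is acceptable.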

Thanks again!

ctalledo avatar Apr 25 '24 18:04 ctalledo

7BE15776-926F-4950-8DA6-9CF88046EC6C/20240426143507

image

image

jd4u avatar Apr 26 '24 14:04 jd4u

Following are the results of

  1. What happens if you stop all containers? e.g., docker stop -t0 $(docker ps -aq) followed by docker rm $(docker ps -aq)

image

  1. What happens if you run docker run --all --no-trunc --no-stream from a terminal? Does it work correctly?

image

--no-trunc, --all, --no-stream: all these are unknown flags
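For context, `--all`, `--no-trunc` and `--no-stream` are options of `docker stats` rather than `docker run`, which would explain the "unknown flag" errors; `docker stats --no-stream` is also the command mentioned earlier in this thread. A minimal sketch of what was probably the intended check, assuming the `docker` CLI is on `PATH`; the function names and the 30-second timeout are arbitrary choices made here:

```python
import subprocess
from typing import List, Optional


def stats_command() -> List[str]:
    """Build the one-shot stats command for all containers, untruncated."""
    return ["docker", "stats", "--all", "--no-trunc", "--no-stream"]


def stats_once(timeout_s: float = 30.0) -> Optional[str]:
    """Return one stats snapshot, or None if docker is missing, hangs, or fails."""
    try:
        result = subprocess.run(
            stats_command(), capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return None
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    output = stats_once()
    print(output if output is not None else "docker stats did not complete")
```

If the call times out while the problem is occurring, that would be a useful detail to include alongside the diagnostics ID.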

jd4u avatar Apr 26 '24 14:04 jd4u