Accessing remote server via DOCKER_HOST eats all memory
Accessing a remote server via SSH and running a command there eats all of the server's memory. Running the same command on the server itself causes no problem.
For instance, I have a docker compose file on my local machine. If I run the command below, it eats all the memory and the server shuts down.
DOCKER_HOST=ssh://blabla docker compose up
But if I copy the same compose file to the server and run docker compose up there, the command only uses ~50MB of memory.
Can you provide more details, otherwise this may be difficult to look into;
- can you provide the output of
docker version
- can you provide the output of
DOCKER_HOST=ssh://blabla docker info
- if your local machine is running macOS or Windows and has Docker Desktop installed, does the problem also reproduce if you use
DOCKER_HOST=ssh://blabla com.docker.cli compose up
(so using com.docker.cli instead of docker)?
- does the problem reproduce if you call the docker compose component directly (in "standalone" mode)? You can do so by using the compose binary directly (it's likely installed in /usr/local/lib/docker/cli-plugins/, but this path may depend on how you installed it):
DOCKER_HOST=ssh://blabla /usr/local/lib/docker/cli-plugins/docker-compose up
- can you provide the docker compose file you're using? If the compose file depends on private source code or non-public images, are you able to provide a "minimal" docker compose file to reproduce the issue (that doesn't depend on your private source and non-public images)?
My local machine uses Docker Desktop, but the issue also exists when I run the same command from GitLab CI. And yes, using com.docker.cli reproduces the issue.
here is a video of the issue
docker version from the server:
Client:
Version: 20.10.7
API version: 1.41
Go version: go1.15.14
Git commit: f0df350
Built: Wed Nov 17 03:05:36 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Server:
Engine:
Version: 20.10.7
API version: 1.41 (minimum version 1.12)
Go version: go1.15.14
Git commit: b0f5bc3
Built: Wed Nov 17 03:06:14 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.6
GitCommit: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc:
Version: 1.0.0
GitCommit: 84113eef6fc27af1b01b3181f31bbaf708715301
docker-init:
Version: 0.19.0
GitCommit: de40ad0
DOCKER_HOST=... docker info
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc., v0.8.1)
compose: Docker Compose (Docker Inc., v2.3.3)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
Containers: 28
Running: 1
Paused: 0
Stopped: 27
Images: 40
Server Version: 20.10.7
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc version: 84113eef6fc27af1b01b3181f31bbaf708715301
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.102-99.473.amzn2.x86_64
Operating System: Amazon Linux 2
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 965.5MiB
Name: ip-172-31-39-226.eu-central-1.compute.internal
ID: ROM7:G3CD:UZ5W:OC3Q:347K:BD5Y:RDOY:NU4R:JHIW:L5Q6:BBNW:7XLN
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
services:
  mongo:
    image: mongo
  postgres:
    image: postgres
  redis:
    image: redis
  nginx:
    image: nginx
  node:
    image: node
Also, I noticed that using docker stack deploy has no such issue. It works as it is supposed to.
The memory usage by dockerd occurs because, when running docker/docker-compose without -d (or even with -d, though then only for a few seconds), the server creates many sshd processes, each with its own docker child and threads, that consume a big chunk of memory:
Client
---
$ DOCKER_HOST=ssh://<User-name>@<Server-IP> docker compose up
Server
---
$ pstree -p $(pgrep -f '/usr/sbin/sshd -D')
sshd(246356)─┬─sshd(337077)───sshd(337114)───docker(337115)─┬─{docker}(337116)
│ ├─{docker}(337117)
│ ├─{docker}(337118)
│ ├─{docker}(337119)
│ ├─{docker}(337120)
│ ├─{docker}(337121)
│ ├─{docker}(337122)
│ ├─{docker}(337123)
│ ├─{docker}(337124)
│ └─{docker}(337125)
├─sshd(337133)───sshd(337170)───docker(337171)─┬─{docker}(337172)
│ ├─{docker}(337173)
│ ├─{docker}(337174)
│ ├─{docker}(337175)
│ ├─{docker}(337176)
│ ├─{docker}(337177)
│ ├─{docker}(337178)
│ ├─{docker}(337179)
│ ├─{docker}(337180)
│ └─{docker}(337181)
├─sshd(337182)───sshd(337219)───docker(337220)─┬─{docker}(337221)
.
.
.
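To put a number on what the pstree output above suggests, one option is to sum the resident memory of the sshd and docker processes on the server. The following is a minimal sketch, not something taken from this thread: `sum_rss_kib` is a hypothetical helper name, and summing RSS counts shared pages once per process, so the total is an upper bound rather than an exact figure.

```shell
# Hypothetical helper: sum resident memory (KiB) for one command name.
# Reads "RSS COMMAND" pairs on stdin, as printed by: ps -e -o rss= -o comm=
sum_rss_kib() {
  awk -v name="$1" '$2 == name { total += $1 } END { print total + 0 }'
}

# On the server, while `docker compose up` is attached from the client:
#   ps -e -o rss= -o comm= | sum_rss_kib sshd
#   ps -e -o rss= -o comm= | sum_rss_kib docker
```

Comparing the two totals before and after starting the remote command should show how much of the growth belongs to the SSH sessions and how much to the remote docker processes.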
Hm.. right, yes, so it would be attaching to each container in the compose stack to stream the output; I can imagine that causing more overhead, especially with ssh here. Wondering if we can make it reuse connections or something along those lines.
/cc @AkihiroSuda @ndeloof perhaps you have ideas?
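On the idea of reusing connections: because the CLI shells out to the local ssh binary, OpenSSH's own connection multiplexing (ControlMaster) is one thing that could be tried on the client side. The sketch below only prints a config stanza; `ssh_mux_stanza` and the host alias are hypothetical names, the `%C` token assumes a reasonably recent OpenSSH, and the 10m persistence value is an arbitrary choice. Whether this actually lowers memory use on the server is an assumption to verify, since each multiplexed session would presumably still start its own remote docker process.

```shell
# Hypothetical helper: print an ssh_config stanza that enables OpenSSH
# connection multiplexing for one host alias.
ssh_mux_stanza() {
  host_alias="$1"
  cat <<EOF
Host ${host_alias}
  ControlMaster auto
  ControlPath ~/.ssh/control-%C
  ControlPersist 10m
EOF
}

# Example: review the stanza, then add it to ~/.ssh/config by hand.
ssh_mux_stanza my-docker-server
```

With such a stanza in place, repeated ssh invocations to that host should share one TCP connection instead of opening a new one each time.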
Maybe we should re-revert this (with some fix)?
- https://github.com/docker/cli/pull/2303
I will work on this
I don't think this issue is related to the cli, nor would it be solved by https://github.com/docker/cli/pull/2303
- Killing the extra ssh processes on the Docker server doesn't reduce memory usage:
Client
export DOCKER_HOST=ssh://<User-name>@<Server-IP>
cat > docker-compose.yaml <<EOF
services:
  mongo:
    image: mongo
  postgres:
    image: postgres
  redis:
    image: redis
  nginx:
    image: nginx
  node:
    image: node
EOF
docker-compose up
Server
sudo pstree -p $(pgrep -f '/usr/sbin/sshd -D')
sshd(5156)─┬─sshd(825648)───sshd(825707)───bash(825708)───sudo(941496)───sudo(941497)───pstree(941498)
├─sshd(936369)───sshd(936406)───docker(936407)─┬─{docker}(936408)
│ ├─{docker}(936409)
│ ├─{docker}(936410)
│ ├─{docker}(936411)
│ ├─{docker}(936412)
│ ├─{docker}(936413)
│ ├─{docker}(936414)
│ ├─{docker}(936415)
│ ├─{docker}(936416)
│ └─{docker}(936417)
├─sshd(938070)───sshd(938147)───docker(938260)─┬─{docker}(938262)
│ ├─{docker}(938263)
│ ├─{docker}(938264)
│ ├─{docker}(938265)
sudo kill -9 938070 938309 ... <last ssh processID> ## from second docker ssh connections
- Running comparable commands over plain ssh has a memory footprint similar to Docker's; the two loops below consume roughly the same amount of RAM on the server:
$ for i in `seq 10`;
> do ssh -nttf <user-name>@<docker-server-ip> "docker run -it busybox top" 2>&1 &
> done
$ for i in `seq 60`;
> do ssh -nttf <user-name>@<docker-server-ip> "top" 2>&1 &
> done
I can speak to this; the way docker works over SSH remote appears to be:
- Client machine executes docker-cli with ssh://
- Client docker-cli uses the client machine's ssh binary to connect to ->
- The remote machine sshd server, which then receives an exec (SSH protocol request) from the client to:
  - execute the undocumented command docker system dial-stdio as the SSH user
  - which then turns stdin and stdout into basically a stream transport for the dockerd REST API.
In summary:
client docker-cli <-stdio-> ssh <-tcp-> sshd <-stdio-> remote docker-cli <-unix/npipe-> dockerd
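The chain summarized above can be exercised by hand, which may help when debugging. This is a rough sketch under assumptions: `ping_request` and `ping_over_ssh` are hypothetical helper names, the target is a placeholder for your own user and host, and the remote user needs docker installed and access to the daemon socket. It sends a plain HTTP request for the Engine API `/_ping` endpoint through ssh into docker system dial-stdio, which is a hidden command not meant for manual use, so its behavior may differ between versions.

```shell
# Hypothetical helper: a minimal HTTP/1.0 request for the Engine API
# ping endpoint.
ping_request() {
  printf 'GET /_ping HTTP/1.0\r\n\r\n'
}

# Hypothetical helper: push the request through the same path the CLI uses
# (local ssh binary -> remote sshd -> docker system dial-stdio -> dockerd).
# $1 is a placeholder such as user@server.
ping_over_ssh() {
  ping_request | ssh -T "$1" docker system dial-stdio
}

# Usage:
#   ping_over_ssh user@server
```

While the command is running, the server should briefly show one extra sshd session with a docker child, matching the shape seen in the process trees earlier in this thread.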
While I myself am not too familiar with compose's internals, I'd think that a docker compose up command with many images may create multiple SSH connections, which appear as forks of the remote sshd process.
I'm currently workshopping a somewhat better solution here. I haven't opened a PR yet, pending further testing, potential cross-platform issues, and error handling, as well as the implementation on the Docker CLI side here. The high-level overview of the changes I plan to make (so far) is:
- Sidestep dial-stdio by serving the REST API directly on a separate listener with SSH acting as an encrypted transport (courtesy of golang.org/x/crypto/ssh)
- Have an SSH dialler native to Docker CLI which doesn't rely on an external ssh client binary
- Multiplex concurrent connections (if needed?[1]) to the same remote host using SSH session channels (of which there can be multiple under the single TCP/SSH connection)
- Optional use of SSH user keys and host key certificates to provide mutual authentication, a la TLS.
Hopefully with this architecture, there's less memory overhead as there would hypothetically be just the one process, dockerd, which handles concurrent connections from Docker CLI clients.
[1] I'm not too certain if this is actually needed, but it is a nice feature. I've already pushed code on my fork to take an accepted ssh.Conn, and pass it to a goroutine which continuously demultiplexes session channel requests into a net.Conn interface for the apiserver to Accept and run with.
I'm using a remote SSH Docker context on macOS running Docker Desktop to deploy stacks to my server. Here's the output of docker info on my local system:
Client:
Version: 24.0.2
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.10.5
Path: /Users/ellie/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.18.1
Path: /Users/ellie/.docker/cli-plugins/docker-compose
deployx: Deploy a new stack or update an existing stack (aaraney)
Version: 0.0.1
Path: /Users/ellie/.docker/cli-plugins/docker-deployx
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.0
Path: /Users/ellie/.docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.19
Path: /Users/ellie/.docker/cli-plugins/docker-extension
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v0.1.0-beta.4
Path: /Users/ellie/.docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /Users/ellie/.docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.26.0
Path: /Users/ellie/.docker/cli-plugins/docker-scan
scout: Command line tool for Docker Scout (Docker Inc.)
Version: v0.12.0
Path: /Users/ellie/.docker/cli-plugins/docker-scout
Server:
Containers: 2
Running: 0
Paused: 0
Stopped: 2
Images: 27
Server Version: 24.0.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc version: v1.1.7-0-g860f061
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.15.49-linuxkit-pr
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 7.668GiB
Name: docker-desktop
ID: 7c813daa-98e6-446a-9a03-0b4ec69bf2e1
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
I left my computer on overnight, and when I checked my server's metrics I noticed sshd was using almost 6 GB of memory. There were hundreds of these ssh sessions and docker system dial-stdio processes running on my server:
root 11881 0.0 0.1 25484 9472 ? Ss 04:27 0:00 sshd: ellie [priv]
ellie 11887 0.0 0.0 25624 6412 ? S 04:27 0:00 sshd: ellie@notty
ellie 11889 0.0 0.2 1180192 22836 ? Ssl 04:27 0:00 docker system dial-stdio
Does anyone have some insight on this? My system is just constantly creating these sessions for no reason, when I'm not even using the Docker context. There's also a fairly recent forum post about this: Docker Continuously Making Unnecessary SSH Connections to Remote Servers
EDIT: Exiting Docker Desktop closes all of the ssh sessions and exits all the dial-stdio processes on the remote server. However, if you leave Docker running it just keeps creating those sessions, eventually leading to a situation where it will use all of the server's memory.
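To confirm that the sessions keep accumulating rather than being recycled, the number of dial-stdio processes on the server can be sampled over time. This is only a sketch: `count_dial_stdio` is a hypothetical helper name and the one-minute interval is arbitrary. The `[d]` in the pattern is the usual trick to stop grep from matching its own command line in the ps listing.

```shell
# Hypothetical helper: count `docker system dial-stdio` processes in a
# process listing read from stdin (one command line per row).
count_dial_stdio() {
  grep -c '[d]ocker system dial-stdio' || true
}

# On the server, sample once a minute to see whether the count keeps growing:
#   while true; do
#     printf '%s %s\n' "$(date +%T)" "$(ps -e -o args= | count_dial_stdio)"
#     sleep 60
#   done
```

A count that rises steadily while the client is idle, and drops to zero when Docker Desktop exits, would match the behavior described above.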