After a short time, Docker starts returning internal server errors and becomes unusable

Open jrnorth opened this issue 1 year ago • 10 comments

Description

Around ten minutes after starting Docker Desktop and several containers, docker commands start returning messages like the following:

request returned Internal Server Error for API route and version http://%2FUsers%2Fjoe%2F.docker%2Frun%2Fdocker.sock/v1.45/containers/json, check if the server supports the requested API version
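The odd-looking host in that URL is the percent-encoded path of the Unix socket the CLI talks to, followed by the API version and endpoint. A minimal sketch that decodes it with Python's standard library (the variable names are only illustrative):

```python
from urllib.parse import unquote, urlsplit

# Error URL copied from the message in the description.
error_url = "http://%2FUsers%2Fjoe%2F.docker%2Frun%2Fdocker.sock/v1.45/containers/json"

parts = urlsplit(error_url)
# The "host" part is the percent-encoded Unix socket path.
socket_path = unquote(parts.netloc)
# The URL path carries the API version and the endpoint being called.
api_route = parts.path

print(socket_path)
print(api_route)
```

Read this way, the message says that a request to list containers over the local socket came back with a server-side error, which suggests the failure is on the daemon side rather than in the CLI arguments.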

Likewise, the Docker Desktop application is unable to load any data on any of the tabs.

Either restarting Docker Desktop or quitting and launching it again will resolve the issue, but only for a short time before it happens again.

Reproduce

  1. Start several containers
  2. Wait an unspecified amount of time (at least ten minutes or so), then try to run a docker command
  3. It should hang for a while then fail with the error in the description
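Because the wait in step 2 is unspecified, it may help to measure it. The following is a rough sketch, not an official diagnostic: it polls `docker ps` and reports how long the daemon kept answering. The function names, the 30-second timeout, and the 60-second interval are all assumptions chosen for illustration.

```python
import subprocess
import time
from typing import Callable, Optional


def docker_responds(timeout_s: float = 30.0) -> bool:
    """Return True if `docker ps` exits cleanly within the timeout."""
    try:
        result = subprocess.run(
            ["docker", "ps", "--quiet"],
            capture_output=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hang (timeout) or a missing CLI both count as "not responding".
        return False
    return result.returncode == 0


def seconds_until_failure(
    probe: Callable[[], bool] = docker_responds,
    interval_s: float = 60.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll `probe` until it fails; return elapsed seconds, or None if it never failed."""
    start = time.monotonic()
    checks = 0
    while max_checks is None or checks < max_checks:
        if not probe():
            return time.monotonic() - start
        checks += 1
        sleep(interval_s)
    return None
```

Calling `seconds_until_failure()` with the defaults right after starting the containers blocks until the first failed check and returns the elapsed time, which gives a number to attach to a report instead of "ten minutes or so".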

Expected behavior

Docker Desktop and the docker commands should continue to work as expected.

docker version

Client:
 Cloud integration: v1.0.35+desktop.13
 Version:           26.1.1
 API version:       1.45
 Go version:        go1.21.9
 Git commit:        4cf5afa
 Built:             Tue Apr 30 11:44:56 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.30.0 (149282)
 Engine:
  Version:          26.1.1
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.9
  Git commit:       ac2de55
  Built:            Tue Apr 30 11:48:04 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.31
  GitCommit:        e377cd56a71523140ca6ae87e30244719194a521
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    26.1.1
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0-desktop.1
    Path:     /Users/joe/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0-desktop.2
    Path:     /Users/joe/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.29
    Path:     /Users/joe/.docker/cli-plugins/docker-debug
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.2
    Path:     /Users/joe/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.23
    Path:     /Users/joe/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.4
    Path:     /Users/joe/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.1.0
    Path:     /Users/joe/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/joe/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.8.0
    Path:     /Users/joe/.docker/cli-plugins/docker-scout

Server:
 Containers: 17
  Running: 17
  Paused: 0
  Stopped: 0
 Images: 108
 Server Version: 26.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e377cd56a71523140ca6ae87e30244719194a521
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.6.26-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 10
 Total Memory: 31.3GiB
 Name: docker-desktop
 ID: e4963b97-992b-43d4-a832-7ce9c03d69f7
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Labels:
  com.docker.desktop.address=unix:///Users/joe/Library/Containers/com.docker.docker/Data/docker-cli.sock
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: daemon is not using the default seccomp profile

Diagnostics ID

4BA00B44-3ABD-4F0A-B1AD-081C38AB5075/20240521011143

Additional Info

I'm on Ventura 13.6.7. I was on 13.6.6 last week with the same Docker Desktop version and did not have this issue.

jrnorth avatar May 21 '24 04:05 jrnorth

Experiencing the same issue. I am downgrading to 4.29.0 for the time being and will monitor this issue for patches.

docker ps -a

request returned Internal Server Error for API route and version http://%2FUsers%2Fjeremylondon%2F.docker%2Frun%2Fdocker.sock/v1.45/containers/json?all=1, check if the server supports the requested API version

Running v4.30.0 on Sonoma 14.5 with an Apple M3 Pro. I have noticed this happen twice now, after around 1-3 hours of a container service running: it freezes and Docker Desktop becomes unresponsive. No logs, no exec.

docker version
Client:
 Cloud integration: v1.0.35+desktop.13
 Version:           26.1.1
 API version:       1.45
 Go version:        go1.21.9
 Git commit:        4cf5afa
 Built:             Tue Apr 30 11:44:56 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.30.0 (149282)
 Engine:
  Version:          26.1.1
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.9
  Git commit:       ac2de55
  Built:            Tue Apr 30 11:48:04 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.31
  GitCommit:        e377cd56a71523140ca6ae87e30244719194a521
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
docker info
Client:
 Version:    26.1.1
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0-desktop.1
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0-desktop.2
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.29
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-debug
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.2
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.23
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.4
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.1.0
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.8.0
    Path:     /Users/jeremylondon/.docker/cli-plugins/docker-scout

Server:
 Containers: 2
  Running: 1
  Paused: 0
  Stopped: 1
 Images: 26
 Server Version: 26.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e377cd56a71523140ca6ae87e30244719194a521
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.6.26-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 11
 Total Memory: 7.754GiB
 Name: docker-desktop
 ID: 82224e3c-a63c-4296-a131-9d9f9dc914db
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Labels:
  com.docker.desktop.address=unix:///Users/jeremylondon/Library/Containers/com.docker.docker/Data/docker-cli.sock
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: daemon is not using the default seccomp profile

Diagnostic ID 5B33B463-4C86-4594-87CA-F627D88F8EE5/20240529135858

jeremy-london avatar May 29 '24 14:05 jeremy-london

Rats, I downgraded to v4.29.0 and got the same error. Docker ran fine for just under 4 hours, then crashed again and the process hung.

New Diagnostic ID: 5B33B463-4C86-4594-87CA-F627D88F8EE5/20240529235916

jeremy-london avatar May 30 '24 00:05 jeremy-london

@jeremy-london do you have a docker-compose (which you can share) which could help us reproduce the issue?

jpbriend avatar May 30 '24 15:05 jpbriend

@jpbriend Sure! In an effort to try something new, I reduced the resource limits to use only 6 of 11 CPU cores and tweaked my Python code to use a maximum of 5 workers. Previously I was using all 11 CPU cores in the resource limit (the maximum on my machine), and my Python script was running up to 15 worker threads.

This worked for around 15 hours before crashing like the previous examples.

New Diagnostic ID: 5B33B463-4C86-4594-87CA-F627D88F8EE5/20240530155458

Here is a sample project that simulates what I've got running (Selenium Grid with node-docker):

README for Selenium Grid

The goal was to use node-docker to dynamically create a chromedriver image, then have the Python script process multiple URLs at a time. The core Selenium drivers are not thread safe, so this moves the problem to the Docker runtime and allows one driver per process. That works great, but it sometimes crashes Docker Desktop.

NOTE: if you are on x86 (linux/amd64), you can replace seleniarm/ with selenium/ in the compose and config files. I am on an Apple Silicon M3 Pro Mac, so I am using ARM-based containers (linux/arm64).

docker-compose.yml

name: web-scraper-grid
services:
  selenium-hub:
    image: seleniarm/hub:4.20
    container_name: selenium-hub
    ports:
      - "4442:4442"
      - "4443:4443"
      - "4444:4444"

  node-docker:
    image: seleniarm/node-docker:4.20
    container_name: node-docker
    shm_size: '2gb'
    volumes:
      # - ./assets:/opt/selenium/assets # Uncomment if you want to use assets to track sessionCapabilities.json
      - ./config.toml:/opt/bin/config.toml
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_START_XVFB=false
      - SE_START_VNC=false

config.toml

[docker]
configs = [
  "seleniarm/standalone-chromium:124.0", '{"browserName": "chrome", "browserVersion": "124.0"}'
]

# URL for connecting to the docker daemon
url = "http://127.0.0.1:2375"

# Assets path (optional mount)
assets-path = "/opt/selenium/assets"

example test.py

# pip install selenium

import logging
import time
import gc
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List
from selenium import webdriver
from selenium.common.exceptions import TimeoutException


def setup_selenium_driver(command_executor: str = "http://localhost:4444"):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Remote(command_executor=command_executor, options=options)


def analyze_single_url(driver: webdriver.Remote, url: str, index: int, start_times: List[float], end_times: List[float]):
    start_times[index] = time.time()
    try:
        driver.get(url)
        
        # set a delay to simulate processing time
        time.sleep(4)
        
        end_times[index] = time.time()
        logging.info(f"Item {index+1}: {driver.title}")
    except Exception as e:
        end_times[index] = time.time()
        logging.error(f"Error processing URL {index + 1}: {url} - {e}")
    finally:
        driver.quit()
        gc.collect()


def analyze_urls(urls: List[str], max_workers: int):
    logging.info(f"Analyzing {len(urls)} URLs...")
    start_times, end_times = [0] * len(urls), [0] * len(urls)
    start_time = time.time()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(analyze_single_url, setup_selenium_driver(), url, idx, start_times, end_times) for idx, url in enumerate(urls)]
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as e:
                logging.error(f"Error occurred: {e}", exc_info=True)
            finally:
                gc.collect()
    total_time = time.time() - start_time
    logging.info(f"Total Time: {total_time:.2f} seconds for {len(urls)} URLs.")
    logging.info(f"Processing rate: {len(urls) / total_time:.2f} URLs per second.")


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
    
    max_workers = 5
    test_urls = ["https://dbad-license.org/"] * 100000
    
    analyze_urls(test_urls, max_workers)
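One detail that may matter when reproducing: in test.py, `setup_selenium_driver()` is called while the list of futures is being built, on the submitting thread, so Grid sessions are opened independently of `max_workers`. Whether that plays any part in the hang is not established in this thread. A hedged variant, assuming the intent is to have at most `max_workers` drivers open at once, creates the driver inside the worker. The names `process_url`, `process_urls`, and `make_driver` are hypothetical and not part of Selenium.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List


def process_url(make_driver: Callable[[], object], url: str) -> str:
    """Open a driver inside the worker thread, load the URL, and always quit."""
    driver = make_driver()  # the session is created only once a worker is free
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()


def process_urls(urls: List[str], max_workers: int, make_driver: Callable[[], object]) -> List[str]:
    """Return page titles in completion order, creating each driver in its worker."""
    titles = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_url, make_driver, url) for url in urls]
        for future in as_completed(futures):
            titles.append(future.result())
    return titles
```

With the script above, `make_driver` would be `setup_selenium_driver`, for example `process_urls(test_urls, max_workers, setup_selenium_driver)`.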

jeremy-london avatar May 30 '24 16:05 jeremy-london

I have the same problem on an Intel Mac Mini 2018 with Sonoma 14.5 with Docker Desktop 4.30.0.

I don't have the problem on an M1 Mac Mini with Sonoma 14.5 with Docker Desktop 4.30.0. In fact it runs pretty flawlessly.

nothing2obvi avatar May 30 '24 19:05 nothing2obvi

I commented this earlier and deleted it, but can now confirm that downgrading to 4.24.0 makes Docker Desktop last a few hours longer before it all becomes unresponsive. Obviously not a fix.

nothing2obvi avatar Jun 01 '24 01:06 nothing2obvi

This problem seems to still exist for 4.31.0.

nothing2obvi avatar Jun 07 '24 20:06 nothing2obvi

@jeremy-london @jrnorth

Have either of you found a workaround for this? I've tried using Rancher Desktop, Colima, and OrbStack, but they all introduce their own set of problems.

nothing2obvi avatar Jun 16 '24 23:06 nothing2obvi

I also have this exact issue. I tried Podman as a substitute, and it has the same issue. The Podman UI keeps working, but the containers break down.

It looks like changing from VirtioFS to gRPC FUSE in the Docker Desktop settings made it somewhat more stable, but after a day Docker Desktop becomes unresponsive and I’m unable to reach my containers. It seems to pop up when more CPU-intensive tasks run in the container and the system clogs. I’m currently auto-restarting Docker Desktop early in the morning to see whether that works.
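For anyone wanting to try the same scheduled-restart workaround, here is a minimal sketch, assuming macOS and that the application is installed under the name "Docker". The `osascript` and `open` commands are standard macOS tools; the function name and the 20-second pause are assumptions, and a fully hung app may not respond to a polite quit request.

```python
import subprocess
import time
from typing import Callable, Sequence

# Assumed macOS commands; the app name "Docker" is an assumption about the install.
QUIT_CMD = ["osascript", "-e", 'quit app "Docker"']
START_CMD = ["open", "-a", "Docker"]


def restart_docker_desktop(
    run: Callable[[Sequence[str]], object] = subprocess.run,
    wait_s: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Ask Docker Desktop to quit, pause briefly, then launch it again."""
    run(QUIT_CMD)
    sleep(wait_s)  # give the app time to shut down before relaunching
    run(START_CMD)
```

Scheduling is left out on purpose; the function could be called from whatever scheduler is already in use, such as cron or launchd. This only masks the problem and is not a fix.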

Sdedeugd avatar Jul 08 '24 19:07 Sdedeugd

Also seeing this in 4.29.0 and 4.31.0

dionjwa avatar Jul 10 '24 04:07 dionjwa

My team and I are having this problem nearly daily on Apple Silicon Macs, ranging from a 2021 M1 Pro MacBook Pro to a 2024 M3 Max. This seems like the same error as https://github.com/docker/for-mac/issues/6956, https://github.com/docker/for-mac/issues/7240, and https://github.com/docker/for-mac/issues/6933

We've tried disabling resource saver as described in other issues, but I think this is getting worse over time.

bdeo avatar Oct 14 '24 15:10 bdeo

Still seeing this with Docker build 4.48.0 (207573) on a Mac with an M1 Pro chip (macOS Sequoia 15.6.1).

vijaya314 avatar Oct 28 '25 20:10 vijaya314