
Nginx container consumes much more memory than expected

Open artem-zinnatullin opened this issue 5 years ago • 12 comments

Setup

We run docker.io/nginx:1.15.8 containers with simple proxy + disk cache configuration:

events {
  worker_connections 2048;
}

http {
  # Similar to default "combined" log format, appends additional info at the end.
  # - request_time: time it took for Nginx to handle request.
  # - upstream_response_time: time it took for upstream (MyService) to handle request.
  # - upstream_cache_status: status of accessing a response cache.
  #   The status can be either `MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED` or `HIT`.
  log_format custom '$remote_addr - $remote_user [$time_local] '
                      '"$request" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent" '
                      '$request_time $upstream_response_time $upstream_cache_status';

  # - levels: nesting depth of the cache directories, e.g. /cache/c/29/b7f54b2df7773722d382f4809d65029c
  # - keys_zone: named shared memory zone that holds the keys; 1 MB can store around 8 thousand keys.
  # - max_size: max size of cache entries on disk.
  # - inactive: entries not accessed within this time are removed from the cache, even if they are still fresh.
  # - use_temp_path: on/off, whether to write responses to a temporary directory first before moving them into the cache.
  proxy_cache_path /opt/nginx-cache/ levels=1:2 keys_zone=myservice_cache:10m max_size=10g inactive=24h use_temp_path=off;

  server {
    listen 80;
    server_name "";

    access_log /var/log/nginx/access.log custom;

    # Caching setup is quite aggressive, we assume MyService to be append-only.
    proxy_cache myservice_cache;

    # Cache authenticated requests by making basic auth `Authorization` header part of a cache key.
    proxy_cache_key "$http_authorization$request_uri";

    proxy_cache_methods GET HEAD;

    # Cache `200` responses for 24 hours.
    proxy_cache_valid 200 24h;

    # Cache `404` responses for 10 minutes, 404 might be resolved to 200 later.
    proxy_cache_valid 404 10m;

    # Allow parallel cache update while we serve stale cache entry.
    proxy_cache_background_update on;

    # Serve a stale cached response if the upstream is down or Nginx is currently updating the cache.
    proxy_cache_use_stale timeout updating http_500 http_502 http_503 http_504;

    proxy_read_timeout 10s;
    proxy_connect_timeout 1s;

    location / {
      proxy_pass http://myservice;
    }

    location /nginx-health {
      access_log off;
      return 200;
    }
  }
}

It runs as Kubernetes pods:

containers:
- name: nginx
  image: docker.io/nginx:1.15.8
  volumeMounts:
  - name: nginx-conf
    mountPath: /etc/nginx/
  - name: nginx-cache
    mountPath: /opt/nginx-cache
  ports:
  - containerPort: 80
    name: http
  resources:
    requests:
      memory: "1G"
      cpu: "1"
    limits:
      memory: "1G"
      cpu: "1"
  livenessProbe:
    httpGet:
      path: "/nginx-health"
      port: 80
    initialDelaySeconds: 1
    periodSeconds: 3
    failureThreshold: 1
    timeoutSeconds: 1
  readinessProbe:
    httpGet:
      path: "/nginx-health"
      port: 80
    initialDelaySeconds: 2
    periodSeconds: 1
    failureThreshold: 1
    timeoutSeconds: 1

Observations

We observe 3 problems with this setup.

Problem 1: Nginx container terminates itself sometimes

We see the Nginx container terminating itself with exit code 0 and no warnings or errors in the logs.

    State:          Running
      Started:      Thu, 24 Jan 2019 11:26:18 -0800
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jan 2019 11:00:25 -0800
      Finished:     Thu, 24 Jan 2019 11:26:18 -0800
    Ready:          True
    Restart Count:  1

The logs just contain the regular requests it serves all the time.

Problem 2: Nginx times out on a static return 200 health check sometimes

We see the Nginx container timing out (1 second) on a very, very simple health check:

location /nginx-health {
  access_log off;
  return 200;
}

We used to log it, but nothing stood out when the timeouts happened.
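
For what it's worth, a simple way to watch how this endpoint behaves over time; a sketch only, assuming curl is available and the container answers on localhost:80 (adjust the address for your setup):

while true; do
  # Print the HTTP code and total time for each probe; anything close to 1s would trip the k8s probe.
  curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' --max-time 2 http://localhost/nginx-health
  sleep 1
done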

Problem 3: Nginx containers consume ~96% of given memory all the time

We're giving the container a 1 GB memory limit. For some reason Nginx consumes pretty much all of it under production load, which is nothing special (usually around 60 RPS), and it keeps consuming it after hours when we don't get any requests.

Moreover, we can't figure out what is consuming the memory inside the container.

I blamed shared memory first, but it seems to be fine too:

root@caching-proxy-0:/# free -m
              total        used        free      shared  buff/cache   available
Mem:          70339       10145       21278          10       38915       59570
Swap:             0           0           0

That's exactly the 10 MB I set for the cache keys_zone…
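
For reference, a rough way to compare what the processes themselves use against what the cgroup (and therefore the runtime) accounts for; a sketch only, assuming cgroup v1 and the usual mount path inside the container:

# Sum the resident set size of every process inside the container, in MB.
awk '/^VmRSS/ {sum += $2} END {printf "%.1f MB total RSS\n", sum/1024}' /proc/[0-9]*/status

# What the memory cgroup (and therefore the runtime / Prometheus) reports for the container.
cat /sys/fs/cgroup/memory/memory.usage_in_bytes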

Expectations

  1. Nginx container shouldn't terminate itself with exit code 0
  2. Nginx container shouldn't terminate itself without any useful logs
  3. Nginx container shouldn't consume 1GB of memory in this setup (or I guess I just expect it to be within 10-200MB usually)

We're happy to provide additional information.

And we're really curious to find out where this memory consumption is coming from; the only thing left on my mind is the Linux page cache…

I've spent hours reading everything I could find; these links seem to be the closest to what we observe:

  • https://github.com/moby/moby/issues/21759
  • https://stackoverflow.com/questions/41048742/nginx-content-caching-causing-docker-memory-spike

But I still can't figure out how to fix it.

All problems seem to be caused by Problem 3: the container's memory consumption.

artem-zinnatullin avatar Jan 24 '19 23:01 artem-zinnatullin

nginx will log something to the error log if it terminates; raising http://nginx.org/r/error_log severity might be needed.

I'm not sure how the "memory consumption" is calculated on your graphs - and it looks very wrong, since htop output shows nginx uses substantially less. free(1) is useless there since the kernel does not virtualize this.

When the health check fails, what's with the network connectivity for the container? anything in tcpdump?

anything in dmesg output on the nodes where this happens?
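
Something along these lines on the node (not inside the container) should show whether the kernel killed a process; just a sketch, the exact message wording varies by kernel version:

# Run on the node hosting the pod.
dmesg -T | grep -iE 'killed process|oom|out of memory'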

thresheek avatar Jan 25 '19 10:01 thresheek

nginx will log something to the error log if it terminates; raising http://nginx.org/r/error_log severity might be needed.

Got it, increased to warn level, rolling out now, will update once we get container restarted

The weird thing is that it exits with code 0; I've never seen Nginx do that on its own.

I'm not sure how the "memory consumption" is calculated on your graphs - and it looks very wrong, since htop output shows nginx uses substantially less.

We collect it with Prometheus, which should be getting it from the container runtime (we use cri-o, not Docker).

Yeah I agree that it's weird that htop output doesn't add up.

free(1) is useless there since the kernel does not virtualize this.

Yep, just wanted to see its output; at least it showed a shared memory region that matched the 10m we set for keys_zone in Nginx.

When the health check fails, what's with the network connectivity for the container? anything in tcpdump?

That I'll have to collect more info on, prob not tcpdump.

anything in dmesg output on the nodes where this happens?

Will comment once we get new restarts.


Btw, I forgot to mention that our k8s Ingress is also Nginx, but without caching, and it's able to handle the same traffic with no problems. I'm wondering if our issues are due to the caching setup.

It also doesn't happen in our staging environment, where we run containers with the exact same config; they just don't get prod traffic.


Thanks for jumping on this, @thresheek, really appreciate it 👍 If you have other suggestions we should try, please share, I'll get back with requested logs and data once we collect it.

artem-zinnatullin avatar Jan 28 '19 21:01 artem-zinnatullin

Reading https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt it seems the kernel counts page cache as "used" for a cgroup, which might explain the "used" size on your graphs. I would also think some k8s watchdog daemon (if not the OOM killer?) sends the kill signal if it reaches a bad enough state.
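
One way to check whether it really is page cache being counted against the cgroup; a rough sketch assuming cgroup v1 is mounted at the usual path inside the container (total_cache is reclaimable page cache, total_rss is what the processes actually hold):

# Inside the container: break down the cgroup's memory accounting.
grep -E '^total_(cache|rss|mapped_file)' /sys/fs/cgroup/memory/memory.stat

# The limit the cgroup is enforcing (should match the 1G k8s limit).
cat /sys/fs/cgroup/memory/memory.limit_in_bytes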

thresheek avatar Jan 29 '19 09:01 thresheek

Update on logs:

We got 3 + 4 + 0 restarts (across 3 pods) so far; one of them has this in the log right before termination:

2019/01/31 19:08:57 [alert] 1#1: worker process 8 exited on signal 9

No other warnings, alerts or errors. Nothing interesting in dmesg log.

Also, kubectl describe doesn't indicate OOM; k8s is pretty informative about that when it terminates containers. We just see Terminated (Completed) and exit code 0, so probably not OOM.

Reading kernel.org/doc/Documentation/cgroup-v1/memory.txt it seems the kernel counts page cache as "used" for a cgroup, which might explain the "used" size on your graphs. I would also think some k8s watchdog daemon (if not the OOM killer?) sends the kill signal if it reaches a bad enough state.

Interesting. Page caching is a kernel feature, right? It's probably aware of the memory limit we set with the cgroup and just floats around it. I guess we just don't see this in other deployments because most of them don't work with the disk.

I'm not sure Nginx can do much about it if it just reads from disk.


Do you think it could be this: https://trac.nginx.org/nginx/ticket/1163 (when Nginx exceeds the max_size of the disk cache)? In our case we set max_size to 10g and mount a volume with 11g. I wonder if the cache gets close to 11g and a worker process hangs; that could explain the timeouts we observe on health checks right before termination.
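
In case it helps correlate with that ticket, a quick check to run when a health check starts timing out (just a sketch; the paths match our config above):

# Actual on-disk size of the cache vs. the 10g max_size and the 11g volume.
du -sh /opt/nginx-cache/
df -h /opt/nginx-cache/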

When the health check fails, what's with the network connectivity for the container? anything in tcpdump?

Network activity is low, just a couple of requests within the 60 seconds before termination. Unfortunately I can't easily get tcpdump on the prod machines at the moment.

artem-zinnatullin avatar Jan 31 '19 20:01 artem-zinnatullin

@artem-zinnatullin It seems that we're facing a similar issue. Did you manage to find a solution?

Smana avatar Dec 10 '20 07:12 Smana

Not really, it still restarts once in a while with exit code 0 and no useful log messages unfortunately, but I think increasing the memory requirements for the container helped reduce the frequency of these restarts.

artem-zinnatullin avatar Dec 11 '20 21:12 artem-zinnatullin

it still restarts once in a while with exit code 0

I feel like the answer to why the nginx container is restarting is in the initial description and Kubernetes config.

Nginx times out on a static return 200 health check sometimes

 livenessProbe:
   httpGet:
     path: "/nginx-health"
     port: 80
   initialDelaySeconds: 1
   periodSeconds: 3
   failureThreshold: 1
   timeoutSeconds: 1

When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in the case of a liveness probe means restarting the container.

- https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes

worker process 8 exited on signal 9

nginx is not terminating itself; it looks like one of the worker processes was sent SIGKILL (finding the source of that might tell you why). With failureThreshold: 1, the config tells Kubernetes to stop and start the container as soon as a single liveness probe fails. This could easily happen if the nginx worker process that was handling that liveness request was somehow killed (and perhaps the other workers were still recovering seconds later). Is the container running out of open fds, or hitting something else that would cause a process to be killed by the kernel?
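
If the restarts really are probe-driven, a less aggressive liveness configuration might at least confirm it. A minimal sketch only; the numbers are illustrative, not an official recommendation:

 livenessProbe:
   httpGet:
     path: "/nginx-health"
     port: 80
   initialDelaySeconds: 5
   periodSeconds: 10
   timeoutSeconds: 3
   # Allow a few slow or failed probes before Kubernetes restarts the container.
   failureThreshold: 3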

Possibly similar to https://github.com/docker-library/php/issues/1048

yosifkit avatar Dec 11 '20 23:12 yosifkit

FWIW, we are observing inexplicable restarts with exit code 0 without k8s (we are running under docker-compose). In our deployment nginx is used as a very lightweight API proxy and its memory requirements are minimal. We have absolutely no clue as to what could be causing this, as it happens randomly under otherwise completely normal conditions.

betaprior avatar Nov 27 '21 03:11 betaprior

The same problem

qixiaobo avatar Nov 22 '22 08:11 qixiaobo

The same problem

dimboknv avatar Mar 06 '23 13:03 dimboknv

FWIW, I've recently seen reports of some containerized software, like mysql, using excessive amounts of memory, where reproducing the problem across systems was inconsistent because it depended on what was configured for LimitNOFILE in docker.service (the systemd unit file; see systemctl cat docker.service). That is sometimes assigned the value infinity, which in turn depends on how the OS has been configured to resolve that value. On some systems this is over 1 billion, while the host is typically half a million or lower.

If you are experiencing this memory problem, you can try to see if it persists when you run the container with the ulimit settings of your host: --ulimit "nofile=$(ulimit -Sn):$(ulimit -Hn)" (if you use docker-compose, there is a similar config option in the docs; see the sketch below). The soft limit (-Sn) is usually 1024, and the hard limit (-Hn) sets the maximum number of open files permitted AFAIK, which software can opt in to raise up to from the soft limit default.
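
For docker-compose that would look roughly like this; a sketch only, the values mirror common host defaults and should be adjusted to your host's actual ulimit -Sn / ulimit -Hn output:

services:
  nginx:
    image: nginx:1.15.8
    ulimits:
      nofile:
        soft: 1024     # typical host soft limit (ulimit -Sn)
        hard: 524288   # typical host hard limit (ulimit -Hn); adjust to your host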

Other software can stall and appear to hang while actually under heavy CPU activity, because it iterates through that entire range to close any open file descriptors, a common practice for daemon services. I've seen this delay start-up from software that would usually take less than a second to needing 8-70 minutes. I've read reports of a build pipeline taking 10 hours instead of 15 minutes, and of situations where memory usage skyrockets. So it could potentially be the same problem you're experiencing?

polarathene avatar Mar 06 '23 22:03 polarathene

I am seeing this also (see attached screenshots of the memory usage and the configured memory limit). I actually have the memory limit set but it doesn't seem to work (Unraid): --memory=1G --no-healthcheck --log-opt max-size=50m

casperse avatar Feb 25 '24 12:02 casperse