
unit:1.34.1-php8.4 consumes more and more memory over time (possibly due to one unitd process)

Open kapyaar opened this issue 9 months ago • 20 comments

Bug Overview

Hi,

I have a project using unit:1.34.1-php8.4. I am testing it in Docker, along with Redis and MySQL. Everything works fine, except the PHP container's memory usage keeps increasing.

Docker-compose.yml

services:
  php:
    build:
      context: .
      dockerfile: Dockerfile
    working_dir: /var/www/html/
    container_name: phpApp
    ports:
      - '80:80'
      - '443:443'
    volumes:
      - '.:/var/www/html/'
    networks:
      - default

Dockerfile

FROM unit:1.34.1-php8.4

RUN apt-get update \
    && curl -s https://getcomposer.org/installer | php \
    && mv composer.phar /usr/local/bin/composer
	
RUN apt-get update && apt-get install -y \
    libonig-dev \
    libxml2-dev \
    && docker-php-ext-install pcntl

RUN pecl install redis \
    && docker-php-ext-enable redis
	
RUN apt-get update && apt-get install -y supervisor 

COPY config/opcache.ini /usr/local/etc/php/conf.d/opcache.ini


WORKDIR /var/www/html/
COPY config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf

COPY config/config.json init.json
COPY config/cert-bundle.pem init.pem

	
RUN nohup /bin/sh -c "unitd --no-daemon --pid init.pid --log /dev/stdout --control unix:init.unit.sock &" && \
    # Wait for Unit to start (a few seconds to be sure)
    sleep 5 && \
    # Check that the control socket is available
    curl --unix-socket init.unit.sock -fsX GET _/config && \
    curl -fsX PUT --data-binary @init.pem --unix-socket init.unit.sock _/certificates/cert-bundle && \
    curl -fsX PUT --data-binary @init.json --unix-socket init.unit.sock _/config && \
    rm init.*

EXPOSE 80 443

CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]

docker stats starts like below.

Image

When it is exposed to traffic, the memory for phpApp slowly starts to rise; say, from 120MB at start to ~400MB in 12 hours or so. And it keeps going, to the point that the container gets restarted.

This is with traffic, after some time

Image

Trying to narrow down what might be causing this, I have identified a process in the process list that shows notable CPU usage (3-4%, while the others are in the 1.x% range). In the screenshot below, PID 15 is the one I am referring to.

Image

Looking into this process, I could see a lot of entries trying to access the host machine, failing with Permission denied. Adding privileged: true to the docker-compose file for phpApp got rid of this error, and I could then see a bunch of entries like below.

Image Image

Note that even when I was seeing all those Permission denied entries under PID 15, the app was running fine, so it was not affecting the app. If I kill this process, the memory usage comes down instantly.

Image

But another process spawns and starts doing the same thing.
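A quick way to confirm which specific process is growing, rather than watching the container total, is to sample its resident set size from /proc. A minimal sketch, assuming a Linux container; the PID to pass in is whatever `ps aux` shows for the suspect process (e.g. 15 above):

```shell
# Print the resident set size (VmRSS) of a process, given its PID.
rss() {
    awk '/^VmRSS:/ { print $2, $3 }' "/proc/$1/status"
}

# Demonstrated here on the current shell's own PID; inside the
# container you would pass the suspect unit process's PID instead.
rss $$
```

Run in a loop (`while sleep 60; do rss 15; done`) this shows whether that one process is what grows, or whether the growth is spread across the container.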

Expected Behavior

Memory usage should remain more or less the same under a constant load, but it keeps increasing.

Steps to Reproduce the Bug

I use the docker-compose up command to start the containers on AWS EC2.

Environment Details

  • Target deployment platform: [AWS]
  • Target OS: [Amazon Linux 2023]
  • Version of any relevant project languages: [php8.4]

Additional Context

Any help is much appreciated. I can provide any additional info (config.json, supervisord.conf, etc.) if needed.

kapyaar avatar Mar 17 '25 19:03 kapyaar

I think the process in question is unit: router

kapyaar avatar Mar 22 '25 00:03 kapyaar

Hello Team,

I'd sincerely appreciate some guidance if you can. I have made progress: I went through all the PHP scripts to make sure they are not leaking. Finally, adding

"limits": {
    "timeout": 5,
    "requests": 10000
},
"processes": {
    "max": 25,
    "spare": 5,
    "idle_timeout": 5
},

in the application section of config.json seems to make a difference, as it keeps restarting PHP processes and brings memory usage down. When running overnight, though, it still keeps climbing, at a much smaller rate.
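For context, a sketch of the full application object with those settings in place (values as above; assembled from the snippets in this thread, adjust to taste):

```json
{
    "applications": {
        "php": {
            "type": "php",
            "root": "/var/www/html/",
            "index": "index.php",
            "limits": {
                "timeout": 5,
                "requests": 10000
            },
            "processes": {
                "max": 25,
                "spare": 5,
                "idle_timeout": 5
            }
        }
    }
}
```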

Anything else I can try in this regard?

Additionally, I am getting many of the following events in the logs:

phpApp | 2025/03/27 18:28:46 [info] 14#22 *2596 writev(43, 2) failed (32: Broken pipe)
phpApp | 2025/03/27 18:28:46 [info] 14#23 *2617 recv(56, 7F5C40027038, 2048, 0) failed (104: Connection reset by peer)
phpApp | 2025/03/27 18:28:55 [info] 14#22 *2619 recv(61, 7F5C48017158, 2048, 0) failed (104: Connection reset by peer)
phpApp | 2025/03/27 18:29:02 [info] 14#22 *2616 writev(32, 8) failed (32: Broken pipe)

The interesting part is, I do not see them when running in local Docker; this only happens when testing on AWS EC2. Looking into the EB logs, I can see the following.


/var/log/docker

Mar 27 17:55:13 ip-172-31-68-85 docker[1965]: 2025/03/27 17:55:13 http: superfluous response.WriteHeader call from go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*respWriterWrapper).WriteHeader (wrap.go:98)
Mar 27 17:55:13 ip-172-31-68-85 docker[1965]: 2025/03/27 17:55:13 http: superfluous response.WriteHeader call from go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*respWriterWrapper).WriteHeader (wrap.go:98)
Mar 27 17:55:13 ip-172-31-68-85 docker[1965]: 2025/03/27 17:55:13 http: superfluous response.WriteHeader call from go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*respWriterWrapper).WriteHeader (wrap.go:98)
Mar 27 17:55:13 ip-172-31-68-85 docker[1965]: 2025/03/27 17:55:13 http: superfluous response.WriteHeader call from go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*respWriterWrapper).WriteHeader (wrap.go:98)


/var/log/eb-docker/containers/eb-current-app/eb-stdouterr.log

phpApp | 2025/03/26 23:48:18 [notice] 7#7 process 15 exited with code 0
phpApp | 2025/03/26 23:48:18 [warn] 33#33 [unit] sendmsg(11, 133) failed: Broken pipe (32)
phpApp | 2025/03/26 23:48:18 [warn] 33#33 [unit] sendmsg(11, 133) failed: Broken pipe (32)
phpApp | 2025/03/26 23:48:18 [warn] 33#33 [unit] sendmsg(11, 133) failed: Broken pipe (32)
phpApp | 2025/03/26 23:48:18 [notice] 16#16 app process 33 exited with code 0
phpApp | 2025/03/26 23:48:18 [alert] 16#16 sendmsg(13, -1, -1, 2) failed (32: Broken pipe)
phpApp | 2025/03/26 23:48:18 [warn] 35#35 [unit] sendmsg(11, 133) failed: Broken pipe (32)
phpApp | 2025/03/26 23:48:18 [warn] 35#35 [unit] sendmsg(11, 133) failed: Broken pipe (32)
phpApp | 2025/03/26 23:48:18 [warn] 35#35 [unit] sendmsg(11, 133) failed: Broken pipe (32)

Thanks again!

kapyaar avatar Mar 27 '25 18:03 kapyaar

OK, so it's the PHP application processes that are seemingly leaking memory...

Probably not much we can do unless you are able to provide a reproducer.

You could try isolating certain parts of your scripts, e.g. don't make database connections, don't parse file data etc, see if you can isolate what is causing the issue...

ac000 avatar Mar 27 '25 20:03 ac000

@ac000 Thank you for the response. I have been working on this for the past week, and it appears to me that the issue is possibly caused by PHP sessions. Maybe I am wrong, but let me share my findings.

config.json

{
    "listeners": {
        "*:80": {
            "pass": "routes"
        }
    },
    "routes": [
        {
            "match": {
                "uri": [
                    "*.php",
                    "*.php?*"
                ]
            },
            "action": {
                "pass": "applications/php"
            }
        }
    ],
    "applications": {
        "php": {
            "type": "php",
            "root": "/var/www/html/",
            "processes": 20,
            "index": "index.php"
        }
    }
}

Dockerfile

FROM unit:1.34.2-php8.4
WORKDIR /var/www/html/
RUN apt-get update && apt-get install -y \
    supervisor \
    && rm -rf /var/lib/apt/lists/*
COPY config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY config/config.json init.json
RUN nohup /bin/sh -c "unitd --no-daemon --pid init.pid --log /dev/stdout --control unix:init.unit.sock &" && \
    # Wait for Unit to start (a few seconds to be sure)
    sleep 5 && \
    # Check that the control socket is available
    curl --unix-socket init.unit.sock -fsX GET _/config && \
    curl -fsX PUT --data-binary @init.json --unix-socket init.unit.sock _/config && \
    rm init.*
EXPOSE 80 
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]

docker-compose.yml

services:
  php:
    build:
      context: .
      dockerfile: Dockerfile
    working_dir: /var/www/html/
    container_name: phpApp
    ports:
      - '80:80'
      - '443:443'
    volumes:
      - '.:/var/www/html/'    
    networks:
      - default

supervisord.conf

[supervisord]
nodaemon=true
user=root

[program:nginx-unit]
command=unitd --no-daemon --log /dev/stdout
user=root
stdout_logfile_maxbytes = 0
stderr_logfile_maxbytes = 0
stdout_logfile=/var/log/unit.log
stderr_logfile=/var/log/unit.log
autostart=true
autorestart=true
startretries=3
startsecs=10
priority=10

And finally, test.php

<?php
if (session_status() != PHP_SESSION_ACTIVE) {
    //session_start();
    //$_SESSION['user'] = "admin";
}
if (isset($_SESSION['user'])) {
    echo "User: " . $_SESSION['user'];
}
?>

With this setup, when you start the container, it looks like this:

Image

Tests

  1. Do a load test with session_start() commented out. Memory does increase a little, but not much, and stays there.
  2. Uncomment the two lines and repeat; the memory use goes up. I am using k6 to test.

kapyaar avatar Apr 03 '25 22:04 kapyaar

@kapyaar

Thanks for looking into this, good job on narrowing it down, I'll have a looksee...

ac000 avatar Apr 03 '25 22:04 ac000

@ac000 thanks!!

It didn't strike me until I sent my previous message to also test with Apache, just out of curiosity. It looks like the memory usage under Apache shows a similar climbing behaviour, though at a slower rate. This could be because Unit handles more requests than Apache; I'm not sure. Maybe this is normal behaviour? Just wanted to let you know of this finding. If you do come across anything that looks odd, it would be great to know. Otherwise, it is still good to know it is sessions causing the swell, so I can set memory limits accordingly.

kapyaar avatar Apr 03 '25 23:04 kapyaar

OK, so here's what I'm seeing...

Starting Unit, the PHP language module application starts out at around 5-6MB, looking at the process's Resident Set Size (RES) in top(1), e.g.

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19801 andrew    20   0   5.8m S   0.0   1 0   0.6   0:00.00 unit: "php" appl+ 

With your script as posted (the two lines commented out), hitting it with 1 request results in

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19816 andrew    20   0   9.6m S   0.0   1 1   1.0   0:00.00 unit: "php" appl+ 

Hitting it another 10 times results in

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19816 andrew    20   0   9.7m S   0.0   1 0   1.0   0:00.02 unit: "php" appl+ 

and another 100 times

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19816 andrew    20   0   9.7m S   0.0   1 1   1.0   0:00.04 unit: "php" appl+ 

Let's go for another 1000 times...

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19816 andrew    20   0   9.8m S   2.3   1 1   1.0   0:00.24 unit: "php" appl+ 

and another 100,000 times

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19816 andrew    20   0  14.4m S   0.0   1 0   1.5   0:25.94 unit: "php" appl+ 

With the session stuff enabled...

At start

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 123683 andrew    20   0   5.8m S   0.0   1 1   0.6   0:00.00 unit: "php" appl+ 

1 hit

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 123683 andrew    20   0   9.7m S   0.0   1 0   1.0   0:00.00 unit: "php" appl+ 

+10 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 123683 andrew    20   0   9.7m S   0.0   1 0   1.0   0:00.00 unit: "php" appl+ 

+100 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 123683 andrew    20   0   9.7m S   0.7   1 1   1.0   0:00.04 unit: "php" appl+ 

+1000 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 123683 andrew    20   0   9.9m S   0.0   1 0   1.0   0:00.61 unit: "php" appl+ 

+10,000 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 123683 andrew    20   0  10.8m S   0.0   1 1   1.1   0:04.54 unit: "php" appl+ 

100,000 took a long time, but even with just 10,000 hits you can see we get similar behaviour with and without the session stuff.

It does look like we may be leaking something...

Let's try an empty PHP script: <?php ?>

At start

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 134859 andrew    20   0   5.8m S   0.0   1 1   0.6   0:00.00 unit: "php" appl+ 

1 hit

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 134876 andrew    20   0   9.5m S   0.0   1 0   1.0   0:00.00 unit: "php" appl+ 

+10 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 134876 andrew    20   0   9.5m S   0.0   1 0   1.0   0:00.00 unit: "php" appl+ 

+100 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 134876 andrew    20   0   9.5m S   0.0   1 1   1.0   0:00.02 unit: "php" appl+ 

+1000 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 134876 andrew    20   0   9.7m S   0.0   1 0   1.0   0:00.23 unit: "php" appl+ 

+10,000 hits

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
 134876 andrew    20   0  10.4m S   0.0   1 1   1.1   0:02.24 unit: "php" appl+ 

So, similar behaviour again... I'll dig a little deeper.

ac000 avatar Apr 04 '25 15:04 ac000

Interesting!! Here is some info from a test I did this morning that goes along with your observation.

  1. Started a fresh container. Noted the baseline (A) info.
  2. Then hit it with load without session_start(). There is an immediate jump in memory use, but it appears to stall, though if you look at it after a while, it has gone up by several MBs. Still, let's pick a memory value and keep it as baseline (B). To me, this would cover all the PHP processes and anything else that needs to be running to handle the load.
  3. Now, enable session_start(). Noticeable continued increase while you watch. Keep it going.
  4. Without specifying otherwise, sessions are saved in /tmp. If you do du -sh inside /tmp, you can see the size increasing, obviously.
  5. Say you run this for a few minutes, then stop the load and clear the /tmp folder entirely. If the memory increase were only from session files, it should come back to baseline (B). It does not; it stays tens of MBs higher than baseline (B).

kapyaar avatar Apr 04 '25 15:04 kapyaar

Right, /tmp is usually on tmpfs which is backed by memory.

Do you see the same issue if you have the sessions stored on a real fs, e.g. /var/tmp (assuming it's not just a symlink to /tmp)?
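Whether a given directory really is memory-backed can be checked from the mount table; a quick sketch:

```shell
# Show the filesystem type backing /tmp; "tmpfs" means session files
# written there live in RAM and count against the container's memory.
df -T /tmp | awk 'NR == 2 { print $1, $2 }'
```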

Having played a little with ASAN (AddressSanitizer), it's not showing any run-time leaks (we do seem to leak some memory at startup that isn't freed at shutdown), but the PHP application process does still seemingly grow over time... next is to see what valgrind(1) finds.
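For what it's worth, a foreground leak check like that can be sketched as below; the flags are illustrative and not necessarily what was used here, and the guard is only there because valgrind/unitd may not be installed:

```shell
# Run the Unit daemon in the foreground under valgrind so leak
# summaries are printed at exit (flags illustrative; best against a
# debug build of unitd).
if command -v valgrind >/dev/null && command -v unitd >/dev/null; then
    valgrind --leak-check=full --trace-children=yes \
        unitd --no-daemon --log /dev/stdout
else
    echo "valgrind and/or unitd not installed"
fi
```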

ac000 avatar Apr 04 '25 16:04 ac000

I had done something similar already with similar results, but for confirmation, I did it again. I made the following changes to the Dockerfile:

# Create session directory and set correct permissions
RUN mkdir -p /var/lib/php/sessions && \
    chown -R www-data:www-data /var/lib/php/sessions && \
    chmod -R 770 /var/lib/php/sessions

config.json

"applications": {
    "php": {
        "type": "php",
        "root": "/var/www/html/",
        "user": "www-data",
        "group": "www-data",
        "options": {
            "admin": {
                "session.save_path": "/var/lib/php/sessions"
            }
        },
        "processes": 20,
        "index": "index.php"
    }
}

I hope this is what you were looking for. I ran the same tests.

  1. Fresh container (baseline A):
CONTAINER ID   NAME      CPU %     MEM USAGE / LIMIT     MEM %     NET I/O       BLOCK I/O   PIDS
705a2a378671   phpApp    0.01%     53.18MiB / 15.49GiB   0.34%     1.05kB / 0B   0B / 0B     41
  2. Load test without session_start() for a few mins (baseline B):
705a2a378671   phpApp    158.43%   71.93MiB / 15.49GiB   0.45%     63.6MB / 58.5MB   0B / 0B     42
  3. Enable session_start() in test.php, continue the test for a few mins, then stop:
705a2a378671   phpApp    0.01%     440.1MiB / 15.49GiB   2.78%     176MB / 190MB   0B / 0B     42

At this point,

root@705a2a378671:/var/lib/php/sessions# du -sh
486M

3a. I was in another window for a bit, and when I came back a minute or so later, there was a small jump in memory.

705a2a378671 phpApp 0.01% 458.9MiB / 15.49GiB 1.03% 176MB / 190MB 0B / 0B 43

  4. rm * in the sessions folder (had to iterate, due to the high number of sess_* files). All files seem gone, though some memory is still in use.
root@705a2a378671:/var/lib/php/sessions# ls -lh
total 0
root@705a2a378671:/var/lib/php/sessions# du -sh
8.2M

And phpApp at this point

705a2a378671   phpApp    0.01%     162.9MiB / 15.49GiB   1.03%     176MB / 190MB   0B / 0B     43

While finishing this writeup, I checked back on the container, and interestingly, it is back to

705a2a378671 phpApp 0.02% 53.53MiB / 15.49GiB 0.34% 1.05kB / 0B 0B / 0B 41

This final jump back to the starting point made me redo the test. Same general behaviour, but this time I set a timer and kept watching docker stats.

  1. I did not see that small blip (3a) from the previous test.
  2. Removed the session files.
root@705a2a378671:/var/lib/php/sessions# du -sh
502M 
root@705a2a378671:/var/lib/php/sessions# find -type f -print0 | xargs -0 rm
root@705a2a378671:/var/lib/php/sessions# ls -lh
total 0
root@705a2a378671:/var/lib/php/sessions# du -sh
8.9M

And the container goes to

705a2a378671 phpApp 0.01% 150.9MiB / 15.49GiB 0.95% 198MB / 212MB 0B / 0B 42


At about 1 min 45 seconds (thereabouts) I saw that major dip that I was looking for.
`705a2a378671   phpApp    0.01%     109.9MiB / 15.49GiB   0.69%     198MB / 212MB   0B / 0B     42`

However, it did not go all the way back to ~53MB. I waited quite a bit, but it remained at 109MB.

kapyaar avatar Apr 04 '25 18:04 kapyaar

However, it did not go all the way back to ~53MB. I waited quite a bit, but it remained at 109MB.

I think it would be better to look at the Unit processes' memory usage rather than the container as a whole.

The kernel will aggressively use free memory (free memory is wasted memory; better to use it for caching).
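That split between application memory and reclaimable page cache is visible in /proc/meminfo; a quick way to see it inside the container:

```shell
# How much of "used" memory is really reclaimable page cache rather
# than memory held by processes:
grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached):' /proc/meminfo
```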

I think we do leak some memory, valgrind(1) certainly thinks so and testing seems to indicate so; however, I don't really see anything particular to sessions...

ac000 avatar Apr 04 '25 20:04 ac000

You are right, sessions may not have anything to do with this. The only reason I picked them is that, after so many days of going back and forth with inconsistent findings, they were the one thing I could use to repeatably reproduce the behaviour. As you mentioned, I looked into the processes, and my findings over the weekend may confirm your observation. I disabled most of the session-creating scripts and let it run, exposed to a copy of live traffic. The memory swells, but not by much in the sessions folder. As you suggested, I looked at the processes. I have other containers like redis/mysql etc., but they are running stable, so I am omitting them for brevity.

CONTAINER ID   NAME              CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O        PIDS
ae5d9e91f7ef   phpApp            32.82%    580.4MiB / 7.629GiB   7.43%     159GB / 148GB     8.19kB / 51MB    31

[root@ip-xx ec2-user]# cd /tmp
[root@ip-xx tmp]# du -sh
40M	.
[root@ip-xx tmp]#

root@ae5d9e91f7ef:/var/www/html# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.3  37028 30840 ?        Ss   Apr03   0:15 /usr/bin/python3 /usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf
root           6  0.0  0.0  35056  6904 ?        S    Apr03   0:00 unit: main v1.34.1 [unitd --no-daemon --log /dev/stdout]
root           7  0.0  0.0   3976  2384 ?        S    Apr03   0:00 /usr/sbin/cron -f
root           9  0.0  0.0   2576   892 ?        S    Apr03   0:00 sh -c sleep 3;php /var/www/html/kafka/redisDataConsumer.php
unit          13  0.0  0.0  22308  6868 ?        S    Apr03   0:00 unit: controller
unit          14  3.8  5.2 1089128 420632 ?      Sl   Apr03  74:37 unit: router
unit          15  0.0  0.3 311812 30392 ?        S    Apr03   0:00 unit: "php" prototype
unit          16  1.3  1.0 885292 84536 ?        R    Apr03  26:08 unit: "php" application
unit          17  1.1  1.0 834108 81760 ?        S    Apr03  21:54 unit: "php" application
unit          18  1.3  1.0 885412 86508 ?        S    Apr03  26:13 unit: "php" application
unit          19  1.3  1.1 885992 88796 ?        S    Apr03  26:10 unit: "php" application
unit          20  1.3  1.0 875052 86604 ?        S    Apr03  26:10 unit: "php" application
unit          21  1.3  1.0 885300 87044 ?        S    Apr03  26:16 unit: "php" application
unit          22  1.3  1.0 885292 86792 ?        R    Apr03  26:20 unit: "php" application
unit          23  1.3  1.0 885312 85656 ?        S    Apr03  26:22 unit: "php" application
unit          24  1.3  1.0 885296 87600 ?        S    Apr03  26:08 unit: "php" application
unit          25  1.3  1.1 885280 90412 ?        S    Apr03  26:07 unit: "php" application
unit          26  1.3  1.1 885284 88432 ?        S    Apr03  26:11 unit: "php" application
unit          27  1.3  1.0 885396 86880 ?        S    Apr03  26:01 unit: "php" application
unit          28  1.3  1.1 885388 90552 ?        S    Apr03  26:20 unit: "php" application
unit          29  1.3  1.0 885396 85948 ?        S    Apr03  26:11 unit: "php" application
unit          30  1.3  1.0 885392 86912 ?        S    Apr03  26:16 unit: "php" application
unit          31  1.3  1.1 885372 88388 ?        S    Apr03  26:11 unit: "php" application
unit          32  1.3  1.0 885384 86780 ?        S    Apr03  26:11 unit: "php" application
unit          33  1.3  1.0 885388 87120 ?        S    Apr03  25:58 unit: "php" application
unit          34  1.3  1.0 885396 86252 ?        S    Apr03  26:15 unit: "php" application
unit          35  1.3  1.0 885396 86500 ?        S    Apr03  26:00 unit: "php" application
root          39  0.3  0.3  91080 28156 ?        S    Apr03   6:04 php /var/www/html/kafka/redisDataConsumer.php
root        4460  0.0  0.0   4320  3748 pts/0    Ss+  Apr04   0:00 bash
root        7305  0.0  0.0   4188  3580 pts/1    Ss   02:44   0:00 bash
root        7375  0.0  0.0   8088  3980 pts/1    R+   02:45   0:00 ps aux
root@ae5d9e91f7ef:/var/www/html# 

At this point, if I kill unit: router

root@ae5d9e91f7ef:/var/www/html# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.3  37028 30840 ?        Ss   Apr03   0:15 /usr/bin/python3 /usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf
root           6  0.0  0.0  35056  6960 ?        S    Apr03   0:00 unit: main v1.34.1 [unitd --no-daemon --log /dev/stdout]
root           7  0.0  0.0   3976  2384 ?        S    Apr03   0:00 /usr/sbin/cron -f
root           9  0.0  0.0   2576   892 ?        S    Apr03   0:00 sh -c sleep 3;php /var/www/html/kafka/redisDataConsumer.php
unit          13  0.0  0.0  22308  6872 ?        S    Apr03   0:00 unit: controller
root          39  0.3  0.3  91080 28156 ?        S    Apr03   6:04 php /var/www/html/kafka/redisDataConsumer.php
root        4460  0.0  0.0   4320  3748 pts/0    Ss+  Apr04   0:00 bash
unit        7382  3.8  0.2 206184 17136 ?        Sl   02:47   0:03 unit: router
unit        7383  0.0  0.3 311816 30392 ?        S    02:47   0:00 unit: "php" prototype
unit        7384  1.3  0.3 316988 27960 ?        S    02:47   0:01 unit: "php" application
unit        7385  1.3  0.3 316984 27808 ?        S    02:47   0:01 unit: "php" application
unit        7386  1.3  0.3 316992 27872 ?        S    02:47   0:01 unit: "php" application
unit        7387  1.3  0.3 316996 27612 ?        S    02:47   0:01 unit: "php" application
unit        7388  1.3  0.3 316996 27892 ?        S    02:47   0:01 unit: "php" application
unit        7389  1.3  0.3 317004 27624 ?        S    02:47   0:01 unit: "php" application
unit        7390  1.3  0.3 317004 27628 ?        S    02:47   0:01 unit: "php" application
unit        7391  1.3  0.3 317004 27800 ?        S    02:47   0:01 unit: "php" application
unit        7392  1.3  0.3 317100 28216 ?        S    02:47   0:01 unit: "php" application
unit        7393  1.3  0.3 316948 27696 ?        S    02:47   0:01 unit: "php" application
unit        7394  1.3  0.3 317100 27760 ?        S    02:47   0:01 unit: "php" application
unit        7395  1.3  0.3 316960 27640 ?        S    02:47   0:01 unit: "php" application
unit        7396  1.3  0.3 317016 27632 ?        S    02:47   0:01 unit: "php" application
unit        7397  1.3  0.3 317084 28116 ?        S    02:47   0:01 unit: "php" application
unit        7398  1.3  0.3 317088 27856 ?        S    02:47   0:01 unit: "php" application
unit        7399  1.3  0.3 317104 28164 ?        S    02:47   0:01 unit: "php" application
unit        7400  1.3  0.3 317088 28088 ?        S    02:47   0:01 unit: "php" application
unit        7401  1.3  0.3 317092 27948 ?        S    02:47   0:01 unit: "php" application
unit        7402  1.3  0.3 317096 28256 ?        S    02:47   0:01 unit: "php" application
unit        7403  1.3  0.3 317100 28084 ?        S    02:47   0:01 unit: "php" application
root        7412  0.9  0.0   4188  3588 pts/1    Ss   02:49   0:00 bash
root        7418  0.0  0.0   8088  4044 pts/1    R+   02:49   0:00 ps aux


CONTAINER ID   NAME              CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
ae5d9e91f7ef   phpApp            32.31%    137.3MiB / 7.629GiB   1.76%     159GB / 149GB     1.49MB / 53.7MB   31

Obviously, the memory difference does not correspond to session files, and killing the router brings memory use down, but not to where it started.

Again, this may not mean anything; just putting it out there in case it helps ask better questions. Let me know if you want me to run any tests.

kapyaar avatar Apr 07 '25 16:04 kapyaar

@ac000 Just wanted to follow up on this. Is this being looked into further? Or do you think the memory issue has not much to do with Unit and is a PHP thing?

kapyaar avatar Apr 16 '25 13:04 kapyaar

Looking again with PHP 8.4.5.

We do leak memory at startup/shutdown, but not too worried about that.

However I don't currently see any per-request memory leak. I.e. the amount of memory leaked after 1 request is the same as after 10 requests.

That's not to say they don't exist; they may only happen with certain PHP calls.

However with both

<?php
?>

and

<?php
$file = __DIR__ . '/text.txt';

if (is_file($file) && is_writable($file)) {
    @unlink($file);
    echo '<small style="color: #ccc;">' . $file . ' was deleted.</small><br>' . PHP_EOL;
}

echo '<p>Calling to <code>fastcgi_finish_request()</code>.</p>' . PHP_EOL;

echo '<p>If success, the file ' . $file . ' will be created.</p>' . PHP_EOL;

if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();
} else {
    echo '<p style="color: red;">This server does not support <code>fastcgi_finish_request()</code> function.</p>' . PHP_EOL;
    echo 'Exit now.<br>' . PHP_EOL;
    exit();
}

echo 'This line will be not echo out.<br>' . PHP_EOL;

file_put_contents($file, date('Y-m-d H:i:s') . PHP_EOL, FILE_APPEND);
?>

I'm not seeing any...

ac000 avatar Apr 16 '25 18:04 ac000

@ac000 Yes, with basic PHP scripts like the ones you refer to, it does not seem to cause any memory increase; I get the same results as you. However, when I run my project, there is obviously some issue. I narrowed it down to sessions as they were what I could use to repeatably reproduce this behaviour. Maybe it is something else altogether, but I don't know what. Hence, I am sharing some test results and requesting some comparison tests from your end.

If you run the test.php from my earlier comment with

a. sessions created per request, and b. config.json changed to have just 2 (instead of 20) PHP processes, to keep things simpler,

and do a k6 load test, do you not observe the memory swell in docker stats or htop?

my k6 script looks like

import http from 'k6/http';
import { check } from 'k6';
export default function () {
  const res = http.get('http://localhost/test.php');

  check(res, {
    'status is 200': (r) => r.status === 200,
  });
}

And I do the following

k6 run --vus 50 --duration 100s script.js

Here are some screenshots on what I get.

1. On boot

Image

Image

2. test.php without session, after 100 seconds of k6 run. http_reqs......................: 173165 1731.197939/s

Image

Image

3. Restart container to reset the memory baseline; test.php with sessions, after 100 seconds of k6 run. http_reqs......................: 77856 777.438301/s

Image

Image

OK, at this point the container swells to 288MB. The sessions folder is 310MB. Not sure why the container shows lower than the sessions folder. I delete all session files, and I get

Image

Image

Container stays at 112MB. I am curious whether it is just me seeing this. If not, is it not right to assume that the container memory usage should at least come down to ~52MB or thereabouts, given what we saw in the test results without sessions?

kapyaar avatar Apr 22 '25 16:04 kapyaar

I'm really only concerned with the memory usage of Unit. The PHP processes seem to start at around 10M then finish off at around 25M.

If you continue your testing, do they grow even bigger?

I am not seeing any obvious memory leak from


<?php
if (session_status() != PHP_SESSION_ACTIVE) {
    session_start();
    $_SESSION['user'] = "admin";
}

if (isset($_SESSION['user'])) {
    echo "User: " . $_SESSION['user'];
}
?>

valgrind(1) reports

LEAK SUMMARY:
    definitely lost: 408 bytes in 2 blocks
    indirectly lost: 0 bytes in 0 blocks
      possibly lost: 1,664 bytes in 13 blocks
    still reachable: 68,526 bytes in 1,804 blocks
         suppressed: 2,421,046 bytes in 17,365 blocks

For a single request and after 10 requests. What is leaked is just one off initialisation stuff...

Again, I'm not saying 100% that there are no leaks; it's just that I'm not currently seeing a smoking gun...

However, I am seeing that some scripts will cause a one-off memory increase, e.g.

Just starting Unit

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  18997 andrew    20   0   6.2m S   0.0   1 1   0.6   0:00.00 unit: "php" appl+ 

After calling the above session script

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  18997 andrew    20   0  10.2m S   0.0   1 1   1.1   0:00.00 unit: "php" appl+ 

Well, we know that already... but let's now load a different script that just does a phpinfo()

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  18997 andrew    20   0  12.8m S   0.0   1 0   1.3   0:00.00 unit: "php" appl+ 

Let's do it the other way around, phpinfo() then the session script

At start

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19032 andrew    20   0   6.2m S   0.0   1 1   0.6   0:00.00 unit: "php" appl+ 

After phpinfo()

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19032 andrew    20   0  12.6m S   0.0   1 1   1.3   0:00.00 unit: "php" appl+ 

After session script

    PID USER      PR  NI    RES S  %CPU nTH P  %MEM     TIME+ COMMAND           
  19032 andrew    20   0  12.8m S   0.0   1 1   1.3   0:00.00 unit: "php" appl+ 

So we ended up at around the same amount.

But these are just one-off increases, not per-request...

Are you doing anything with memcached or the like, or database stuff in general, where memory may not be getting freed from old connections?

Depending on what you're using, there could be stuff being leaked/cached in underlying libraries.

Are you using an opcache?

is it not right to assume that the container memory usage should at least come down to ~52MB or thereabouts

I'm not sure I would necessarily assume that.

If you want to investigate your overall memory usage, /proc/meminfo is a good place to start... but even free -h should give some clue to buffer/cache usage...
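The headline numbers free(1) reports can also be derived straight from /proc/meminfo; a sketch of the arithmetic, useful for seeing how much of "used" memory is really reclaimable cache rather than a leak:

```shell
# total vs. actually-available memory; the gap is largely page cache
# the kernel will give back under pressure.
awk '/^MemTotal:/     { total = $2 }
     /^MemAvailable:/ { avail = $2 }
     END { printf "total: %d MiB, available: %d MiB, used+cache: %d MiB\n",
                  total / 1024, avail / 1024, (total - avail) / 1024 }' /proc/meminfo
```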

ac000 avatar Apr 22 '25 18:04 ac000

@ac000 thank you for the hints. free -h shows buff/cache is a swell spot. After a bunch of testing, it appears to me that a combination of

  1. periodic clearing of buff/ cache
  2. occasional kill of "unit: router"
  3. setting limits for php processes

seems to keep things in check. Which brings me to my questions:

  1. Given that killing unit: router makes a difference in memory usage (sometimes hundreds of MBs), are there any other places I could check for swell spots?

  2. Is it OK to periodically kill the unit: router? Are there any potential issues that could arise from doing so? It seems to be working fine so far.

  3. Also, I do get a lot of (say in the tens per minute)

[info] 25#30 *2837 recv(135, 7F42601BD4F8, 2048, 0) failed (104: Connection reset by peer)
[info] 25#30 *2869 recv(186, 7F4260070958, 2048, 0) failed (104: Connection reset by peer)

and

[info] 25#29 *2873 writev(191, 8) failed (32: Broken pipe)
[info] 25#30 *2790 writev(43, 8) failed (32: Broken pipe)

I read your comment in another issue that this can be safely ignored. Just wanted to check the same in this context. I do use

if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();
}

Could this be a reason for those messages?

kapyaar avatar May 03 '25 22:05 kapyaar

Could you confirm exactly which process(es) it is that keeps growing in size?

If it's the php application processes then you can always add something like

    "limits": {
        "requests": 10000
    }

to the PHP application part of your config, which will kill and start a new application process after 10,000 requests; adjust as desired.
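If the image is already built, the same limit can be pushed into the running instance over the control socket instead of rebuilding; a sketch, where the socket path is the Docker image's default and an assumption here (match it to whatever `--control` was given):

```shell
# Assumed control socket path; adjust to your unitd --control setting.
SOCK=/var/run/control.unit.sock

if [ -S "$SOCK" ]; then
    # PUT the limits object into the live config of the "php" app.
    curl -X PUT --data-binary '{"requests": 10000}' \
        --unix-socket "$SOCK" http://localhost/config/applications/php/limits
else
    echo "no control socket at $SOCK"
fi
```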

The "Connection reset by peer" messages are not particularly related to the use of fastcgi_finish_request().

ac000 avatar May 06 '25 15:05 ac000

Thank you.

  1. Could you confirm exactly which process(es) it is that keeps growing in size?

On the whole, it is the container that keeps growing, as I mentioned in #Comment. Drilling down into details, it appears to me that PHP processes, buff/cache, and unit: router all have some role to play (at least in my case). Which is fine, as long as the source is known. It would be great to know if occasionally killing "unit: router" is OK.

  2. Is it worth tracking down the root cause of the "Connection reset by peer" messages in the logs? If so, what should I be looking into?

kapyaar avatar May 08 '25 17:05 kapyaar

It would be great to know if occasionally killing "unit: router" is ok?

I mean it's OK in that it gets restarted by the main unit process. Of course you will kill any active connections...

Is it worth tracking down the root cause of "Connection reset by peer" messages in the logs

Probably not; we don't even say which peer... we just log these syscalls on any error...

ac000 avatar May 08 '25 21:05 ac000