unit icon indicating copy to clipboard operation
unit copied to clipboard

unit 1.26.1 + php 7.4 + debian 11 is hanging up

Open iprok opened this issue 3 years ago • 2 comments

Hello!

It's the english version of https://mailman.nginx.org/archives/list/[email protected]/thread/6O5YYIXLBYRQDJMYT6JEW6XZ7EEKGMGY/

We have moved our nextcloud instance to unit 1.26.1 and we are facing the issue when unit stops responding to all requests. Restarting the unit helps, but sometimes needs kill -9.

We have an nginx as reverse proxy in front of unit and we can see the record in its logs: 2022/03/11 10:48:37 [error] 156145#156145: *43083 upstream timed out (110: Connection timed out) while reading response header from upstre am, client: 46.55, server: c.com, request: "GET /thecloud/ocs/v2.php/apps/activity/api/v2/activity?since=1883732&li mit=50&format=json HTTP/1.1", upstream: "[http://127.0.0.1:8000/thecloud/ocs/v2.php/apps/activity/api/v2/activity?sinc...](http://127.0.0.1:8000/thecloud/ocs/v2.php/apps/activity/api/v2/activity?since=1883732&limit=50&fo) rmat=json", host: "c.com"

We turned on debug in the unit and see huge amount of "Resource temporarily unavailable" in the logs, but their timestamps don't relate to the problem.

grep -c 'Resource temporarily unavailable' /var/log/unit.log
94542

Anyway we have tried to increase NoFiles to 100000 and moved php sessions to redis. Without any success.

Part of unit log when the problem starts:

2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 h1p body rest: 0
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 http application 
handler
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 rpc: stream 
#23217 registered
2022/03/11 10:48:36.250 [debug] 244884#244901 mp 7F0600049E30 retain: 2
2022/03/11 10:48:36.250 [debug] 244884#244901 
pthread_mutex_lock(55E27D377000) enter
2022/03/11 10:48:36.250 [debug] 244884#244901 
pthread_mutex_unlock(55E27D377000) exit
2022/03/11 10:48:36.250 [debug] 244884#244901 mp 7F0600049E30 retain: 3
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 request 1345 
bytes shm buffer
2022/03/11 10:48:36.250 [debug] 244884#244901 mp 7F0600000BB0 retain: 25
2022/03/11 10:48:36.250 [debug] 244884#244901 
pthread_mutex_lock(55E27D377138) enter
2022/03/11 10:48:36.250 [debug] 244884#244901 
pthread_mutex_unlock(55E27D377138) exit
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 outgoing mmap buf 
allocation: 7F0600028600 [7F06091F5000,16384] 244884->0,0,24
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 fields_count=13
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0xE6EB, 
0, 4, 7F0600049171 : 21 7F0600049177
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x7F29, 
0, 15, 7F060004918E : 12 7F060004919F
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x25AB, 
0, 10, 7F06000491AD : 5 7F06000491B9
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x1EA0, 
0, 14, 7F06000491C0 : 3 7F06000491D0
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0xA841, 
0, 5, 7F06000491D5 : 1 7F06000491DC
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0xB29A, 
0, 13, 7F06000491DF : 42 7F06000491EE
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x0220, 
0, 10, 7F060004921A : 145 7F0600049226
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0xD871, 
0, 6, 7F06000492B9 : 3 7F06000492C1
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x5F7D, 
0, 12, 7F06000492C6 : 23 7F06000492D4
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0xEDFF, 
0, 12, 7F06000492ED : 36 7F06000492FB
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x23F2, 
0, 6, 7F0600049321 : 277 7F0600049329
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0x9683, 
0, 15, 7F0600049440 : 13 7F0600049451
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 add field 0xEA97, 
0, 15, 7F0600049460 : 10 7F0600049471
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 about to send 
1345 bytes buffer to app process port 12
2022/03/11 10:48:36.250 [debug] 244884#244901 *22759 queue is not empty
2022/03/11 10:48:36.250 [debug] 244884#244901 timer found minimum: 
313989843±50:313748980
2022/03/11 10:48:36.250 [debug] 244884#244901 epoll_wait(36) 
timeout:240863
2022/03/11 10:48:37.114 [debug] 244884#244901 epoll_wait(36): 1
2022/03/11 10:48:37.114 [debug] 244884#244901 *22733 epoll: fd:17 
ev:2001 d:7F060001D390 rd:2 wr:0
2022/03/11 10:48:37.114 [debug] 244884#244901 timer expire minimum: 
313989793:313749844
2022/03/11 10:48:37.114 [debug] 244884#244901 timer found minimum: 
313989843±50:313749844
2022/03/11 10:48:37.114 [debug] 244884#244901 epoll_wait(36) 
timeout:239999
2022/03/11 10:48:37.146 [debug] 244884#244901 epoll_wait(36): 1
2022/03/11 10:48:37.146 [debug] 244884#244901 *22735 epoll: fd:19 
ev:2001 d:7F06000029C0 rd:2 wr:0
2022/03/11 10:48:37.146 [debug] 244884#244901 timer expire minimum: 
313989793:313749876
2022/03/11 10:48:37.146 [debug] 244884#244901 timer found minimum: 
313989843±50:313749876
2022/03/11 10:48:37.146 [debug] 244884#244901 epoll_wait(36) 
timeout:239967
2022/03/11 10:48:37.190 [debug] 244884#244901 epoll_wait(36): 1
2022/03/11 10:48:37.190 [debug] 244884#244901 *22736 epoll: fd:20 
ev:2001 d:7F060000F710 rd:2 wr:0
2022/03/11 10:48:37.190 [debug] 244884#244901 timer expire minimum: 
313989793:313749920
2022/03/11 10:48:37.190 [debug] 244884#244901 timer found minimum: 
313989843±50:313749920
2022/03/11 10:48:37.190 [debug] 244884#244901 epoll_wait(36) 
timeout:239923
2022/03/11 10:48:43.946 [debug] 244884#244901 epoll_wait(36): 1
2022/03/11 10:48:43.946 [debug] 244884#244901 epoll: fd:14 ev:0001 
d:7F0600001430 rd:5 wr:0
2022/03/11 10:48:43.946 [debug] 244884#244901 timer expire minimum: 
313989793:313756676
2022/03/11 10:48:43.946 [debug] 244884#244901 work queue: accept
2022/03/11 10:48:43.946 [debug] 244884#244901 accept4(14): 102
2022/03/11 10:48:43.946 [debug] 244884#244901 client: 127.0.0.1
2022/03/11 10:48:43.946 [debug] 244884#244901 malloc(168): 7F0600005F70
2022/03/11 10:48:43.946 [debug] 244884#244901 posix_memalign(16, 464): 
7F060004B2E0
2022/03/11 10:48:43.946 [debug] 244884#244901 *22761 connections: 26
2022/03/11 10:48:43.946 [debug] 244884#244901 work queue: accept
2022/03/11 10:48:43.946 [debug] 244884#244901 accept4(14) (11: 
Resource temporarily unavailable)
2022/03/11 10:48:43.946 [debug] 244884#244901 work queue: read
2022/03/11 10:48:43.946 [debug] 244884#244901 *22760 http conn init
2022/03/11 10:48:43.946 [debug] 244884#244901 work queue: read
2022/03/11 10:48:43.946 [debug] 244884#244901 *22760 conn read fd:102 
rdy:0 cl:0 er:0 bl:0
2022/03/11 10:48:43.946 [debug] 244884#244901 *22760 epoll 36 set 
event: fd:102 op:1 ev:80002001
2022/03/11 10:48:43.946 [debug] 244884#244901 *22760 timer add: 0±50 
180000:313936676
2022/03/11 10:48:43.946 [debug] 244884#244901 *22760 timer change: 
313936676±50:1
2022/03/11 10:48:43.946 [debug] 244884#244901 timers changes: 1

unit configuration:

    {
        "settings": {
            "http": {
                "max_body_size": 10737418240,
                "body_read_timeout": 3600
            }
        },
        "listeners": {
            "127.0.0.1:8000": {
                "pass": "routes",
            "client_ip": {
                "header": "X-Forwarded-For",
                "source": [
                "127.0.0.1"
                ]
            }
            }
        },

    "routes": [
            {
                "match": {
                    "uri": [
                        "/thecloud/core/ajax/update.php*",
                        "/thecloud/cron.php*",
                        "/thecloud/index.php*",
                        "/thecloud/ocm-provider*.php*",
                        "/thecloud/ocs-provider*.php*",
                        "/thecloud/ocs/v1.php*",
                        "/thecloud/ocs/v2.php*",
                        "/thecloud/public.php*",
                        "/thecloud/remote.php*",
                                            "/thecloud/status.php*",
                        "/thecloud/updater*.php*"
                    ]
                },

    "action": {
                    "pass": "applications/nextcloud/direct"
                }
            },
            {
                "match": {
                    "uri": "/thecloud/ocm-provider*"
                },

    "action": {
                    "pass": "applications/nextcloud/ocm"
                }
            },
            {
                "match": {
                    "uri": "/thecloud/ocs-provider*"
                },

    "action": {
                    "pass": "applications/nextcloud/ocs"
                }
            },
            {
                "match": {
                    "uri": "/thecloud/updater*"
                },

    "action": {
                    "pass": "applications/nextcloud/updater"
                }
            },
            {
                "match": {
                    "uri": "/thecloud/*"
                },
                "action": {
                    "share": "/usr/local$uri",
                    "fallback": {
                        "pass": "applications/nextcloud/index"
                    }
                }
            },
            {
                "action": {
                    "return": 403
                }
            }
        ],

    "applications": {
            "nextcloud": {
                "user": "php-nextcloud",
                "group": "php-nextcloud",
                "limits": {
                        "timeout": 300,
                        "requests": 10000
                },
                    "processes": {
                    "max": 20,
                    "spare": 10,
                    "idle_timeout": 300
                },
                "options": {
                    "admin": {
                        "memory_limit": "512M",
                        "post_max_size": "10G",
                        "upload_max_filesize": "10G",
                        "max_execution_time": "3600",
                        "max_input_time": "3600"
                    }
                },
                "type": "php",
                "targets": {
                    "direct": {
                        "root": "/usr/local/"
                    },

    "index": {
                        "root": "/usr/local/nextcloud/",
                        "script": "index.php"
                    },

    "ocm": {
                        "root": "/usr/local/nextcloud/ocm-provider/",
                        "script": "index.php"
                    },

    "ocs": {
                        "root": "/usr/local/nextcloud/ocs-provider/",
                        "script": "index.php"
                    },

    "updater": {
                        "root": "/usr/local/nextcloud/updater/",
                        "script": "index.php"
                    }
                }
            }
        }
    }

iprok avatar Mar 14 '22 06:03 iprok

thanks for reporting this and sorry for the late response. Is the application process still running when you notice a timeout on the proxy server? ps -elf |grep unit will show the running processes. any dmesg output available for this?

tippexs avatar Mar 15 '22 08:03 tippexs

Hello! Yes the application processed are running when timeouts starts to occure. No relevant dmesg output. We suggested that the problem was caused by some changes in 1.26.1 and we downgraded to the unit 1.25. Seems that this version doesn't have the problem (no timeouts for more than 24h and 1.26.1 gave us timeouts 2-3 times a day).

Just for you information Max Romanov is supporting us in the mailing list https://mailman.nginx.org/archives/list/[email protected]/thread/6O5YYIXLBYRQDJMYT6JEW6XZ7EEKGMGY/ .

iprok avatar Mar 15 '22 09:03 iprok