
The latest versions of Nginx Unit fail in the TechEmpower benchmark

Open joanhey opened this issue 2 years ago • 42 comments

Hi, I added Nginx and Nginx Unit to the TechEmpower benchmark. But with the latest versions, Unit is failing with PHP: https://tfb-status.techempower.com/unzip/results.2022-11-25-12-16-16-174.zip/results/20221119184855/php-unit/run/php-unit.log

Old config, which worked without problems and was faster than Nginx: https://github.com/TechEmpower/FrameworkBenchmarks/blob/R20/frameworks/PHP/php/php-unit.dockerfile https://github.com/TechEmpower/FrameworkBenchmarks/blob/R20/frameworks/PHP/php/deploy/nginx-unit.json

Current config, with problems: https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/PHP/php/php-unit.dockerfile https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/PHP/php/deploy/nginx-unit.json

The latest versions fail a lot.

Please, could anybody review this and help with the problem? Thank you.

PS: Help with the Nginx config is welcome too; it only runs the plaintext test. https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/C/nginx/nginx.conf

Be careful, as the benchmark results changed a lot after the Spectre/Meltdown patches. https://github.com/TechEmpower/FrameworkBenchmarks/issues/7321

If you need them, I'll send you the benchmark results. Ask me if you have any questions.

joanhey avatar Nov 29 '22 16:11 joanhey

Looking at the log it seems

php-unit: /usr/local/bin/docker-entrypoint.sh: Applying configuration /docker-entrypoint.d/nginx-unit.json
php-unit: 2022/11/23 16:50:03 [info] 21#21 "benchmark" prototype started

Unit was started with some initial config.

php-unit: 2022/11/23 16:50:03 [info] 105#105 "benchmark" application started
php-unit: /usr/local/bin/docker-entrypoint.sh: OK: HTTP response status code is '200'
php-unit: {
php-unit: 	"success": "Reconfiguration done."
php-unit: }
php-unit: /usr/local/bin/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/...
php-unit: /usr/local/bin/docker-entrypoint.sh: Stopping Unit daemon after initial configuration...
php-unit: /usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
php-unit: 2022/11/23 16:50:03 [alert] 21#21 sendmsg(20, -1, -1, 1) failed (104: Connection reset by peer)

After it was started it was reconfigured, then shut down. All of that happened within the space of a second.

From your point of view, what was the actual sequence of events?

ac000 avatar Nov 29 '22 17:11 ac000
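The "within the space of a second" observation can be checked mechanically. The following is a minimal sketch, not part of the original discussion: it parses the three Unit log lines quoted above and reports the time span and the number of alerts. The regular expression is an assumption based only on the line shape visible in this excerpt, and `parse_unit_log` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Unit log lines copied from the excerpt above (container prefix included).
LOG = """\
php-unit: 2022/11/23 16:50:03 [info] 21#21 "benchmark" prototype started
php-unit: 2022/11/23 16:50:03 [info] 105#105 "benchmark" application started
php-unit: 2022/11/23 16:50:03 [alert] 21#21 sendmsg(20, -1, -1, 1) failed (104: Connection reset by peer)
"""

# Assumed shape: "<container>: <date> <time> [<level>] <pid>#<tid> <message>"
LINE_RE = re.compile(
    r"^(?P<container>[\w-]+): "
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] (?P<pid>\d+)#(?P<tid>\d+) (?P<msg>.*)$"
)


def parse_unit_log(text):
    """Return (timestamp, level, pid, message) for each line shaped like a Unit log entry."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # entrypoint-script output and JSON fragments are skipped
        ts = datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S")
        events.append((ts, m["level"], int(m["pid"]), m["msg"]))
    return events


events = parse_unit_log(LOG)
span = (events[-1][0] - events[0][0]).total_seconds()
alerts = [e for e in events if e[1] == "alert"]
print(f"{len(events)} entries over {span:.0f}s, {len(alerts)} alert(s)")
```

Run against a full `php-unit.log`, the same function would list every alert with its process ID, which makes it easier to see which process reported the `sendmsg` failure.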

The strange thing is that it works locally. It shows all that verbose output, but it works.

It finishes here and starts to verify the tests: image

When the tests finish and the Docker container is closed: image

Before showing the summary: image

But it does not work in GitHub Actions or on the benchmark server.

Failing GitHub Actions log: https://github.com/TechEmpower/FrameworkBenchmarks/actions/runs/3567839471/jobs/5996024544#step:9:4375

joanhey avatar Nov 29 '22 22:11 joanhey

It has been failing since the last 2 versions of Nginx Unit. And sometimes it works!

Here are the logs from an old run, using unit_1.21.0 and php7.4, working without problems. https://tfb-status.techempower.com/unzip/results.2021-01-13-13-38-00-107.zip/results/20201229183947/php-unit

And a current one, using FROM nginx/unit:1.28.0-php8.1 https://tfb-status.techempower.com/unzip/results.2022-11-25-12-16-16-174.zip/results/20221119184855/php-unit

A recent run that half worked, using FROM nginx/unit:1.27.0-php8.1 https://tfb-status.techempower.com/unzip/results.2022-11-13-10-29-24-097.zip/php-unit

A picture of the fortunes results from this run: image

But the json, db and query tests failed verification, and the log shows a lot of errors: https://tfb-status.techempower.com/unzip/results.2022-11-13-10-29-24-097.zip/results/20221107183025/php-unit/run/php-unit.log

Because of that I updated to v1.28, but now it never works.

joanhey avatar Nov 29 '22 22:11 joanhey

When I sent the PR for Nginx Unit 1.28, it was passing the GitHub Actions tests. But it never passed on the benchmark server, and later it failed again in GitHub Actions.

https://github.com/TechEmpower/FrameworkBenchmarks/pull/7691

Working GitHub Actions log: https://github.com/TechEmpower/FrameworkBenchmarks/actions/runs/3447671893/jobs/5753927577#step:9:2154

joanhey avatar Nov 29 '22 23:11 joanhey

As discussed on the community slack https://nginxcommunity.slack.com/archives/C02SS9UB85C/p1670008983875989?thread_ts=1669810946.221959&cid=C02SS9UB85C

This issue is related to #728 and friends. A manual patch of the docker-entrypoint.sh file seems to fix the issue.

Linking the external issue as well.

https://github.com/TechEmpower/FrameworkBenchmarks/pull/7757#pullrequestreview-1203015505

As said, we will ship a fix for the entrypoint script with the next Unit release 1.29.

tippexs avatar Dec 02 '22 19:12 tippexs

If it works in the next run, I'll close the issue.

joanhey avatar Dec 02 '22 20:12 joanhey

I will close it once the fix is merged, if that's okay with you.

tippexs avatar Dec 02 '22 20:12 tippexs

Failed in the last run:

image

image

When the run finishes, I'll send the logs.

joanhey avatar Dec 27 '22 00:12 joanhey

Yes, please share the logs. The new images should have the new Docker entrypoint script. If this is a different kind of issue, we should create a new one.

tippexs avatar Dec 27 '22 08:12 tippexs

It's using the patch in v1.28, and looks like the same issue. https://tfb-status.techempower.com/unzip/results.2022-12-28-17-59-18-855.zip/results/20221222172047/php-unit/build/php-unit.log

https://tfb-status.techempower.com/unzip/results.2022-12-28-17-59-18-855.zip/results/20221222172047/php-unit/run/php-unit.log

I'll update to v1.29

joanhey avatar Dec 29 '22 10:12 joanhey

It failed in the last run, using v1.29. https://github.com/TechEmpower/FrameworkBenchmarks/pull/7839

snunit (Scala) from @lolgab also fails with v1.28, and with v1.29 as of the last commit. https://github.com/TechEmpower/FrameworkBenchmarks/pull/7815 snunit log with v1.29: https://tfb-status.techempower.com/unzip/results.2023-01-04-02-03-49-510.zip/snunit

The only framework that uses v1.28 and works is FastAPI (Python): https://github.com/TechEmpower/FrameworkBenchmarks/tree/master/frameworks/Python/fastapi Log: https://tfb-status.techempower.com/unzip/results.2023-01-04-02-03-49-510.zip/results/20221229020235/fastapi-nginx-unit

joanhey avatar Jan 09 '23 11:01 joanhey

It's the same issue. It is not necessary to open a new one.

joanhey avatar Jan 09 '23 11:01 joanhey

Update: FastAPI (Python) fails with v1.29 in the tests.

https://github.com/TechEmpower/FrameworkBenchmarks/actions/runs/3745949871/jobs/6360874318#step:9:1796

joanhey avatar Jan 14 '23 12:01 joanhey

> It's using the patch in v1.28, and looks like the same issue. https://tfb-status.techempower.com/unzip/results.2022-12-28-17-59-18-855.zip/results/20221222172047/php-unit/build/php-unit.log
>
> https://tfb-status.techempower.com/unzip/results.2022-12-28-17-59-18-855.zip/results/20221222172047/php-unit/run/php-unit.log
>
> I'll update to v1.29

This issue is different from the other ones. It is not related to the control socket. I will have a look at it.

tippexs avatar Jan 14 '23 16:01 tippexs

> Update, Fastapi (python) fail with v1.29 in the tests.
>
> https://github.com/TechEmpower/FrameworkBenchmarks/actions/runs/3745949871/jobs/6360874318#step:9:1796

This log has these lines in it:

fastapi-nginx-unit-orjson: 2022/12/21 02:57:25 [alert] 0#25 [unit] Python failed to call 'asyncio.get_event_loop'
Notice: fastapi-nginx-unit-orjson: 2022/12/21 02:57:25 [notice] 24#24 app process 25 exited with code 1

Which seems like it's the same issue as outlined in #815 perhaps?

travisbell avatar Jan 14 '23 16:01 travisbell

Yes indeed! This is exactly that issue. We debugged the Python code last night. 1.29 works with Python 3.10.8; anything newer than that currently fails. We are working on a fix and I will let you know once it is available. Thanks for the links.

tippexs avatar Jan 14 '23 16:01 tippexs
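The log line above only says that the call to `asyncio.get_event_loop` failed; the thread does not show why, and the actual fix is the patch linked from #815. As a generic illustration only (an assumption on my part, not Unit's code), newer Python releases deprecate relying on `asyncio.get_event_loop()` to create a loop implicitly, so code that needs a loop outside a running coroutine can create one explicitly. `obtain_event_loop` and `ping` are hypothetical names.

```python
import asyncio


def obtain_event_loop():
    """Return a usable event loop without relying on implicit loop creation."""
    try:
        # Inside a coroutine or callback: reuse the loop that is already running.
        return asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: create one explicitly and register it.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop


async def ping():
    # Trivial coroutine used only to show that the loop works.
    await asyncio.sleep(0)
    return "pong"


loop = obtain_event_loop()
try:
    result = loop.run_until_complete(ping())
finally:
    loop.close()
    asyncio.set_event_loop(None)
print(result)
```

The explicit `new_event_loop()` path behaves the same across Python versions, which is why it is a common choice when the interpreter version is not under the application's control.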

> Yes indeed! This is exactly this issue. We debugged the Python code last night. 1.29 is working with Python 3.10.8. anything newer than this will currently fail. We are working on a fix and I will let you know once available. Thanks for the links.

Any progress on this issue?

micaelmalta avatar Jan 18 '23 16:01 micaelmalta

@micaelmalta (and anyone else) feel free to try the patch here https://github.com/nginx/unit/issues/815#issuecomment-1396132323

ac000 avatar Jan 18 '23 21:01 ac000

I have pushed a branch that contains the above fix, if that facilitates easier testing.

ac000 avatar Jan 23 '23 02:01 ac000

v1.29.1 still fails in the benchmark tests, at least with PHP. When the run finishes, I'll send the logs.

joanhey avatar Mar 08 '23 13:03 joanhey

Logs: https://tfb-status.techempower.com/unzip/results.2023-03-10-06-06-32-467.zip/php-unit

https://tfb-status.techempower.com/unzip/results.2023-03-10-06-06-32-467.zip/fastapi-nginx-unit

Check the build and run directories.

joanhey avatar Mar 11 '23 20:03 joanhey

@joanhey Thanks. We will have a look into this. More details on Monday.

tippexs avatar Mar 17 '23 10:03 tippexs

My Scala Native benchmark using libunit works using NGINX Unit 1.29.1.

lolgab avatar Mar 29 '23 15:03 lolgab

It works, but with a lot of errors.

Run logs:

  • snunit: https://tfb-status.techempower.com/unzip/results.2023-03-23-04-22-59-528.zip/results/20230317041717/snunit/run/snunit.log
  • PHP unit (doesn't work): https://tfb-status.techempower.com/unzip/results.2023-03-23-04-22-59-528.zip/results/20230317041717/php-unit/run/php-unit.log

joanhey avatar Mar 29 '23 15:03 joanhey

With Unit v1.31.0 it also fails in the benchmark's GH Actions tests, using FROM unit:1.31.0-php8.2

https://github.com/TechEmpower/FrameworkBenchmarks/actions/runs/6079079817/job/16491119686#step:9:5995

Since it now also fails in the tests, it's easier to test it in isolation and fix the problem.

joanhey avatar Sep 05 '23 12:09 joanhey

What is failing exactly?

ac000 avatar Sep 05 '23 14:09 ac000

I mean the only errors I'm seeing for php-unit are a bunch of socket connection errors

Verifying test db for php-unit caused an exception: HTTPConnectionPool(host='tfb-server', port=8080): Max retries exceeded with url: /dbraw.php (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff2bb167670>: Failed to establish a new connection: [Errno 111] Connection refused'))
   FAIL for http://tfb-server:8080
     Server did not respond to request
     See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
--------------------------------------------------------------------------------
VERIFYING JSON
--------------------------------------------------------------------------------
Accessing URL http://tfb-server:8080/json.php: 
Verifying test json for php-unit caused an exception: HTTPConnectionPool(host='tfb-server', port=8080): Max retries exceeded with url: /json.php (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff2bb166bc0>: Failed to establish a new connection: [Errno 111] Connection refused'))
   FAIL for http://tfb-server:8080
     Server did not respond to request
     See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
--------------------------------------------------------------------------------
VERIFYING QUERY
--------------------------------------------------------------------------------
Accessing URL http://tfb-server:8080/dbquery.php?queries=2: 
Verifying test query for php-unit caused an exception: HTTPConnectionPool(host='tfb-server', port=8080): Max retries exceeded with url: /dbquery.php?queries=2 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff2bb54be80>: Failed to establish a new connection: [Errno 111] Connection refused'))
   FAIL for http://tfb-server:8080
     Server did not respond to request
     See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
--------------------------------------------------------------------------------
VERIFYING UPDATE
--------------------------------------------------------------------------------
Accessing URL http://tfb-server:8080/updateraw.php?queries=2: 
Verifying test update for php-unit caused an exception: HTTPConnectionPool(host='tfb-server', port=8080): Max retries exceeded with url: /updateraw.php?queries=2 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff2bb533b80>: Failed to establish a new connection: [Errno 111] Connection refused'))
   FAIL for http://tfb-server:8080
     Server did not respond to request
     See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
--------------------------------------------------------------------------------
VERIFYING FORTUNE
--------------------------------------------------------------------------------
Accessing URL http://tfb-server:8080/fortune.php: 
Verifying test fortune for php-unit caused an exception: HTTPConnectionPool(host='tfb-server', port=8080): Max retries exceeded with url: /fortune.php (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff2bb56e3e0>: Failed to establish a new connection: [Errno 111] Connection refused'))
   FAIL for http://tfb-server:8080
     Server did not respond to request
     See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
--------------------------------------------------------------------------------
VERIFYING PLAINTEXT
--------------------------------------------------------------------------------
Accessing URL http://tfb-server:8080/plaintext.php: 
Verifying test plaintext for php-unit caused an exception: HTTPConnectionPool(host='tfb-server', port=8080): Max retries exceeded with url: /plaintext.php (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff2bb56ee00>: Failed to establish a new connection: [Errno 111] Connection refused'))
   FAIL for http://tfb-server:8080
     Server did not respond to request
     See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements

What is tfb-server? Is that what's running Unit? Could it be a file descriptor exhaustion issue?

ac000 avatar Sep 08 '23 02:09 ac000
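One way to look into the file descriptor question is to compare the number of open descriptors with the `RLIMIT_NOFILE` limit. The sketch below is an illustration under assumptions, not something taken from the thread: it inspects only the process it runs in, it relies on the Linux `/proc` filesystem, and `fd_usage` is a hypothetical helper name. For the Unit processes themselves, the equivalent information would have to be read inside the container (for example from `/proc/<pid>/limits` and `/proc/<pid>/fd`).

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    open_fds is None where /proc/self/fd is unavailable (non-Linux systems).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open descriptor of this process.
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return open_fds, soft, hard


open_fds, soft, hard = fd_usage()
print(f"open descriptors: {open_fds}, soft limit: {soft}, hard limit: {hard}")
```

If the open count sits close to the soft limit while the benchmark runs, descriptor exhaustion becomes a plausible explanation; if it stays far below, another cause is more likely.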

The last one is only a GH actions test: https://github.com/TechEmpower/FrameworkBenchmarks/actions/runs/6079079817/job/16491119686#step:9:5995

I think the problem is that Unit needs too much time to initialize all the forks (processes/workers).

joanhey avatar Sep 13 '23 15:09 joanhey
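The slow-start hypothesis could be tested by measuring how long the port takes to accept connections after the container starts. This is a minimal sketch under assumptions: `wait_for_port` is a hypothetical helper, the timeout values are arbitrary, and the demo targets a throwaway local listener rather than the real `tfb-server:8080` from the log.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Returns the seconds waited on success, or None if the port never opened.
    """
    start = time.monotonic()
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # ConnectionRefusedError (errno 111 on Linux) and timeouts both land here.
            if time.monotonic() - start >= timeout:
                return None
            time.sleep(interval)


# Demo against a local listener; in the benchmark the target would be the
# host and port shown in the verification log.
with socket.socket() as srv:
    srv.bind(("127.0.0.1", 0))
    srv.listen()
    waited = wait_for_port("127.0.0.1", srv.getsockname()[1], timeout=5)

print("ready" if waited is not None else "never came up")
```

A long but finite wait would support the idea that Unit is still starting its processes when verification begins; a `None` result would suggest the server is not listening at all.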

Previously, Unit was working without problems in the benchmark. Later it started to fail in the benchmark, and now it also fails in the GitHub Actions tests.

joanhey avatar Sep 13 '23 15:09 joanhey

> I think the problem is that Unit needs too much time to initialize all the forks(processes|workers).

Are you saying that Unit itself is failing somehow due to trying to start X amount of processes?

What is X in your case?

Do you have a simple reproducer config?

ac000 avatar Sep 13 '23 23:09 ac000
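For reference, a reproducer config of the kind asked for could be as small as one listener and one PHP application. The sketch below generates such a file; it is an assumption-laden illustration, not the benchmark's actual `nginx-unit.json`. The application name `benchmark` and port 8080 come from the logs in this thread, while the `/php/` root and the process count are placeholders, and `minimal_unit_config` is a hypothetical helper.

```python
import json


def minimal_unit_config(processes, root="/php/", port=8080):
    """Build a minimal Unit configuration with one listener and one PHP app."""
    return {
        "listeners": {
            # Route everything arriving on the port to the single application.
            f"*:{port}": {"pass": "applications/benchmark"}
        },
        "applications": {
            "benchmark": {
                "type": "php",
                "processes": processes,  # the "X" being asked about
                "root": root,            # placeholder document root
            }
        },
    }


config_text = json.dumps(minimal_unit_config(processes=4), indent=4)
print(config_text)
```

Writing the output to a `.json` file in `/docker-entrypoint.d/` would have the image's entrypoint apply it at startup, as the "Applying configuration" log line earlier in the thread shows. Varying `processes` between runs would then show whether startup time depends on the process count.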