Framework fails to start containers on systems using runc >=1.0.0.rc93

Open johnaohara opened this issue 3 years ago • 8 comments

Running tfb across multiple machines, using host networking, fails on systems that have runc >= 1.0.0.rc93 installed.

Older runc versions allowed net.* kernel options to be set in the host's default network namespace. That was a bug, and it was fixed in runc-1.0.0.rc93:

https://github.com/opencontainers/runc/commit/09523b79d01d032d2faeeb7eef4ad5a31eda528f#diff-7f34ebfb08bb04cf9313703ad9d8e4ceee634a5ad8a412326c72333680b588feR164

OS (Please include kernel version)

Linux ******.redhat.com 4.18.0-240.el8.x86_64 #1 SMP Wed Sep 23 05:13:10 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

Expected Behavior

It is expected that the following command will execute the tfb benchmark:

./tfb --mode benchmark --type plaintext --network-mode=host --server-host 192.168.0.1 --database-host 192.168.0.2 --client-host 192.168.0.3 --test quarkus

Actual Behavior

Any container that is started with sysctl net.* kernel configuration (e.g. https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/toolset/utils/docker_helper.py#L322) fails with the following error on the host:

docker: Error response from daemon: OCI runtime create failed: sysctl "net.core.somaxconn" not allowed in host network namespace: unknown.
ERRO[0000] error waiting for container: context canceled 

The same error is displayed in the tfb output:

--------------------------------------------------------------------------------
Running Test: quarkus
--------------------------------------------------------------------------------
quarkus: Traceback (most recent call last):
quarkus:   File "/FrameworkBenchmarks/toolset/benchmark/benchmarker.py", line 139, in __run_test
quarkus:     test.database.lower())
quarkus:   File "/FrameworkBenchmarks/toolset/utils/docker_helper.py", line 338, in start_database
quarkus:     log_config={'type': None})
quarkus:   File "/usr/local/lib/python2.7/dist-packages/docker/models/containers.py", line 809, in run
quarkus:     container.start()
quarkus:   File "/usr/local/lib/python2.7/dist-packages/docker/models/containers.py", line 400, in start
quarkus:     return self.client.api.start(self.id, **kwargs)
quarkus:   File "/usr/local/lib/python2.7/dist-packages/docker/utils/decorators.py", line 19, in wrapped
quarkus:     return f(self, resource_id, *args, **kwargs)
quarkus:   File "/usr/local/lib/python2.7/dist-packages/docker/api/container.py", line 1093, in start
quarkus:     self._raise_for_status(res)
quarkus:   File "/usr/local/lib/python2.7/dist-packages/docker/api/client.py", line 263, in _raise_for_status
quarkus:     raise create_api_error_from_http_exception(e)
quarkus:   File "/usr/local/lib/python2.7/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
quarkus:     raise cls(e, response=response, explanation=explanation)
quarkus: APIError: 500 Server Error: Internal Server Error ("OCI runtime create failed: sysctl "net.core.somaxconn" not allowed in host network namespace: unknown")
quarkus: Error during test: quarkus
quarkus: Total test time: 0s

Steps to reproduce behavior

Update runc to any version >=1.0.0.rc93
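
To check whether a given host falls in the affected range, something like the following could be used. This is only a sketch: the helper names are hypothetical, and it assumes the version string looks like the first line printed by `runc --version` (e.g. `runc version 1.0.0-rc93`) or the dotted form used in this issue (`1.0.0.rc93`).

```python
import re

# Hypothetical constant: first runc release that rejects net.* sysctls in the
# host network namespace, as described in this issue (1.0.0 rc93).
_FIRST_AFFECTED = (1, 0, 0, 93)


def parse_runc_version(text):
    """Parse '1.0.0-rc93', '1.0.0.rc93' or '1.0.1' into a comparable tuple.

    A final release sorts after every release candidate of the same version,
    so its rc component is treated as infinity.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)(?:[-.~]rc\.?(\d+))?", text)
    if match is None:
        raise ValueError("unrecognised runc version: %r" % text)
    major, minor, patch = (int(g) for g in match.group(1, 2, 3))
    rc = int(match.group(4)) if match.group(4) else float("inf")
    return (major, minor, patch, rc)


def is_affected(text):
    """Return True if this runc version refuses net.* sysctls on the host netns."""
    return parse_runc_version(text) >= _FIRST_AFFECTED
```

For example, `is_affected("runc version 1.0.0-rc92")` is false, while `is_affected("1.0.0.rc93")` and `is_affected("1.0.1")` are both true.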

Other details and logs

tfb should not set net.* kernel parameters when running in host network mode.
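
A minimal sketch of that idea, not necessarily what the eventual fix does: filter the sysctl dict before it is handed to docker. The function name `filter_sysctls` and the example values are assumptions for illustration. The resulting dict would be passed as the `sysctls` argument that docker-py's `containers.run` accepts.

```python
def filter_sysctls(sysctls, network_mode):
    """Return the sysctls that are safe to request for this network mode.

    In host network mode the container shares the host's network namespace,
    and runc >= 1.0.0.rc93 refuses to set net.* sysctls there, so they are
    dropped. All other modes keep the full set.
    """
    if network_mode != "host":
        return dict(sysctls)
    return {
        key: value
        for key, value in sysctls.items()
        if not key.startswith("net.")
    }


# Illustrative values only; the real settings live in docker_helper.py.
requested = {"net.core.somaxconn": 65535, "kernel.sem": "250 32000 256 512"}

host_sysctls = filter_sysctls(requested, "host")      # net.* removed
bridge_sysctls = filter_sysctls(requested, "bridge")  # unchanged copy
```

Dropping the parameter silently means the host's own `net.core.somaxconn` value applies in host mode, which is why the effect on results would need checking.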

It is possible to work around the current issue by downgrading containerd to containerd.io-1.4.3-3.1.

johnaohara avatar Apr 15 '21 10:04 johnaohara

Hi,

I have a branch that contains a potential bug fix: https://github.com/johnaohara/FrameworkBenchmarks/tree/6538-sysctl

Because this change alters which kernel params are set depending on network type, it could affect results, so I wanted to be sure that there were no negative side effects before opening a PR.

I have looked in the contributing docs and the GitHub Action that runs tests when a PR is opened, but cannot see anywhere that host network mode is tested. Are there any tests that run specifically in host network mode, or is running in this mode manually validated?

Any help in validating these changes would be greatly appreciated.

Thanks

John

johnaohara avatar Apr 15 '21 10:04 johnaohara

Hey @johnaohara,

Thanks for looking into this. Go ahead and open up the PR and we'll look at the changes and discuss there before doing any merges. We would have to test manually.

NateBrady23 avatar Apr 16 '21 20:04 NateBrady23

Hi @nbrady-techempower, PR created: #6547. Thanks

johnaohara avatar Apr 22 '21 11:04 johnaohara

Thanks @johnaohara. It's merged, but going to leave this open to check on the results at https://tfb-status.techempower.com once the next full run starts. It'll be included in the run after Run ID: f2ce9b13-78dd-4d2c-b11d-47c641dddece

NateBrady23 avatar Apr 22 '21 19:04 NateBrady23

Thanks, sounds like a good plan

johnaohara avatar Apr 23 '21 05:04 johnaohara

@nbrady-techempower There have been a few runs with the changes merged. When would be a good time to review any impact? Thanks

johnaohara avatar May 11 '21 08:05 johnaohara

@nbrady-techempower any update about checking results after #6547 was merged? thanks

johnaohara avatar Aug 13 '21 12:08 johnaohara

@johnaohara I don’t think we saw any issues with results here. I’m out of office for a bit, but feel free to look at https://tfb-status.techempower.com if you have any concerns. There are a lot of maintainers who would have caught anything disruptive.

NateBrady23 avatar Aug 13 '21 14:08 NateBrady23