runc icon indicating copy to clipboard operation
runc copied to clipboard

libct/seccomp: enable seccomp binary tree optimization

Open kolyshkin opened this issue 3 years ago • 19 comments

This makes libseccomp produce a BPF which uses a binary tree for syscalls (instead of linear set of if statements).

This should speed up doing syscalls from containers that were created from OCI spec containing lots of seccomp rules.

kolyshkin avatar Mar 08 '22 03:03 kolyshkin

Tests hung at

runc run [seccomp] (empty listener path) most probably due to newer libseccomp-golang.

Investigating.

kolyshkin avatar Mar 09 '22 02:03 kolyshkin

OK, this is not caused by the libseccomp-golang bump (as I have suspected), but by SetOptimize(2) call.

kolyshkin avatar Mar 10 '22 03:03 kolyshkin

The only test that hangs is runc run [seccomp] (empty listener path), others run fine.

This test does not set any seccomp rules; it also seems it is not named correctly.

kolyshkin avatar Mar 10 '22 04:03 kolyshkin

OK, so the key is to not enable binary tree optimization when there are no rules.

Also, it makes little sense to enable binary tree optimization when there are just a few rules. I arbitrarily chose 8 as the value of "a few" here.

kolyshkin avatar Mar 14 '22 21:03 kolyshkin

OK, so the key is to not enable binary tree optimization when there are no rules.

This is because of https://github.com/seccomp/libseccomp/issues/370

kolyshkin avatar Mar 14 '22 21:03 kolyshkin

The only test that hangs is runc run [seccomp] (empty listener path), others run fine.

This test does not set any seccomp rules; it also seems it is not named correctly.

Opened https://github.com/opencontainers/runc/issues/3415 wrt test name and purpose.

kolyshkin avatar Mar 14 '22 21:03 kolyshkin

This should speed up doing syscalls from containers that were created from OCI spec containing lots of seccomp rules.

Any benchmark?

AkihiroSuda avatar Mar 15 '22 05:03 AkihiroSuda

Any benchmark?

Some are available from https://github.com/seccomp/libseccomp/pull/152/commits/c9df08870b87bf81f037d1df30eab83050261ff6

kolyshkin avatar Mar 15 '22 22:03 kolyshkin

Rebased.

kolyshkin avatar Mar 23 '22 01:03 kolyshkin

Any benchmark?

Some are available from seccomp/libseccomp@c9df088

@AkihiroSuda the above links has a short test result. If that is not enough, please see https://github.com/seccomp/libseccomp/issues/116 and https://github.com/seccomp/libseccomp/pull/152

I tried to do my own tests but apparently measuring syscall latency is some sort of dark art I am not very good at, and later I got carried away with the other stuff.

kolyshkin avatar Mar 23 '22 01:03 kolyshkin

Rebased to resolve a conflict due to #3441. @AkihiroSuda @cyphar PTAL

kolyshkin avatar Mar 31 '22 03:03 kolyshkin

@opencontainers/runc-maintainers PTAL

kolyshkin avatar Apr 14 '22 01:04 kolyshkin

Rebased on top of merged #3465

kolyshkin avatar May 19 '22 22:05 kolyshkin

Needs a small comment explaining why > 8 is the threshold.

perhaps defining a const for it (where the const is documented) would be nice (just a comment would of course work).

thaJeztah avatar Aug 02 '22 10:08 thaJeztah

  • Rebased to resolve conflicts.
  • Added a (somewhat weak) explanation of why 8 is used as a threshold (copy/pasting it here):
       // Enable libseccomp binary tree optimization. The number 8 is chosen
       // semi-arbitrarily, considering the following:
       // 1. libseccomp <= 2.5.4 misbehaves when binary tree optimization
       // is enabled and there are 0 rules.
       // 2. All known libseccomp versions (2.5.0 to 2.5.4) generate a binary
       // tree with 4 syscalls per node.
       if len(config.Syscalls) > 8 {
               err = filter.SetOptimize(2)
               ....

@drakenclimber perhaps you have a better idea of when we should use bintree optimization (without introducing a knob for it, that it).

kolyshkin avatar Aug 04 '22 01:08 kolyshkin

  • Rebased to resolve conflicts.

    • Added a (somewhat weak) explanation of why 8 is used as a threshold (copy/pasting it here):
       // Enable libseccomp binary tree optimization. The number 8 is chosen
       // semi-arbitrarily, considering the following:
       // 1. libseccomp <= 2.5.4 misbehaves when binary tree optimization
       // is enabled and there are 0 rules.
       // 2. All known libseccomp versions (2.5.0 to 2.5.4) generate a binary
       // tree with 4 syscalls per node.
       if len(config.Syscalls) > 8 {
               err = filter.SetOptimize(2)
               ....

@drakenclimber perhaps you have a better idea of when we should use bintree optimization (without introducing a knob for it, that it).

The binary tree should help improve performance when the filter is large. Originally I had envisioned automatically turning it on when there were greater than X syscalls in the seccomp filter, but some users preferred to control optimizations themselves.

If you want to avoid a knob, then I would pick a somewhat arbitrary number, say 32, and enable the binary tree when the filter is larger than that.

drakenclimber avatar Aug 04 '22 22:08 drakenclimber

@drakenclimber perhaps you have a better idea of when we should use bintree optimization (without introducing a knob for it, that it).

So to actually answer your question :), no I don't have a better answer than what you've done.

drakenclimber avatar Aug 04 '22 22:08 drakenclimber

I would pick a somewhat arbitrary number, say 32, and enable the binary tree when the filter is larger than that.

Changed 8 to 32, slightly improved the comment.

kolyshkin avatar Aug 05 '22 01:08 kolyshkin

@cyphar LGTY?

AkihiroSuda avatar Aug 16 '22 23:08 AkihiroSuda

@cyphar LGTY?

Assuming @cyphar is good with this change, as his only request was t add an explanation for the number of rules threshold, which was added.

Let's merge it and face the consequences.

kolyshkin avatar Nov 01 '22 00:11 kolyshkin