Tuning of individual kernel threads

Open adriaan42 opened this issue 1 year ago • 18 comments

In combination with #596 and #580, this PR implements the third feature needed to dynamically tune all relevant aspects of a realtime application using a dedicated HW device (typically a NIC).

Two things might need some discussion:

  • In my implementation I kept the basic idea that one instance can cover a number of different groups of threads. That makes it easy to migrate from the current scheduler plugin, but means we still need _has_dynamic_options, which is marked as a hack in plugins/base.py. The alternative would be to have one plugin instance per "group", which would make the profiles much longer. For example:
    group.ktimers=0:f:2:*:^\[ktimers
    
    would become something like
    [kthread_ktimers]
    type=kthread
    regex=^ktimers
    policy=fifo
    sched_prio=2
    affinity=*
    
  • I copied the approach of using perf to monitor for creation of new threads. That means that when running both the scheduler plugin and the kthread plugin, we'd have two threads doing the same thing. For my applications that's not a problem, because I no longer use the scheduler plugin at all:
    • scheduler handles three things: IRQ affinities, kernel threads, and userland threads
    • For IRQ affinities I can use the irq plugin
    • For kernel threads I can use kthread
    • For userland threads I use systemd and cgroupv2, and I don't want TuneD to touch them
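
To make the first discussion point concrete, here is a minimal sketch of how one compact group.* rule could be expanded into the per-instance form shown above. It assumes the scheduler plugin's documented field order (rule priority, policy letter, scheduler priority, affinity, regex). The helper names (GroupRule, parse_group_option, to_instance_section) and the long policy names other than fifo are hypothetical and not taken from this PR.

```python
from dataclasses import dataclass

# Policy letters follow the scheduler plugin's group.* syntax.
# Only "fifo" appears in the example above; the other long names are assumptions.
POLICY_NAMES = {"f": "fifo", "b": "batch", "r": "rr", "o": "other", "*": "*"}


@dataclass
class GroupRule:
    name: str        # text after "group." in the option key
    rule_prio: int   # ordering of rules; has no counterpart in the long form
    policy: str      # single policy letter, e.g. "f"
    sched_prio: str  # scheduler priority, or "*" to leave it unchanged
    affinity: str    # CPU affinity, or "*" to leave it unchanged
    regex: str       # pattern matched against the thread name


def parse_group_option(key: str, value: str) -> GroupRule:
    """Parse one 'group.NAME=RULE_PRIO:SCHED:PRIO:AFFINITY:REGEX' option."""
    prefix = "group."
    if not key.startswith(prefix):
        raise ValueError(f"not a group option: {key!r}")
    # The regex may itself contain ':', so split at most four times.
    parts = value.split(":", 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}")
    rule_prio, policy, sched_prio, affinity, regex = parts
    return GroupRule(key[len(prefix):], int(rule_prio), policy,
                     sched_prio, affinity, regex)


def to_instance_section(rule: GroupRule) -> str:
    """Render the rule as a hypothetical one-instance-per-group section."""
    lines = [
        f"[kthread_{rule.name}]",
        "type=kthread",
        f"regex={rule.regex}",  # passed through unchanged
        f"policy={POLICY_NAMES.get(rule.policy, rule.policy)}",
        f"sched_prio={rule.sched_prio}",
        f"affinity={rule.affinity}",
    ]
    return "\n".join(lines)


rule = parse_group_option("group.ktimers", r"0:f:2:*:^\[ktimers")
print(to_instance_section(rule))
```

One compact line turns into six, which is the profile growth mentioned above. The sketch copies the regex verbatim and drops the rule priority, since the long form has no field for it.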

adriaan42 avatar Apr 17 '24 14:04 adriaan42

> but means we still need _has_dynamic_options, which is marked as a hack in plugins/base.py.

That's OK with me. In the long term it's a candidate for a rewrite/refactor, but other plugins use it as well. We will probably keep the idea, and if we change the implementation, all affected plugins could then be updated the same way.

yarda avatar May 23 '24 19:05 yarda

Regarding cgroups, there is support for cgroups v1 in the scheduler plugin, and we would also like to add support for v2 for completeness. It could be useful for somebody.

It's OK if you are not using some plugin. We even wanted to add a global configuration option that allows selectively disabling specific plugins in the stock profiles.

yarda avatar May 23 '24 20:05 yarda

> Regarding cgroups, there is support for cgroups v1 in the scheduler plugin, and we would also like to add support for v2 for completeness. It could be useful for somebody.

I found the whole cgroup topic to be rather tricky, because on modern systems systemd is the "cgroup manager" and (by convention) owns the cgroup tree. So any new cgroups should be created via systemd, and delegation can then be used to create further sub-groups.

I've had some success with:

  • Set AllowedCPUs on all the default slices (system.slice, user.slice, init.scope) to restrict all "normal" processes. To some extent this replaces the isolcpus= kernel option.
  • Create an isolated.slice using systemd, with access to the desired CPUs, and then use Slice=isolated in my service file (or systemd-run --slice=isolated when launching from a shell) to gain access to the isolated CPUs.
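
As a rough illustration of the two steps above, the following sketch only renders the text of the unit settings involved and does not write any files. The CPU ranges and the helper names (render_unit, render_all) are assumptions made for the example. AllowedCPUs= is the systemd resource-control setting named above.

```python
# Illustrative CPU split; these ranges are assumptions, not values from this thread.
HOUSEKEEPING_CPUS = "0-1"
ISOLATED_CPUS = "2-7"

# Units that hold all "normal" processes, as listed above.
DEFAULT_UNITS = ("system.slice", "user.slice", "init.scope")


def render_unit(unit_name: str, allowed_cpus: str) -> str:
    """Return unit or drop-in text that sets AllowedCPUs= for a slice or scope."""
    if unit_name.endswith(".slice"):
        section = "Slice"
    elif unit_name.endswith(".scope"):
        section = "Scope"
    else:
        raise ValueError(f"unsupported unit type: {unit_name!r}")
    return f"[{section}]\nAllowedCPUs={allowed_cpus}\n"


def render_all() -> dict:
    """Map each unit name to the settings text it would need."""
    units = {name: render_unit(name, HOUSEKEEPING_CPUS) for name in DEFAULT_UNITS}
    # The dedicated slice gets the CPUs kept away from everything else.
    units["isolated.slice"] = render_unit("isolated.slice", ISOLATED_CPUS)
    return units


for name, text in render_all().items():
    print(f"# {name}\n{text}")
```

How the rendered text is installed (for example as drop-ins for the default units and as a new unit file for isolated.slice) is left open here, since the thread does not describe it.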

But simply having TuneD move processes around seems like it could have unwanted side-effects, and should be handled with care...

adriaan42 avatar May 24 '24 10:05 adriaan42