config-linux: add CFS bandwidth burst

Open kailun-qin opened this issue 3 years ago • 18 comments

The burstable CFS controller was introduced in Linux 5.14. It helps with bursty parallel workloads, which can get throttled even when their average utilization is under quota. They may also be latency sensitive, so throttling them is undesirable.

This feature borrows time now against future underrun, at the cost of increased interference with other system users. It introduces cfs_burst_us into CFS bandwidth control to cap the accumulation of unused bandwidth, which can then be used additionally for bursts.
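To make the accumulation rule concrete, here is a minimal sketch in Python of how the burst cap could behave period by period. It is a simplified model, not the kernel code: it assumes each period's refill adds the quota to whatever runtime is left over and caps the total at quota + burst, and it ignores per-CPU slices and any carry-over of unmet demand. The function name `simulate` and the numbers in the usage note are made up for illustration.

```python
def simulate(quota_us, burst_us, demand_us):
    """Simplified model of CFS bandwidth control with burst.

    Returns one (ran_us, throttled) tuple per period.
    Assumption: refill is min(leftover + quota, quota + burst).
    """
    runtime = quota_us  # budget available in the first period
    results = []
    for want in demand_us:
        ran = min(want, runtime)          # cannot run past the budget
        results.append((ran, want > runtime))
        runtime -= ran                    # unused budget is carried over...
        # ...but the carried-over total is capped at quota + burst.
        runtime = min(runtime + quota_us, quota_us + burst_us)
    return results


if __name__ == "__main__":
    demand = [20_000, 20_000, 140_000, 140_000]
    print(simulate(100_000, 50_000, demand))  # with burst
    print(simulate(100_000, 0, demand))       # burst disabled
```

In this model, with a 100 ms quota and a 50 ms burst, two quiet periods let a 140 ms spike run unthrottled, while a second consecutive spike is throttled after 110 ms. With burst set to 0, both spikes are throttled at the 100 ms quota.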

This patch adds support for controlling the CFS bandwidth burst.

Fixes https://github.com/opencontainers/runtime-spec/issues/1119

Signed-off-by: Kailun Qin [email protected]

kailun-qin avatar Aug 11 '21 03:08 kailun-qin

Dear spec maintainers @crosbymichael @cyphar @dqminh @giuseppe @hqhq @mrunalp @tianon @vbatts,

I have received this feature request several times from runtime users at different organizations. Would you please take a look at this one so that we can move forward (see also https://github.com/opencontainers/runc/pull/3205)?

Many thanks!

kailun-qin avatar Feb 28 '22 08:02 kailun-qin

~~In my opinion, I think cpu_burst mode is a cgroupv2 only feature. Is there any compatibility issue if we change all schema that is used in both cgroupv1 and v2?~~

Zheaoli avatar Mar 02 '22 07:03 Zheaoli

> In my opinion, I think cpu_burst mode is a cgroupv2 only feature. Is there any compatibility issue if we change all schema that is used in both cgroupv1 and v2?

I see it is present with cgroup v1 too:

    # ls /sys/fs/cgroup/cpu/cpu.cfs_burst_us
    /sys/fs/cgroup/cpu/cpu.cfs_burst_us

If it was a cgroup v2 only feature then we could just use the unified map.
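For anyone who wants to probe for the feature instead of reasoning about kernel versions, a small sketch in Python follows. It assumes the knob is named `cpu.cfs_burst_us` on cgroup v1 (as in the listing above) and `cpu.max.burst` on cgroup v2. `find_burst_file` is a hypothetical helper, not part of runc or the spec.

```python
from pathlib import Path

# Assumed knob names: cgroup v1 uses cpu.cfs_burst_us (seen above);
# cgroup v2 is assumed to expose cpu.max.burst.
BURST_FILES = ("cpu.cfs_burst_us", "cpu.max.burst")


def find_burst_file(cgroup_dir):
    """Return the path of the burst knob in cgroup_dir, or None if absent."""
    base = Path(cgroup_dir)
    for name in BURST_FILES:
        candidate = base / name
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Paths are examples; the actual mount points depend on the host.
    for d in ("/sys/fs/cgroup/cpu", "/sys/fs/cgroup"):
        print(d, "->", find_burst_file(d))
```

A result of `None` would suggest that the kernel predates burst support or that the CPU controller is not enabled for that cgroup.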

giuseppe avatar Mar 02 '22 08:03 giuseppe

> > In my opinion, I think cpu_burst mode is a cgroupv2 only feature. Is there any compatibility issue if we change all schema that is used in both cgroupv1 and v2?
>
> I see it is present with cgroup v1 too:

Yes, it supports both cgroup v1 and v2 IMO.

@Zheaoli Any special reason for considering it as cgroup v2 only?

kailun-qin avatar Mar 02 '22 08:03 kailun-qin

Sorry for the incorrect information; I misremembered.

I have confirmed that burst mode supports both v1 and v2 since Linux 5.14. FYI: https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html

Zheaoli avatar Mar 02 '22 09:03 Zheaoli

~~However, maybe we should consider the kernel version? The latest LTS kernel is 5.10, which does not support burst mode (people may use 5.10 with the newest version of runc). Maybe we should mark burst mode as an experimental feature until Linux 5.15 has been released.~~

Zheaoli avatar Mar 02 '22 09:03 Zheaoli

> However, maybe we should consider the kernel version? The latest LTS kernel is 5.10, which does not support burst mode (people may use 5.10 with the newest version of runc). Maybe we should mark burst mode as an experimental feature until Linux 5.15 has been released.

The 5.15 kernel was released last year, and has already been marked as "LTS" for quite some time now. I do not understand why you are not using it now already :)

gregkh avatar Mar 02 '22 10:03 gregkh

> The 5.15 kernel was released last year, and has already been marked as "LTS" for quite some time now. I do not understand why you are not using it now already :)

Many of our machines are running 5.10, not 5.15. We can upgrade runc easily, but upgrading the kernel is much harder. (And my bad again: 5.15 was released on 2021-10-31.)

Zheaoli avatar Mar 02 '22 10:03 Zheaoli

The spec LGTM now.

Personally, I would be very glad to see this spec change accepted ASAP, because it is very useful for performance-sensitive workloads.

Zheaoli avatar Mar 02 '22 13:03 Zheaoli

Hello all, would you mind telling me if this PR is mergeable?

Zheaoli avatar Mar 09 '22 03:03 Zheaoli

Perhaps it would help me (and other spec reviewers) if we level-set on how the feature works. Here's my current understanding -- is this accurate?

there's at most quota+burst in any one given period, and any use of burst in a period counts against a future period's available quota

tianon avatar Mar 09 '22 23:03 tianon

Ping? Is there any update about this issue?

Zheaoli avatar Apr 22 '22 12:04 Zheaoli

ping

Zheaoli avatar Jun 19 '22 12:06 Zheaoli

> Perhaps it would help me (and other spec reviewers) if we level-set on how the feature works. Here's my current understanding -- is this accurate?
>
> there's at most quota+burst in any one given period, and any use of burst in a period counts against a future period's available quota

@tianon Please kindly take another look and see if it addresses the question. Thanks!

kailun-qin avatar Jun 20 '22 10:06 kailun-qin

@tianon ping.

Hello guys, is there any blocking issue with this pull request?

Zheaoli avatar Aug 18 '22 15:08 Zheaoli

> LGTM, but please consider squashing commits

Squashed and rebased, PTAL, thanks!

kailun-qin avatar Sep 02 '22 05:09 kailun-qin

@tianon Hi, is there any potential blocking issue for this PR?

Zheaoli avatar Sep 02 '22 13:09 Zheaoli

@tianon ping...

Zheaoli avatar Sep 12 '22 11:09 Zheaoli

Can we merge this? Looks ready...

kaffarell avatar Oct 02 '22 09:10 kaffarell

ping... I think it's ready for merge

Zheaoli avatar Nov 13 '22 09:11 Zheaoli

ping

Zheaoli avatar Dec 10 '22 03:12 Zheaoli

Is there any chance to merge this PR in 2022? (A lot of people really need this.)

Zheaoli avatar Dec 23 '22 12:12 Zheaoli

ping, ping, ping...

Zheaoli avatar Jan 20 '23 09:01 Zheaoli

There are two LGTMs from people who have write access to this repo. Do we need more?

Zheaoli avatar Jan 23 '23 10:01 Zheaoli