amazon-linux-2023

[Bug] - Changing Scheduler to SCHED_RR not working

Open goznauk opened this issue 2 years ago • 10 comments

Describe the bug: Changing the scheduler to SCHED_RR or SCHED_FIFO is not working. chrt, sched_setscheduler, and CPUSchedulingPolicy=rr in a systemd service all fail with "Operation not permitted". I've tried with sudo, as root, and with sudo setcap cap_sys_nice+ep "$(readlink -f $(which something))", but none of them worked.

To Reproduce

$ sudo chrt --rr 10 ls

This fails with error chrt: failed to set pid 0's policy: Operation not permitted

Expected behavior: I expected that running as root would succeed, but it also fails. Is this a bug?

Additional context: I'm a novice Linux user, so please let me know if I've missed something.

goznauk, Apr 05 '23 01:04

I've found out why. It's because CONFIG_RT_GROUP_SCHED is enabled, unlike in other common general-purpose kernels.
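A quick way to confirm this on a given machine is to look the option up in the kernel config file. The following is a minimal sketch, assuming the distribution ships its config as /boot/config-<release>; the function name `rt_group_sched_state` is hypothetical and not part of any tool.

```shell
# Report whether CONFIG_RT_GROUP_SCHED is enabled in a kernel config file.
# Prints "enabled", "disabled", or "unknown" (file unreadable or option absent).
# Assumption: the config lives at /boot/config-<release> when no path is given.
rt_group_sched_state() {
    config_file="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$config_file" ]; then
        echo "unknown"
    elif grep -q '^CONFIG_RT_GROUP_SCHED=y' "$config_file"; then
        echo "enabled"
    elif grep -q '^# CONFIG_RT_GROUP_SCHED is not set' "$config_file"; then
        echo "disabled"
    else
        echo "unknown"
    fi
}
```

Calling `rt_group_sched_state` with no argument checks the running kernel; passing a path checks a specific config file.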

goznauk, Apr 05 '23 09:04

This is correct, and we are looking into it. There are ways to specify a time allocation for the cgroups created by systemd, but it's cumbersome, and it appears that other distributions disable CONFIG_RT_GROUP_SCHED instead. We're investigating what the best option is for AL2023. We'll either turn this off in a future release or document how to work around it with systemd.

ozbenh, Apr 05 '23 13:04

@ozbenh Thanks for the reply!

The README in the systemd repository says:

We recommend to turn off Real-Time group scheduling in the kernel when using systemd. RT group scheduling effectively makes RT scheduling unavailable for most userspace, since it requires explicit assignment of RT budgets to each unit whose processes making use of RT. As there's no sensible way to assign these budgets automatically this cannot really be fixed, and it's best to disable group scheduling hence. CONFIG_RT_GROUP_SCHED=n

A very naive quick fix for this is to disable RT throttling:

sudo sysctl -w kernel.sched_rt_runtime_us=-1

Is this "safe" as long as I make sure RT tasks never starve the rest of the system, or should I avoid it? I'd be happy with either a "safe" workaround or a future AL2023 build without CONFIG_RT_GROUP_SCHED :)
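To see what the sysctl above changes, it helps to compute the budget the kernel currently gives RT tasks. This is a rough sketch: `rt_share_percent` is a hypothetical helper, and the fallback values are the usual kernel defaults (950000 us of runtime per 1000000 us period), used only when /proc/sys cannot be read.

```shell
# Print the share of each period that RT tasks may use, as a whole percent,
# or "unlimited" when throttling is disabled (runtime of -1).
rt_share_percent() {
    runtime_us="$1"
    period_us="$2"
    if [ "$runtime_us" -eq -1 ]; then
        echo "unlimited"
    else
        echo $(( runtime_us * 100 / period_us ))
    fi
}

# Read the live values; fall back to the usual kernel defaults
# if /proc/sys is not readable in this environment.
runtime=$(cat /proc/sys/kernel/sched_rt_runtime_us 2>/dev/null || echo 950000)
period=$(cat /proc/sys/kernel/sched_rt_period_us 2>/dev/null || echo 1000000)
echo "RT share: $(rt_share_percent "$runtime" "$period")"
```

With the defaults, RT tasks are limited to 95% of each period and the rest is left for non-RT tasks. With -1 that reserve is gone, so a runaway high-priority RT task can monopolize a CPU.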

goznauk, Apr 05 '23 15:04

@ozbenh Hey, do you have any news on it? We faced the same issue.

kuzaxak, Jul 29 '25 16:07

It looks like our kernel team might have dropped the ball on that one. I'll poke internally.

ozbenh, Jul 29 '25 23:07

I just verified: CONFIG_RT_GROUP_SCHED is not set in our current 6.1 kernels. It looks like we disabled it a while back (in mid-2023). Are you still experiencing issues with current AMIs? If so, please tell us more about your specific problem.

ozbenh, Jul 30 '25 10:07

Are you still experiencing issues with current AMIs?

It seems so. Without the fix proposed in this ticket, chrt isn't working on the machine. What kind of info would you need? I can run some tests if needed.

kuzaxak, Aug 07 '25 12:08

@goznauk I will follow up on this reported sched_rt issue.

I tried to reproduce the issue you reported, but I don't see the error you experienced.

  • on AL2023 v6.12
$ grep CONFIG_RT_GROUP_SCHED /boot/config-6.12.37-61.105.amzn2023.x86_64
# CONFIG_RT_GROUP_SCHED is not set

$ uname -r
6.12.37-61.105.amzn2023.x86_64

$ sudo strace chrt --rr 10 ls 2>&1 | grep sched
sched_get_priority_min(SCHED_RR)        = 1
sched_get_priority_max(SCHED_RR)        = 99
sched_setscheduler(0, SCHED_RR, [10])   = 0

  • on AL2023 v6.1
$ uname -r
6.1.131-143.221.amzn2023.x86_64

$ grep CONFIG_RT_GROUP_SCHED /boot/config-6.1.131-143.221.amzn2023.x86_64
# CONFIG_RT_GROUP_SCHED is not set

$ sudo strace chrt --rr 10 ls 2>&1 | grep sched
sched_get_priority_min(SCHED_RR)        = 1
sched_get_priority_max(SCHED_RR)        = 99
sched_setscheduler(0, SCHED_RR, [10])   = 0

Could you add more information about the Amazon Linux instance, e.g. the kernel version (6.1 or 6.12)? Did you customize the Amazon kernel?

The fundamental question is: why do you need the real-time scheduler control group (CONFIG_RT_GROUP_SCHED)? Why can't the control group of the default CFS (Completely Fair Scheduler), CONFIG_FAIR_GROUP_SCHED, meet your use case? Could you give us more details about your specific CPU scheduler use case?

yifei-aws, Aug 07 '25 22:08

In addition, when you ran chrt, was the user in any cgroup?

cat /proc/self/cgroup
0::/user.slice/user-1000.slice/session-1.scope
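When gathering this for several processes, the cgroup v2 path can be pulled out of the file directly. This is a small sketch: `cgroup_v2_path` is a hypothetical helper name, and it relies only on the fact that the unified (v2) hierarchy is listed on the line starting with `0::`.

```shell
# Print the cgroup v2 path from a /proc/<pid>/cgroup style file.
# The unified (v2) hierarchy is the line that starts with "0::";
# cgroup v1 lines (e.g. "12:cpu,cpuacct:/") are ignored.
cgroup_v2_path() {
    cgroup_file="${1:-/proc/self/cgroup}"
    sed -n 's/^0:://p' "$cgroup_file"
}
```

For the output above this prints /user.slice/user-1000.slice/session-1.scope. A process started by a systemd service would typically show a path under /system.slice instead, so passing /proc/<pid>/cgroup for the service's main PID shows which cgroup it was placed in.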

yifei-aws, Aug 07 '25 22:08

OK, I read the originally reported issue again. As I understand it, you are not complaining that a process can't be switched to the real-time scheduler, as in your simple reproducer; you are saying that you can't set a task in a systemd cgroup to the real-time scheduler. FYI, the real-time scheduler is always available; it is not behind any configuration option.

So, can you provide more details about your reproduction steps when using systemd cgroups?
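For reference, a systemd-based reproducer could look roughly like the following. It is only a sketch of the CPUSchedulingPolicy=rr case mentioned in the original report: the function name, the unit description, and the `sleep` command are placeholders, not the reporter's actual setup.

```shell
# Write a minimal unit that asks systemd to start its process under SCHED_RR.
# Everything except the two CPUScheduling* settings is a placeholder.
write_rr_test_unit() {
    unit_path="$1"
    cat > "$unit_path" <<'EOF'
[Unit]
Description=Hypothetical SCHED_RR reproducer

[Service]
Type=simple
CPUSchedulingPolicy=rr
CPUSchedulingPriority=10
ExecStart=/usr/bin/sleep 60
EOF
}
```

After writing the unit to, say, /etc/systemd/system/rr-test.service (a made-up name) and running `systemctl daemon-reload` and `systemctl start rr-test.service`, `chrt -p` on the service's main PID shows the policy it actually received, and `journalctl -u rr-test.service` should show the failure if systemd could not apply the policy.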

yifei-aws, Aug 07 '25 22:08