runc
runc copied to clipboard
Add mon groups for resctrl.
On a system with RDT control features additional directories can be created in the root directory that specify different amounts of each resource. The root and these additional top level directories are referred to as "CTRL_MON" groups. And this is already implemented in runc.
On a system with RDT monitoring the root directory and other top level directories contain a directory named "mon_groups" in which additional directories can be created to monitor subsets of tasks in the CTRL_MON group that is their ancestor. And that PR implements this behavior.
More info: https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
#2519
Signed-off-by: Paweł Szulik [email protected]
there's too much external dependencies for a couple of assert.EqualError calls in a test case. Can you please redo it without using stretchr/testify? I am not against using it in general, but for a small case like this it seems excessive.
sorry, didn't mean to close this one
A commit message with some details about what is being done, links to docs etc might be helpful, too.
@kolyshkin
A commit message with some details about what is being done, links to docs etc might be helpful, too.
This ^^^ is still valid.
CI build failed with following error "FAIL - commit subject exceeds 90 characters". @Creatone Could you fix this?
@kolyshkin @katarzyna-z
@crosbymichael @mrunalp @dqminh @hqhq @cyphar @AkihiroSuda @kolyshkin
Is it possible to review and merge this PR? I want to fix bugs with Resctrl metrics in the cAdvisor project and I just need this one.
@kolyshkin ptal
@kolyshkin
@Creatone Thank you for your effort on this PR. I am sorry I am on vacation now, and I will take a look at this PR soon (from Oct 9).
Conflicting files libcontainer/intelrdt/intelrdt.go update.go
@Creatone Could you help rebase the code? PR #2653 has been merged, you may need to make some change accordingly. Thank you.
But you may need to rebase again in future when PR #2643 is merged. My apologies. 😃
I'll take a look at it tomorrow
@Creatone
For this PR, I have some suggestions for your reference. The addition information may help other reviewers understand the code better:
- How about adding use case information in PR description or in runc documents?
- What does this feature offer for runc or OCI compatible upper layer? (container metrics: LLC occupancy, memory bandwidth)
- How to support this feature? runc command or config file?
- How to configure
groupType(control or monitoring group) when starting runc a container? - Any limitation? (adding a mon group under root, doesn't support allocation features at same time. No runc update command support?)
- How about submitting a separate commit to update documents for the proposal?
- Code description in
libcontainer/intelrdt/intelrdt.go libcontainer/SPEC.md(if any config change forlinux.intelRdt)runtime-spec/config-linux.md(if any change in runtime-spec/config)
- Test cases.
@xiaochenshen
- It's needed to introduce Monitoring Groups because of the limits of Control Groups number. If users want only to get monitoring statistics, they leave "l3CacheSchema" and "memBwSchema" empty. If users want to update SPEC, the group would switch automatically.
- Is it necessary to have updated docs in a separate commit?
- Can you check tests now?
@xiaochenshen
- It's needed to introduce Monitoring Groups because of the limits of Control Groups number.
Yes. I agree with you. This is the major improvement in this PR. It adds better support for: (1) Intel RDT monitoring only use case for runc container. (2) Much more monitoring groups could be created for more runc container instances at same time.
Thank you.
If users want only to get monitoring statistics, they leave "l3CacheSchema" and "memBwSchema" empty. If users want to update SPEC, the group would switch automatically.
What do you think if we could add a flag into the config linux.intelRdt to indicate either CTRL_MON group or MON group will be created for the container? Something like: "monitoringOnly": true/false?
- Is it necessary to have updated docs in a separate commit?
It is up to you.
- Can you check tests now?
I need some days to run some tests based on current implementation. But how about considering if we could have a more user-friendly config interface to support both two types of resource group?
What do you think if we could add a flag into the config
linux.intelRdtto indicate either CTRL_MON group or MON group will be created for the container? Something like: "monitoringOnly": true/false?
But how about considering if we could have a more user-friendly config interface to support both two types of resource group?
How about adding just a monitoring flag.
Scenarios:
"linux": {
"intelRdt": {
"monitoring": true
}
}
Create a monitoring group.
"linux": {
"intelRdt": {
"monitoring": true
"l3CacheSchema": "L3:0=7f0;1=1f",
"memBwSchema": "MB:0=20;1=70"
}
}
Create a control group. Monitoring stats available in GetStats function.
"linux": {
"intelRdt": {
"l3CacheSchema": "L3:0=7f0;1=1f",
"memBwSchema": "MB:0=20;1=70"
}
}
Create a control group. No monitoring stats in GetStats function
The update function will switch between groups dynamically.
What do you think if we could add a flag into the config
linux.intelRdtto indicate either CTRL_MON group or MON group will be created for the container? Something like: "monitoringOnly": true/false?But how about considering if we could have a more user-friendly config interface to support both two types of resource group?
How about adding just a
monitoringflag.
@Creatone
This flag in config file looks fine to me.
Do we need to add a new option for runc update command? Something like --enable-intelrdt-monitoring?
@xiaochenshen I mostly did all fixes but I found two scenarios that cause some problems.
I mean when after the first update:
- we have empty l3CacheSchema, memBwSchema and monitoring = false. Then the container should free the Resctrl group.
And after the second update:
- we have any data in l3CacheSchema, or memBwSchema, or monitoring = true -> Then the container should create the Resctrl group.
Do we need to add a new option for
runc updatecommand? Something like--enable-intelrdt-monitoring?
Yes
@xiaochenshen can you review :)?
I need also to update opencontainers/runtime-spec/specs-go/config.go on git.
Waiting for https://github.com/opencontainers/runtime-spec/pull/1076
XXX this is currently a draft/WIP; depends on opencontainers/runtime-spec#1076 being merged (added by @kolyshkin)
Tentatively set the milestone to 1.1.0 so we won't forget.
@kolyshkin Ok, thanks
@Creatone can you please rebase?
Moving to 1.2.0 milestone since https://github.com/opencontainers/runtime-spec/pull/1076 is not yet merged
@kolyshkin Since opencontainers/runtime-spec#1076 is merged, can we get back to 1.1.0 milestone?
Sorry for the force-push, CI had problems with the length of subject of merge commit :/
Commits like "merge master" should not be here. Please rebase properly.
@kolyshkin Thanks for review, PTAL now