runtime-spec
runtime-spec copied to clipboard
Clarify Intel RDT configuration
First change: make it explicit when l3CacheSchema
should have lines with MB:
prefix filtered out. Previously it was unclear if this should be done always or only if memBwSchema
did exist.
Second change: clarify when the subdirectory needs to be removed after the container is finished (not sure if the config spec is the best place for this?).
Does this change require a change in runc behavior?
Yes it does... I don't think runc filters the MB:
-lines at all -- it just concatenates the values. Also for directory removal, runc removes the directory if and only if ClosID is not set. It doesn't take into account whether or not runc has created the directory to begin with.
I have looked at the major container runtime implementations, and only runc is affected. It would be good to have @AkihiroSuda, @kolyshkin, and @cyphar review this.
@giuseppe Just to be sure, crun doesn't implement Intel RDT, does it?
@giuseppe Just to be sure, crun doesn't implement Intel RDT, does it?
correct, it is not implemented in crun
@giuseppe Thanks for your quick response 🙏
I have looked at the major container runtime implementations, and only runc is affected. It would be good to have @AkihiroSuda, @kolyshkin, and @cyphar review this.
@giuseppe Just to be sure, crun doesn't implement Intel RDT, does it?
Thanks for looking at this! I implemented the required runc changes now (in a linked PR).
First change: make it explicit when l3CacheSchema should have lines with MB: prefix filtered out. Previously it was unclear if this should be done always or only if memBwSchema did exist.
This is kind of weird. Runtime is not a tool to groom configs. Why don't we just require that if both l3CacheSchema
and memBwSchema
exist, the first one should not contain lines with MB:
prefix? This would be way easier to implement in config validation (strings.HasPrefix(s, "MB:") || strings.Contains(s, "\nMB:")
) than the filtering.
Alternatively, if the kernel is OK with getting multiple MB:
strings, we can just write both l3CacheSchema
and memBwSchema
and the latter will naturally overwrite the former.
TBH I'm not sure why this is written the way it is now, but I think this may be due to originally L3 cache being the only supported resource (see https://github.com/opencontainers/runc/issues/433). The reason for l3CacheSchema
containing MB:
strings is probably for backwards compatibility during the transitioning from l3CacheSchema
to having both l3CacheSchema
and memBwSchema
.
My target with this PR was to clarify the spec from the implementation point of view, but I fully agree that removing the filtering would make things simpler. I would however lean more towards your first option (erroring out if there are conflicting values). Having the "silent override" would probably confuse users and cause hard-to-debug bugs.
Erroring out in case of conflict would be a breaking change though -- I don't know if there are any workloads which currently set MB:
strings from both variables. @marquiz since the previous wording is from your PR, do you have any insight or opinion into this?
Alternatively, if the kernel is OK with getting multiple
MB:
strings, we can just write bothl3CacheSchema
andmemBwSchema
and the latter will naturally overwrite the former.
I think this should actually be true
@kolyshkin @marquiz I now updated the PR to say that the schemata changes should be applied in order. This is in line with the -"natural override" proposal and also maybe what the user has in mind. This will be a breaking change however, because now we don't filter anything out. If the MB domains set in l3CacheSchema
are different than the domains in memBwSchema
, both will be set.
I now updated the PR to say that the schemata changes should be applied in order. This is in line with the -"natural override" proposal and also maybe what the user has in mind. This will be a breaking change however, because now we don't filter anything out. If the MB domains set in l3CacheSchema are different than the domains in memBwSchema, both will be set.
@ipuustin looking at this now, I'm inclined against a breaking change in the spec. Even though I don't think any runtime actually implements the filtering of data. But LGTM for the additions about directory removal
Related: with a "leave the old mess behind" mentality, I submitted PR #1230.
@kolyshkin PTAL