bazel icon indicating copy to clipboard operation
bazel copied to clipboard

[Cgroups] Extended cgroup support

Open AlessandroPatti opened this issue 1 year ago • 9 comments

This extends the cgroup support in bazel to the cpu controller. Limits can be specified with the flag --experimental_sandbox_limits=<resource>=<limit>. Additionally, it is possible to get finer grained resource limits with --experimental_sandbox_enforce_resources=<regex>, which will cause all actions whose mnemonic matches the regex to use their allocated resource as limits, particularly useful for tests where resource allocation is controlled by the resources:*:* tags.

AlessandroPatti avatar Feb 13 '24 08:02 AlessandroPatti

Opened this as a draft to gather feedback before I go ahead and rebase/push tests. We've been using this change for a while now and have a couple more follow prs relative to resource utilization collection, which I'll open if this gets enough traction. cc @larsrc-google and @zhengwei143 who have worked in this area in the past.

AlessandroPatti avatar Feb 13 '24 08:02 AlessandroPatti

I learned that Lars is no longer working in this area, so cc @wilwell as well.

fmeum avatar Feb 19 '24 12:02 fmeum

Overall I do like many aspects in the structuring of this code w.r.t cgroups, but I would like this to eventually be upstreamed into CgroupsInfo and CgroupsInfo{V1,V2}. That'll involve some restructuring of the existing code because the current assumption is that we only care about the memory controller.

I appreciate the effort here in modelling the classes to reflect the different implementations in hierarchy between V1 and V2. Some overall directions to work towards if you're attempting to upstream this into CgroupsInfo (you've already done so in most cases):

  • A single CgroupsInfo should encapsulate all controllers regardless of the implementation (v1 or v2), even if they have different paths.
  • Try to separate as much v1 / v2 cgroups creation logic as possible for future maintainability.
  • We want to be able to handle failures gracefully, ignore errors with a warning; this is quite susceptible to permissions issues, e.g. if the Bazel process were to start up in a root (user) cgroup, we won't have the proper write permissions to create any sub-cgroups. Since this is mainly build performance related, I would rather not break the build due to such issues.

Left some side-comments throughout the code just to start a discussion first (don't need to spend time working on them atm).

zhengwei143 avatar Feb 19 '24 14:02 zhengwei143

Yes, this is opaque to all the users of VirtualCgroup, which see all the controller as if they come from the same (virtual) cgroup :)

That's what I love about your solution :) Really abstracts away the nitty gritty of cgroups from the API.

I think this is already the case, with the exception of the VirtualCgroup class, which has to expose both v1 and v2 cgroups transparently so has to deal a bit with their differences. I can try and break out some if/else, but I don't think it will be possible to break it out much more than that.

W.r.t cgroups creation I think that CgroupsInfo{V1,V2} handles that well. TBH, we don't need the complexity of modelling it after a generic tree, the structure of cgroups hierarchy should be pretty much fixed without callers being able to do anything crazy with it.

I think moving forward, we could work the concept of Controller(s) into each CgroupInfo instance. How does that sound?

Makes sense. I've also faced similar challenges in setting up the right enviroment for this feature to work. I've been considering adding a new startup option that will have the client start the server in the right cgroup (something on the lines of bazel --cgroup-parent /foo/bar` build ...), so that you just need to worry about ensuring that user-writable cgroup exists. WDYT?

I've actually been wanting to do that as well! One idea was using systemd-run --scope --user, but I haven't figured out the case where systemd isnt being used yet.

zhengwei143 avatar Feb 20 '24 11:02 zhengwei143

I think this is already the case, with the exception of the VirtualCgroup class, which has to expose both v1 and v2 cgroups transparently so has to deal a bit with their differences. I can try and break out some if/else, but I don't think it will be possible to break it out much more than that.

W.r.t cgroups creation I think that CgroupsInfo{V1,V2} handles that well. TBH, we don't need the complexity of modelling it after a generic tree, the structure of cgroups hierarchy should be pretty much fixed without callers being able to do anything crazy with it.

I tried to get it working with CgroupsInfo{V1,V2} but the problem is that this already specialises the cgroup for one or the other version. When dealing with multiple controllers, we could have an hybrid system, where some controllers are attached to the v2 hierarchy and other are instead v1, which would not fit this model. It works fine instead with a single VirtualCgroup class that defer the differences to the Controller interface and its implementations. I finally ended up replacing CgroupInfo{,V1,V2} with VirtualCgroup{,Factory} completely and extend it to cover the current logic because much easier than the other way around. I moved more logic into the controllers to handle the difference in v1 and v2, so the only part that still mixed the two version is the creation of the virtual cgroup (iterate over mounts and hierarchies and create the controllers), which I don't think can be improved further.

AlessandroPatti avatar Feb 22 '24 19:02 AlessandroPatti

Will review this next week - have more pressing work upgrading Bazel's Java language level to 21 (which will be very soon!).

Will also need to get this tested internally since we are relying on cgroups features in production but this refactor completely revamps the implementation - so it might take a while. Also considering whether we need to keep the original implementation and guard this behind a flag, will make a decision after I have a closer look at the code.

zhengwei143 avatar Feb 29 '24 10:02 zhengwei143

(I'm working on cleaning this up internally now).

zhengwei143 avatar Apr 10 '24 13:04 zhengwei143

@zhengwei143 do you have an update?

AlessandroPatti avatar May 14 '24 07:05 AlessandroPatti

I managed to clean most things up and am still going through internal review (will need to do more refactoring), that has stalled due to other more urgent work but I hope to pick it up soon. Sorry this has taken so long...

zhengwei143 avatar May 14 '24 08:05 zhengwei143

Update: Will push this through at the start of next week. I was working on replacing nio.file.Path with our internal vfs.Path but encountered some issues that I needed to work through. In the end, we decided it would be fine to submit with this current implementation (along with some clean-ups) and have the nio.file.Path -> vfs.Path migration come in a follow-up PR (just a heads up for you).

zhengwei143 avatar May 24 '24 12:05 zhengwei143

Thank you @zhengwei143!

AlessandroPatti avatar May 29 '24 10:05 AlessandroPatti