runc icon indicating copy to clipboard operation
runc copied to clipboard

Recursively thaw freezer subcgroups during SIGKILL

Open percontation opened this issue 6 years ago • 29 comments

Fixes #2105

This PR is offered as an example of what I think would work. Unfortunately, it's kinda breaking some abstractions, and someone who knows the runC codebase could probably reorganize the code to better accommodate the additional code.

percontation avatar Aug 15 '19 07:08 percontation

So, this change could be refactored to not add ThawAll() as a method everywhere, and instead use the normal Set mechanism with a new configs.ThawedRecursive enum.

I think this would be less invasive and still fix the bug? Feedback on what would be preferred is welcome.

percontation avatar Aug 22 '19 22:08 percontation

@mrunalp any feedback? Happy to refactor this but want to know your thoughts first. Thanks!

mruck avatar Aug 29 '19 23:08 mruck

@hqhq any thoughts?

mruck avatar Sep 06 '19 22:09 mruck

I don't think runc is quite robust for nested containers, but for this fix, the idea looks good to me, child container should not block the exit of father container.

hqhq avatar Sep 07 '19 15:09 hqhq

Thanks for the feedback @hqhq! As @percontation mentioned above, instead of adding ThawAll() as a method everywhere, we could use the normal Set mechanism with a new configs.ThawedRecursive enum. Or did you have something else in mind? Just want to figure out how this can be improved so it can be merged in.

mruck avatar Sep 09 '19 19:09 mruck

@percontation @mruck Current change looks good to me, just needs rebase, thanks.

hqhq avatar Sep 10 '19 11:09 hqhq

@hqhq rebase done, let me know if there's anything else!

mruck avatar Sep 10 '19 22:09 mruck

LGTM

@mrunalp @crosbymichael Is it OK that we only add functionalities on legacy cgroup now?

Approved with PullApprove

hqhq avatar Sep 11 '19 01:09 hqhq

~It looks like the new cgroups objects have had the corresponding changes made for them as well here. (Not totally sure since I'm unfamiliar with the new cgroups stuff.) @mruck can you confirm?~ err wait maybe not quite.

percontation avatar Sep 11 '19 01:09 percontation

Updated so the change applies to cgroupsv2

mruck avatar Sep 11 '19 23:09 mruck

@hqhq let me know if anything else needs to be updated.

mruck avatar Sep 13 '19 20:09 mruck

LGTM

Approved with PullApprove

hqhq avatar Sep 15 '19 05:09 hqhq

Thanks! @mrunalp @crosbymichael any comments?

mruck avatar Sep 15 '19 22:09 mruck

@mrunalp @crosbymichael anything we can do to get this merged in?

mruck avatar Sep 23 '19 22:09 mruck

@mrunalp @crosbymichael hoping to get this merged in since we use this in production and @circosta looks like he ran into something similar

mruck avatar Sep 30 '19 20:09 mruck

My worry with this change is that people can setup sub-cgroups that maybe related to the container but not bound to its lifecycle and this would be a breaking change if we recursed down the tree and changed settings.

What usecase are you solving and how are you getting yourself into the position that is causing the error?

crosbymichael avatar Oct 01 '19 15:10 crosbymichael

My worry with this change is that people can setup sub-cgroups that maybe related to the container but not bound to its lifecycle and this would be a breaking change if we recursed down the tree and changed settings.

Yes, the recursive calls only apply in the SIGKILL case of container shutdown. So, this patch doesn't cause any recursive "change of settings" of any process that runC wasn't about to send a SIGKILL at anyways.

What usecase are you solving and how are you getting yourself into the position that is causing the error?

We have an application that uses runC containers. Attempting to run this application on kubernetes-based infrastructure means that we're running nested runC. As described in #2105 running nested runC can result in the outer runC deadlocking as it attempts to exit (and this extends to cases beyond just nested runC; it applies to any runC container containing an application that wants to make use of freezer cgroups).

percontation avatar Oct 01 '19 20:10 percontation

@percontation so this would be in the case of killing the pod container while another container ( inside the pod) is still frozen?

crosbymichael avatar Oct 01 '19 20:10 crosbymichael

Yes. (Although, in our case, we're not intentionally suspending containers. The temporary freezing and unfreezing that runC does while sending signals is enough to cause the problem to exhibit from time to time.)

percontation avatar Oct 01 '19 20:10 percontation

@crosbymichael is there anything else we can do to help so this gets merged?

mruck avatar Oct 14 '19 20:10 mruck

@mrunalp @cyphar can you help review this one as well?

crosbymichael avatar Oct 23 '19 15:10 crosbymichael

@mrunalp @cyphar can we get your feedback? really hoping to get this upstreamed since we are running into this in production on k8s

mruck avatar Nov 05 '19 20:11 mruck

@crosbymichael any chance you can help me get the attention of @mrunalp @cyphar?

mruck avatar Nov 18 '19 20:11 mruck

@mruck can you give this one more shot by rebasing?

h-vetinari avatar Mar 10 '20 07:03 h-vetinari

Rebases have not been entirely trivial since underlying cgroupsv2 code keeps changing, so it's probably not worth the effort until there's enough interest in reviewing the bugfix anyways ¯\(ツ)

percontation avatar Mar 11 '20 00:03 percontation

On the review side at least, there's one more new reviewer: @AkihiroSuda. He might be willing to give this a shot (in addition to the LGTM from @hqhq; will have to be reconfirmed).

h-vetinari avatar Mar 11 '20 00:03 h-vetinari

Ah, cool. Yeah, if things actually start moving here I'm happy to deal with the conflicts.

percontation avatar Mar 11 '20 00:03 percontation

I think we can take this in after 1.0 release. Nothing bad in thawing a container which we're going to kill anyway :)

kolyshkin avatar Feb 04 '21 23:02 kolyshkin

@percontation PTAL ^^^ (and this needs a rebase)

kolyshkin avatar Apr 08 '21 17:04 kolyshkin