Fix the cgroup 2 process attaching problem

Open kbfu opened this issue 1 year ago • 4 comments

What this PR does / why we need it: Fixes a failure when attaching a process to another cgroup on hosts that use cgroup v2.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes https://github.com/litmuschaos/litmus/issues/3902

Special notes for your reviewer:

Checklist:

  • [x] Fixes #litmus/issues/3902
  • [ ] PR message has documentation-related information
  • [ ] Labelled this PR & related issue with breaking-changes tag
  • [ ] PR message has breaking-changes-related information
  • [ ] Labelled this PR & related issue with requires-upgrade tag
  • [ ] PR message has upgrade-related information
  • [ ] Commit has unit tests
  • [ ] Commit has integration tests
  • [ ] E2E run required for the changes

kbfu, Dec 16 '23 11:12

Hello @kbfu, thank you for your contribution through the pull request.

I would like to inquire about the specific cluster environment and container runtime where you have conducted your tests. We encountered an issue when running it on a GKE cluster with containerd and cgroupv2. Here's the error we observed:

could not get cgroup manager
--- at /litmus-go/chaoslib/litmus/stress-chaos/helper/stress-helper.go:134 (prepareStressChaos) ---
Caused by: Error in getting groupPath,nsenter: unrecognized option: C
BusyBox v1.35.0 (2022-08-01 15:14:44 UTC) multi-call binary.

Usage: nsenter [OPTIONS] [PROG ARGS]

    -t PID                   Target process to get namespaces from
    -m[FILE]                 Enter mount namespace
    -u[FILE]                 Enter UTS namespace (hostname etc)
    -i[FILE]                 Enter System V IPC namespace
    -n[FILE]                 Enter network namespace
    -p[FILE]                 Enter pid namespace
    -U[FILE]                 Enter user namespace
    -S UID                   Set uid in entered namespace
    -G GID                   Set gid in entered namespace
    --preserve-credentials   Don't touch uids or gids
    -r[DIR]                  Set root directory
    -w[DIR]                  Set working directory
    -F                       Don't fork before exec'ing PROG
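The error shows that the helper passes `-C` to nsenter while resolving the cgroup path, and the BusyBox applet does not list that option in its usage text. In util-linux, `-C` enters the target's cgroup namespace. As background, here is a minimal sketch of how a cgroup v2 path can be read from the contents of `/proc/<pid>/cgroup`, where the unified hierarchy appears as a line of the form `0::<path>`. The function name `cgroupV2Path` is hypothetical, and this is not necessarily how stress-helper.go obtains the path.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
	"strings"
)

// cgroupV2Path (hypothetical helper) extracts the cgroup v2 path from the
// contents of a /proc/<pid>/cgroup file. Each line has the form
// "hierarchy-ID:controller-list:path"; the v2 entry has ID 0 and an empty
// controller list.
func cgroupV2Path(procCgroup string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(procCgroup))
	for scanner.Scan() {
		// SplitN with a limit of 3 keeps any colons inside the path intact.
		parts := strings.SplitN(scanner.Text(), ":", 3)
		if len(parts) == 3 && parts[0] == "0" && parts[1] == "" {
			return parts[2], nil
		}
	}
	return "", errors.New("no cgroup v2 entry found")
}

func main() {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("could not read /proc/self/cgroup:", err)
		return
	}
	path, err := cgroupV2Path(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(path)
}
```

Note that the path reported in this file is relative to the cgroup namespace of the process reading it, which is one reason a tool might enter the target's cgroup namespace before reading.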

uditgaurav, Jan 03 '24 06:01

Hi @uditgaurav, I rebuilt the image and switched the base image from Alpine to Debian. I believe the nsenter command from BusyBox was outdated. This is the version I am using now: nsenter from util-linux 2.38.1

kbfu, Jan 04 '24 03:01

@kbfu, Thanks for your response. I'm wondering if we can integrate this capability within the Alpine-based image itself, as this would help in maintaining a smaller image size.

The corresponding version of the util-linux package in Alpine is 2.38-r1.

For your reference, the experimental Dockerfile is located here: litmus-go Dockerfile. It uses the base image litmuschaos/experiment-alpine, sourced from this Dockerfile. Perhaps we can consider adding the required functionality to that base Dockerfile.

uditgaurav, Jan 05 '24 07:01

Hi @kbfu, I've created a test experiment image using the same Alpine-based image, which includes package util-linux 2.38-r1. The image can be found at docker.io/uditgaurav/go-runner:stress.

Current Output:

/ # nsenter --version
nsenter from util-linux 2.38

Previous Output:

~ $ nsenter --version
nsenter: unrecognized option: version
BusyBox v1.35.0 (2022-08-01 15:14:44 UTC) multi-call binary.
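The two outputs above suggest a simple way to tell the implementations apart at runtime. The following is a rough sketch, not part of litmus-go, that classifies the text printed by `nsenter --version`; the function name `isUtilLinuxNsenter` is hypothetical, and the check relies only on the "util-linux" string seen in the output above.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// isUtilLinuxNsenter (hypothetical helper) reports whether the given
// `nsenter --version` output comes from the util-linux implementation.
// BusyBox rejects --version and prints its usage banner instead.
func isUtilLinuxNsenter(versionOutput string) bool {
	return strings.Contains(versionOutput, "util-linux")
}

func main() {
	// CombinedOutput captures stderr too, where BusyBox writes its error text.
	// The exit status is ignored on purpose: BusyBox exits non-zero here.
	out, err := exec.Command("nsenter", "--version").CombinedOutput()
	if len(out) == 0 && err != nil {
		fmt.Println("could not run nsenter:", err)
		return
	}
	if isUtilLinuxNsenter(string(out)) {
		fmt.Println("util-linux nsenter detected")
	} else {
		fmt.Println("nsenter is not from util-linux (possibly BusyBox)")
	}
}
```

A check like this could run at helper start-up to fail fast with a clear message instead of surfacing the BusyBox usage text.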

It works well with containerd + cgroupv2 🙌. Moving forward, I plan to conduct further tests under these scenarios:

  • [ ] containerd + cgroupv1
  • [x] docker + cgroupv2
  • [x] docker + cgroupv1
  • [x] Recursive experiment execution in both parallel and serial formats

Tagging @ispeakc0de for any suggestions on additional tests or use cases for nsenter.

uditgaurav, Jan 05 '24 09:01