kyua icon indicating copy to clipboard operation
kyua copied to clipboard

Add FreeBSD Jail execution environment support

Open ihoro opened this issue 1 year ago • 21 comments

The main work and review was conducted here: https://reviews.freebsd.org/D42350. The respective mailing list discussion: https://lists.freebsd.org/archives/freebsd-hackers/2024-February/003032.html.

ihoro avatar Feb 03 '24 16:02 ihoro

I've prepared the simplest conditional build to fix Kyua build for non-FreeBSD: https://github.com/ihoro/kyua/commit/8b50c22f196a93ad1a69b1709cb77b5fdc31534e. Let me know if we would like to have a separate freebsd/ src dir for FreeBSD specifics.

It could be the subsequent PR after this one is merged.

I've tested it with FreeBSD 14 RELEASE and Fedora Server 39.

ihoro avatar Feb 05 '24 12:02 ihoro

There is the second candidate for the next PR: https://github.com/ihoro/kyua/commit/4dffd2650f9bc59d77af32275d8ab73573443322.

It's a bigger patch, but it collects FreeBSD specifics into a separate freebsd/ src dir.

ihoro avatar Feb 05 '24 16:02 ihoro

And there is another commit in the queue to become a PR: https://github.com/ihoro/kyua/commit/0d319defa0bf19460a1baa1be06edd0275833f12. It gets Kyua tests back to the pre-patch state: Test cases: 1472 total, 22 skipped, 0 expected failures, 0 broken, 2 failed

ihoro avatar Feb 14 '24 23:02 ihoro

The following topics are closed by current state of the patch:

  • FreeBSD specifics are separated with conditional build.
  • The existing tests, broken by the change, are fixed.
  • The documentation update is added. kyua.conf.5 and kyuafile.5 man pages got the respective upgrades.

ihoro avatar Mar 22 '24 17:03 ihoro

@ngie-eign, thank you very much for your time. Going to work on the outcome of the review.

ihoro avatar May 05 '24 19:05 ihoro

As much as I'm not a fan of the term "container" because it focuses on the Linux/Solaris technological implementations, I believe that it would be wise to use "container" in lieu of "jail". "jail"s are a concept that are very FreeBSD specific and don't exist outside of FreeBSD/FreeBSD derivatives. I realize this project is hosted in the freebsd GitHub org, but it's a good idea to keep things as generic as possible to avoid having to rearchitect things later if someone contributes an actual Linux or Solaris container implementation (for instance).

Sounds interesting to consider. I have concerns that two levels of abstraction could add extra misunderstanding from end user perspective and extra interface limitations for the possible future execution environments added. The execenv is already meant to be the first abstraction added. It's abstracted from the concepts and details of a specific execution environment, i.e. it does not imply a containerization only. And currently it's designed as a switch, it could be anything (just thinking out loud): execenv=host, execenv=jail, execenv=bhyve, execenv=docker, execenv=kubernetes, execenv=python-virtualenv, execenv=aws, etc. And the key abstraction is that parameters are not generalized, due to it can be hard to do and it may introduce limitations only. For example, jail has a single metadata property -- execenv.jail, what is a string of parameters passed to the jail(8) program. Docker could have multiple properties like execenv.docker.imageurl and execenv.docker.runparams. AWS could have more execenv.aws.* properties to cover authorization and details of the exact service to be used.

There was no intention to provide an automatic containerization feature which uses the local capabilities like jail for FreeBSD or cgroups et al. for Linux. It simply provides a new concept, execution environment, and implements the first non-default (aka host) environment -- FreeBSD jail based one. It's merely an outcome that it can be used to add extra isolation of some FreeBSD tests to improve parallelism of the test suite. A test author is expected to understand the background if jail execution environment is picked for a test.

This is a short elaboration of the current vision. What do you think? Probably I have not caught accurately the idea with execenv=container and execenv.container.params.

ihoro avatar May 06 '24 20:05 ihoro

Anyway, now I think that execenv.jail could be re-worked into execenv.jail.params to keep the "namespace" for possible future improvements of jail execenv with extra execenv.jail.* metadata.

ihoro avatar May 07 '24 07:05 ihoro

@ngie-eign, I'm leaving the "Resolve conversation" button for you. I guess it must be convenient for you.

ihoro avatar May 09 '24 12:05 ihoro

@ngie-eign, I think it's ready for the next round of your review.

ihoro avatar May 13 '24 08:05 ihoro

The latest push introduces the following changes:

  • Rebase and update the code accordingly -- align with libtool usage
  • Rename execenv.jail to execenv.jail.params -- to keep execenv.jail.* open for future improvements
  • Rename execenv engine conf variable to execenvs -- to keep it consistent with the existing Kyua interface
  • Fix autotools based linkage due to recent introduction of new constants in the code

ihoro avatar May 27 '24 12:05 ihoro

I've moved freebsd/* to os/freebsd/. There is no os namespace added, it can be done in future if it's really needed. The main task is to ease work with the history -- by putting the files at the desired location from the very beginning.

ihoro avatar May 27 '24 17:05 ihoro

I agree with Igor's view about the container/jail naming. There are lots of interface differences between jails, cgroups, zones, etc.., and it's premature to generalize from a single backend implementation. If in the future someone shows up with an equivalent backend for a different OS, it may have constraints which mean that a generic "container" engine will need backwards-incompatible changes to support both jails and the new container type. Meanwhile, this feature provides major benefits to FreeBSD src developers today - with very little effort we can use this to speed up network tests substantially.

As a test, I integrated this PR into FreeBSD and tried running the pf tests with and without execenv=jail. In bhyve with parallelism=4, it takes ~960s without this patch (most tests are serialized because of jail name clashes), and ~360s with the patch. All I had to do was add two lines to a makefile. I believe kp@ has seen similar improvements.

@ngie-eign is there anything I can do to help speed up review here? I spent a fair bit of time looking at early iterations of the patch and would quite like to see this feature progress into FreeBSD. Should we perhaps import it directly into FreeBSD and refine it there, then reconcile the two copies of kyua later?

markjdb avatar Jun 04 '24 17:06 markjdb

@ihoro I believe there is some kind of memory leak in the patch. So far I narrowed it down to something in the FreeBSD sbin tests. If I run them with -v parallelism=4, something causes kyua to consume a huge amount of memory, triggering the OOM killer. In particular, nothing there is using execenv=jail. Please let me know if I can help narrow it down further.

markjdb avatar Jun 05 '24 02:06 markjdb

@ihoro I believe there is some kind of memory leak in the patch. So far I narrowed it down to something in the FreeBSD sbin tests. If I run them with -v parallelism=4, something causes kyua to consume a huge amount of memory, triggering the OOM killer. In particular, nothing there is using execenv=jail. Please let me know if I can help narrow it down further.

Thank you for the extra attention and additional testing. I guess there is no kyua report --verbose to share respectively.

I've tried main on aarch64@qemu with kldload pf and have not re-produced it yet with the following result: Test cases: 637 total, 46 skipped, 0 expected failures, 1 broken, 1 failed Working on the amd64@qemu trial.

Any detail of the case's environment would be helpful. Let me guess that probably bricoler has been used here then probably I will have higher chances to reproduce the same if you provide me with the invocation details. It could be an interesting testing of many things at the same time.

ihoro avatar Jun 05 '24 19:06 ihoro

Any detail of the case's environment would be helpful. Let me guess that probably bricoler has been used here then probably I will have higher chances to reproduce the same if you provide me with the invocation details. It could be an interesting testing of many things at the same time.

Please try cloning bricoler, then run make. Then:

$ ./bricoler run freebsd-src-regression-suite-run --param freebsd-src:url=https://github.com/markjdb/freebsd --param freebsd-src:branch=main-kyua-parallel --param freebsd-src-regression-suite-run:hypervisor=bhyve --param freebsd-src-regression-suite-run:tests=sbin

Note you'll need sudo to use bhyve for now (I just configure bhyve and bhyvectl in sudoers for now on my build system). If you omit the hypervisor parameter, it'll use QEMU instead, which also works, but it's slower.

While tests are running, you can ssh into the VM with bricoler run freebsd-src-regression-suite-run ssh. If I do that, I see this in top:

 2045 root        122    0    14G  5226M CPU2     2   0:15 100.00% kyua -v parallelism=4 

There are some sharp corners here, please feel free to report issues or make suggestions. If you can't make progress, email me and we can debug further. Maybe I made a mistake somewhere.

markjdb avatar Jun 05 '24 19:06 markjdb

@ihoro I believe there is some kind of memory leak in the patch.

Hmm, there is something strange going on. It seems to be related to incremental buildworlds that I use by default in my test runner. If I do a clean build, I can't reproduce any problems, but I still don't understand the root cause.

I will keep digging, but please ignore this for now. Sorry for the noise.

markjdb avatar Jun 07 '24 19:06 markjdb

@ihoro I believe there is some kind of memory leak in the patch. I will keep digging, but please ignore this for now. Sorry for the noise.

I'm pretty sure there is a latent bug in kyua somewhere, related to -v parallelization. I've been running the FreeBSD src test suite in a loop with parallel tests, collecting bug fixes (mostly test bugs/hidden assumptions), and I've seen a couple of OOM kills. Again, sorry for the noise, it seems unrelated to this patch.

I would still really like to see this land soon. On my test systems, with -v parallelization=4, the full test suite takes 3-4 hours to run, which isn't terrible but is rather long. I expect that to drop significantly with this patch and a small amount of work.

markjdb avatar Jun 10 '24 23:06 markjdb

I'm pretty sure there is a latent bug in kyua somewhere, related to -v parallelization. I've been running the FreeBSD src test suite in a loop with parallel tests, collecting bug fixes (mostly test bugs/hidden assumptions), and I've seen a couple of OOM kills. Again, sorry for the noise, it seems unrelated to this patch.

Thanks for the additional analysis and the test runs.

Just thinking out loud: probably, we could add an extra logic to bricoler to capture and report such cases, with collecting of kyua core dump and related files if necessary (like kyua and kyua.debug binaries). It could provide us with a clue to build hypothesis of the lurking defect. Kyua is an important tool for the FreeBSD Test Suite, hence it should not be a bad thing to have some logic hard-coded specifically for the kyua test runs.

In addition to that, we could limit the memory of the kyua process with 200MB or something. Just to make it crash sooner without waste of resources and time. Just a quick idea, currently I do not have intuition of normal memory usage spikes kyua may have, I saw something steady like 11MB for the sbin tests.

ihoro avatar Jun 12 '24 07:06 ihoro

I would still really like to see this land soon. On my test systems, with -v parallelization=4, the full test suite takes 3-4 hours to run, which isn't terrible but is rather long. I expect that to drop significantly with this patch and a small amount of work.

From the organizational point of view, my current vision is that only the following topics need closure before the merge:

  1. The comments raised by @ngie-eign . I have a feeling that I covered most of them in a single iteration, and around 10% of them may require another iteration of discussion/change.
  2. The execenv.container idea discussion is not resolved yet. My intuition is that @ngie-eign could accept our vision, i.e. it would not require additional work.

And I propose to move the following topics to the subsequent PRs:

  1. syntax(N) version bump question. I have a proposal to mark such tests as failed/whatever. Having kyua process failure during Kyuafile parsing stage looks less practical, when the whole test suite is rendered unusable. There are no troubles expected from the FreeBSD project perspective, @kprovost and I had a discussion of "MFC dilemma" and it looks like the MFC of the latest Kyua version should cover it, i.e. we won't have to balance between old and new Kyua.
  2. New tests of the Kyua itself to cover the changes.
  3. Kyua version bump. To allow downstreams to sync respectively, e.g. homebrew formula's maintainer could provide the respective upgrade if there is the next official version GA. But I have a feeling of some debt collected so far like Linux support repairing need, etc. This is definitely a separate topic.

ihoro avatar Jun 12 '24 08:06 ihoro

I'm pretty sure there is a latent bug in kyua somewhere, related to -v parallelization. I've been running the FreeBSD src test suite in a loop with parallel tests, collecting bug fixes (mostly test bugs/hidden assumptions), and I've seen a couple of OOM kills. Again, sorry for the noise, it seems unrelated to this patch.

Thanks for the additional analysis and the test runs.

Just thinking out loud: probably, we could add an extra logic to bricoler to capture and report such cases, with collecting of kyua core dump and related files if necessary (like kyua and kyua.debug binaries). It could provide us with a clue to build hypothesis of the lurking defect. Kyua is an important tool for the FreeBSD Test Suite, hence it should not be a bad thing to have some logic hard-coded specifically for the kyua test runs.

Absolutely. In fact, I occasionally see kyua crash with one of several assertions on arm64, also not understood yet. :(

sbin/nvmecontrol/basic:io_passthru  ->  passed  [2.780s]                                                                                                                                                                                                                                                                      
sbin/mdconfig/mdconfig_test:attach_vnode_non_explicit_type  ->  passed  [3.087s]                                                                                                                                                                                                                                              
*** /usr/home/markj/bricoler/freebsd-src/src/contrib/kyua/utils/optional.ipp:175: Precondition check failed: _data != NULL                                                                                                                                                                                                    
*** Fatal signal 6 received

I plan to extend the freebsd-src-regression-suite job to automatically fetch kyua.core and open it in a debugger. Similarly, the VM console watcher will soon be able to automatically detect kernel panics and generate a report of that.

In addition to that, we could limit the memory of the kyua process with 200MB or something. Just to make it crash sooner without waste of resources and time. Just a quick idea, currently I do not have intuition of normal memory usage spikes kyua may have, I saw something steady like 11MB for the sbin tests.

From watching it in top(1), I think there is some specific race condition which triggers a quick blowup of kyua's memory consumption. That is, memory usage is usually below ~12MB, and at some point it quickly starts growing.

markjdb avatar Jun 14 '24 14:06 markjdb

@ngie-eign do you have a vision how we could proceed with the Pull Request? Currently it's aligned with the code committed to the freebsd-src/main. It looks less productive to continue with the related improvements due to this is not the only diff now between freebsd-src:contrib/kyua and kyua.

ihoro avatar Aug 04 '24 10:08 ihoro

Apologies for being so incommunicado. I'll merge this change now to sync freebsd:main and this repo. It would be good to have followup issues for any items pointed out that haven't been resolved in this PR. Thanks!

ngie-eign avatar Sep 08 '24 05:09 ngie-eign

Thanks for your time and volunteer effort. The subsequent PRs should be much smaller and easier to discuss.

ihoro avatar Sep 09 '24 07:09 ihoro