
Decide which architectures to support, etc.

Open utam0k opened this issue 3 years ago • 33 comments

  • Distros: which distros, and which versions of them, do we expect to support? Different distros have different system libraries and kernels.
  • Kernel: which minimum kernel version are we supporting? I know we talked about this, but maybe just getting something official and written down somewhere.
  • Architecture: Some architectures might not support certain features, so which architectures should we consider supporting? Obviously we are only really considering x86_64 at the moment, but should we consider things like ARM for embedded, and possibly even others such as MIPS, PowerPC, and RISC-V?
  • OS: Obviously we currently only support Linux, but maybe we can put that in writing and maybe discuss ideas or plans to support other operating systems?

Goal

Determine and describe these in README, etc.

utam0k avatar Oct 07 '21 13:10 utam0k

We should also differentiate short term and long term, perhaps? Limiting what we support in short term can allow us to focus on building new features. For long term, we can be more flexible.

yihuaf avatar Oct 08 '21 02:10 yihuaf

  • Distros: Ubuntu, Debian, and Fedora for the short term; OpenSUSE, RHEL (or CentOS), and Arch for the long term.
  • Kernel: How about 5.4? That is the first LTS version of the 5.x kernel series.
  • Architecture: x86_64. ARM support should be a goal, but not near term.
  • OS: Windows is out of the question because it implements containers completely differently. macOS and the BSDs are also sufficiently different that I believe it would not make sense to support them in youki.

Furisto avatar Oct 08 '21 10:10 Furisto

Sorry, wrong button :sweat_smile:

Furisto avatar Oct 08 '21 10:10 Furisto

  • Distros: Ubuntu/Debian and Fedora for the near term. We will likely want to support CentOS in the future, since it has a large server share as well. But CentOS usually carries an older kernel, so we have to be careful about when to commit to it.
  • Kernel: 5.4 is reasonable. We can also be flexible and bump this later. Looking at LTS versions is a good idea.
  • Architecture: I would focus on x86_64.
  • OS: We should focus on Linux.

yihuaf avatar Oct 08 '21 18:10 yihuaf

It might be worth supporting an embedded focused distro also since I think youki is attractive in that space. Perhaps that should come when we take the time to focus on ARM support?

CentOS and RHEL support kernel version 4.18 at the latest currently I believe. Debian 11 is 5.10. Ubuntu 20.04 is actually 5.11 now. Latest Arch is 5.14. OpenSUSE has a rolling and stable release channel, the Leap 15.3 release is on 5.3 I believe.

I'd have to look more closely at the 4.18 kernel to see whether supporting a kernel that old causes conflicts. Cgroups v2 support is optional and doesn't block builds, but the newer libseccomp versions might not be supported by that kernel, and eBPF might have some conflicts as well, which could cause build issues. It's possible to hide some things like cgv2 and eBPF behind a build flag, especially if we're building binaries for platforms that don't even support cgv2.

I think for libseccomp and libsystemd we might need to consider what our minimum supported version is. It might even be nice to limit our use of C libraries and move some things into Rust. I think for libsystemd we are using one function from the whole lib, and honestly I'd like to see youki not be tied directly to systemd so we can eventually support embedded platforms easily. If we made our own systemd functionality, we could disable it at runtime rather than having a build-time dependency on libsystemd. In any case, I think we can probably implement that single systemd function in pure Rust. The libseccomp dependency is certainly harder to move away from, and I believe it's LGPLv2.1, which is kind of a viral license and makes porting or distributing it tricky.

I agree to focus on x86_64, though I actually don't doubt that youki would work on an aarch64 platform currently. I also think we should focus pretty exclusively on Linux since other Unix-like systems have so many differences that almost no part of youki would really be reusable for a platform like FreeBSD or MacOS.

tsturzl avatar Oct 08 '21 21:10 tsturzl

Does anyone have a RHEL or CentOS machine they can try Youki out on? Otherwise I might try to get a VM set up and see whether supporting the 4.18 kernel is worthwhile. It will definitely limit our ability to embrace new kernel features, which I'd really like to do. Even a 5.4 kernel will set us back quite a bit, since some of the interesting kernel features I've experimented with for youki have only been available since 5.10.

tsturzl avatar Oct 08 '21 21:10 tsturzl

Also, we need to conditionally set up fields depending on which architectures we support. For example, in the State of a container, pid is of data type i32, which is valid for a 32-bit system but can go wrong for 64-bit systems. Similarly, we might need to define the correct type for different architectures.

To check the maximum value of a pid, we can look at /proc/sys/kernel/pid_max, which reports 4194304 on my 64-bit system, well beyond the 32768 for 32-bit platforms. Such incorrect types can cause issues when parsing data from the state or config files.
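As a quick sanity check of the point above, here is a minimal sketch that parses the contents of /proc/sys/kernel/pid_max and reports whether the value fits in an i32. The helper names `parse_pid_max` and `read_pid_max` are made up for illustration and are not taken from youki's code. For reference, 4194304 is 2^22, which is well within the i32 maximum of 2147483647, so the sketch accepts that value.

```rust
use std::fs;
use std::io;

/// Parse the text of /proc/sys/kernel/pid_max into an i32.
/// Returns None if the text is not a number or does not fit in i32.
/// (Hypothetical helper, not part of youki.)
pub fn parse_pid_max(contents: &str) -> Option<i32> {
    contents.trim().parse::<i32>().ok()
}

/// Read the current PID limit from procfs (Linux only).
/// The outer Result reports I/O errors, the inner Option reports parse failures.
pub fn read_pid_max() -> io::Result<Option<i32>> {
    let contents = fs::read_to_string("/proc/sys/kernel/pid_max")?;
    Ok(parse_pid_max(&contents))
}
```

Calling `read_pid_max()` on a Linux host would show whether the configured limit can be represented by the type used in the state file.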

YJDoc2 avatar Oct 12 '21 10:10 YJDoc2

> Also, we need to conditionally set up fields depending on which architectures we support. For example, in the State of a container, pid is of data type i32, which is valid for a 32-bit system but can go wrong for 64-bit systems. Similarly, we might need to define the correct type for different architectures.
>
> To check the maximum value of a pid, we can look at /proc/sys/kernel/pid_max, which reports 4194304 on my 64-bit system, well beyond the 32768 for 32-bit platforms. Such incorrect types can cause issues when parsing data from the state or config files.

This is a good point, and it is also the reason I advocate limiting what we support, to avoid complexity where we can.

yihuaf avatar Oct 13 '21 08:10 yihuaf

In terms of kernel version, libseccomp will likely require us to have a newer kernel. Worst case, we disable seccomp on kernel versions that don't support it? As mentioned before, I don't recommend rewriting the libseccomp logic in pure Rust if we can avoid it. This will be use-case driven as well, tbh. Not all container runtimes focus on security and take advantage of features like seccomp.

yihuaf avatar Oct 13 '21 08:10 yihuaf

@yihuaf @tsturzl @Furisto @YJDoc2 Thanks for all the input. As for OS support, Linux only is fine. However, this should be reconsidered if someone is motivated to support other OSes; I think that would be worth considering.

Let's set the minimum kernel version to 5.4 for now. However, it should be noted that the supported version may be raised if new features using io_uring are introduced. There is a good chance that 4.18 will work, but let's call it unofficial; I don't think it's worth verifying it every time with CI, etc.

As for the distribution, as long as we decide on the version of the Linux kernel, there seems to be no problem. How about not writing anything specific about supported distributions for now? I don't feel that the differences between them will cause any problems at the moment. In summary:

  • Kernel: ≧ 5.4 (may be raised in the future)
  • OS: Linux

I am wondering about the architecture to support.
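To make the "≧ 5.4" rule concrete, here is a minimal sketch of how a release string could be compared against that minimum. The names `MIN_KERNEL`, `parse_kernel_release`, and `is_supported_kernel` are hypothetical and not part of youki. On Linux the release string could be read from /proc/sys/kernel/osrelease, but the sketch only handles the parsing so it stays independent of the host.

```rust
/// Minimum kernel version proposed in this thread, as (major, minor).
/// (Hypothetical constant for illustration.)
pub const MIN_KERNEL: (u32, u32) = (5, 4);

/// Extract (major, minor) from a release string such as "5.10.0-8-amd64".
/// Returns None if either component is missing or not numeric.
pub fn parse_kernel_release(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.trim().split('.');
    let major = parts.next()?.parse::<u32>().ok()?;
    // The minor component may carry a suffix, e.g. "4-rc1" in "5.4-rc1",
    // so keep only its leading digits.
    let minor_digits: String = parts
        .next()?
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let minor = minor_digits.parse::<u32>().ok()?;
    Some((major, minor))
}

/// True if the release meets the minimum. Tuples compare lexicographically,
/// so (5, 10) >= (5, 4) and (4, 18) < (5, 4). Unparseable input is rejected.
pub fn is_supported_kernel(release: &str) -> bool {
    parse_kernel_release(release).map_or(false, |v| v >= MIN_KERNEL)
}
```

With this rule, a 4.18 kernel is reported as unsupported, which matches treating it as "may work, but unofficial".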

utam0k avatar Oct 16 '21 09:10 utam0k

Agree that we should increase the minimum supported kernel version in the future, and we should not be afraid to do so. As far as architecture is concerned, I think we should focus on x86_64 first. I know many may be interested in Arm. Most of what we do is not architecture dependent, so supporting it should not be hard. I think other architectures are a niche, and getting access to hardware for other architectures is hard.

yihuaf avatar Oct 17 '21 03:10 yihuaf

I think we probably already support arm64, but I don't really have a good setup to test that. If we didn't link against C libs it'd be easy to just cross compile. I might actually be able to test this out on an embedded system one of these days, but that said I think we should focus on x86_64 until we are feature complete with runc. I agree with @yihuaf, we probably already support other architectures since we don't really do anything architecture specific. I wouldn't, however, call arm64 a niche platform either; it's being used a lot, and I think Youki might be especially appealing on these embedded or edge computing platforms due to its likely lower footprint.

tsturzl avatar Nov 10 '21 05:11 tsturzl

> I think we probably already support arm64, but I don't really have a good setup to test that. If we didn't link against C libs it'd be easy to just cross compile. I might actually be able to test this out on an embedded system one of these days, but that said I think we should focus on x86_64 until we are feature complete with runc. I agree with @yihuaf, we probably already support other architectures since we don't really do anything architecture specific. I wouldn't, however, call arm64 a niche platform either; it's being used a lot, and I think Youki might be especially appealing on these embedded or edge computing platforms due to its likely lower footprint.

Agree that arm64 is not that niche, and it will become bigger and bigger in the future. To reiterate our decision:

  • Kernel: ≧ 5.4 (may be raised in the future)
  • OS: Linux
  • Architecture: x86_64 (arm64 in the future when we have the use case)

yihuaf avatar Nov 11 '21 05:11 yihuaf

I also think that arm64 should be supported.

utam0k avatar Nov 11 '21 11:11 utam0k

I think this will be fine for the first release. How about you guys?

  • Kernel: ≧ 5.4 (may be raised in the future)
  • OS: Linux
  • Architecture: x86_64 (arm64 in the future when we have the use case)

utam0k avatar Nov 11 '21 11:11 utam0k

@utam0k looks mostly good, but my concern with distros is that some distros meet those criteria and Youki still would not run or compile on them, because they do not have a suitable version of libseccomp, or, in cases like Alpine Linux, there is no systemd and thus no systemd library to link against. So if we don't want to target specific distros, then we should at least know the dependencies and their versions.

I'd like to put libsystemd behind a feature flag so we can choose whether or not we want to compile it into a release. In fact we used to have this, because I used to run a distro without systemd so I added a feature flag, but splitting Youki up into several crates has broken that feature flag and made it difficult to get working again. Maybe this isn't a concern for now, but we should at least specify that we have required dependencies, at least until we have feature flags to make them optional.
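For readers unfamiliar with Cargo feature flags, here is a rough sketch of what gating systemd support could look like. It assumes a feature named `systemd` declared in the crate's Cargo.toml (for example `systemd = []` under `[features]`). The function names and returned strings are hypothetical and do not reflect how youki actually structures this.

```rust
/// Reports whether this build was compiled with the (assumed) `systemd` feature.
pub fn systemd_enabled() -> bool {
    cfg!(feature = "systemd")
}

/// Compiled only when building with `--features systemd`.
/// A real implementation would call into libsystemd or a pure-Rust replacement here.
#[cfg(feature = "systemd")]
pub fn cgroup_manager_name() -> &'static str {
    "systemd"
}

/// Fallback compiled when the feature is off, so the crate still builds
/// on distros that ship no systemd library at all.
#[cfg(not(feature = "systemd"))]
pub fn cgroup_manager_name() -> &'static str {
    "cgroupfs"
}
```

Because the gated function is never compiled when the feature is off, nothing links against libsystemd in that configuration, which is the property that matters on systemd-less distros.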

tsturzl avatar Nov 13 '21 03:11 tsturzl

On a side note. I'd be very interested in taking on the arm64 support once I have some time. I have worked a lot with arm64 boards both in C++ and Rust. We use arm processors a lot at my work, as I work largely with robotics and embedded devices.

tsturzl avatar Nov 13 '21 04:11 tsturzl

We are using containers on the edge in mobile IoT/telematics applications (oats-center/isoblue-avena). This project hits home with that use case, but, unfortunately, often calls for ARM/ARM64.

abalmos avatar Nov 27 '21 14:11 abalmos

@tsturzl I'm sorry for the delayed reply. I think we should solve this problem. Could I ask you to create an issue about this and describe the current status?

> I'd like to put libsystemd behind a feature flag so we can choose whether or not we want to compile it into a release. In fact we used to have this, because I used to run a distro without systemd so I added a feature flag, but splitting Youki up into several crates has broken that feature flag and made it difficult to get working again. Maybe this isn't a concern for now, but we should at least specify that we have required dependencies, at least until we have feature flags to make them optional.

utam0k avatar Nov 28 '21 12:11 utam0k

> We are using containers on the edge in mobile IoT/telematics applications (oats-center/isoblue-avena). This project hits home with that use case, but, unfortunately, often calls for ARM/ARM64.

@abalmos Thanks for your advice. Hmm... I don't have any way to prepare such an environment. Do you have any good ideas?

utam0k avatar Nov 28 '21 12:11 utam0k

@utam0k I tend to use virtual machines as a starting point. I could potentially offer access to some of our devices for testing, but that may not be a long-term solution.

abalmos avatar Nov 30 '21 19:11 abalmos

> @utam0k I tend to use virtual machines as a starting point. I could potentially offer access to some of our devices for testing, but that may not be a long-term solution.

@abalmos Youki probably works on arm/arm64, but I can't guarantee it because we can't confirm it works with CI in its current state :sob:

utam0k avatar Dec 01 '21 11:12 utam0k

Hey, I was looking around for how we can test this. I have found two options, but neither seems very reasonable:

  • GitHub does not allow specifying the architecture in Actions and CI/CD, but it does allow using an external runner for CI jobs. This would require someone to set up an external account on something like Azure, which does allow selecting the architecture; possibly we can do this in its free tier, but I'm not sure. We would then set this up as the job runner for a dedicated ARM job.
  • Use QEMU in CI to emulate ARM and run the tests there. Even though this would work and would let us test multiple ARM-based CPUs, it would be tedious to set up and would have a high overhead. We would basically need to download or cache a lightweight Linux image that matches our required kernel, supports the libs we need for youki, and has the integration test dependencies and Rust (if we want to run unit tests). We would then compile youki for the ARM target and copy the binary into a file used as the hard disk for QEMU (there may be an option to access host system files, but I don't know about that). Finally, we would need to run all the tests and capture the output from QEMU. As said before, this will be very tedious to set up in CI/CD and will take quite a long time to run.

A third option is to set up Travis CI and move all our CI to that. This might allow us to set up something like Bors, as the Rust repo does, but I'm not sure about the cost or how it compares to GitHub CI.

YJDoc2 avatar Dec 01 '21 16:12 YJDoc2

@utam0k I've created #512 to address the libsystemd feature flag.

tsturzl avatar Dec 01 '21 16:12 tsturzl

@YJDoc2 @utam0k

Another option for ARM is something like tiered support. We consider ARM64 tier 2, and maybe ARMv6/7 (32-bit) tier 3. Of course x86_64 would be tier 1, meaning that CI actively tests all code against x86_64 for every PR before it is merged into main, and therefore we guarantee that main will always work in at least a development capacity. For tier 2, we ensure that each actual release is fully tested and built for ARM64, so whoever cuts the release just needs to make sure someone has tested and built for those targets. Tier 3 support would mean we test and build for these targets as time permits; the only support we offer is security updates, and we would otherwise mostly rely on community interest and support to add features to this tier.

Eventually we could move ARM64 into tier 1 support, since Youki is an appealing choice for embedded platforms. This also means we don't have to put too much thought into tooling or immediate CI solutions for ARM64 currently because we don't even have an initial release yet.
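The tier proposal could be written down as a small lookup, sketched below. The `Tier` enum and both functions are hypothetical, and the mapping only mirrors what is suggested in this comment; it is not an agreed policy. The architecture strings are the values Rust exposes through `std::env::consts::ARCH`.

```rust
/// Support tiers as proposed in this comment (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    /// Tested by CI on every PR.
    One,
    /// Tested and built for every release.
    Two,
    /// Tested and built as time permits.
    Three,
}

/// Map an architecture name (as reported by `std::env::consts::ARCH`)
/// to its proposed tier. Architectures not discussed return None.
pub fn support_tier(arch: &str) -> Option<Tier> {
    match arch {
        "x86_64" => Some(Tier::One),
        "aarch64" => Some(Tier::Two),
        "arm" => Some(Tier::Three),
        _ => None,
    }
}

/// Tier of the architecture this binary was compiled for.
pub fn current_tier() -> Option<Tier> {
    support_tier(std::env::consts::ARCH)
}
```

Keeping the mapping in one place would make a later promotion of ARM64 to tier 1 a one-line change.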

Another thing to consider here in terms of platforms we support is that different platforms use different C libraries. For example Alpine Linux, a popular embedded distro, uses musl as its C lib. Currently both x86_64-unknown-linux-musl and aarch64-unknown-linux-musl only have tier 2 Rust support, which may affect our ability to support these platforms in certain cases, especially in CI. Perhaps more important to the discussion: do we want to support musl? Or should we just focus on architecture support for now? I would assume we already support musl, but again we don't know until we actually test it.

tsturzl avatar Dec 01 '21 17:12 tsturzl

I also have experience with QEMU for ARM64. I did something similar to test embedded software written in Rust that was targeted at an ARM64 platform. If that's a route we want to try, I'd be willing to try it out. However, I should mention that running unit tests in CI in an emulator will be terribly slow, because that was exactly my experience with it. It's also very likely that GitHub Actions jobs are already being run in a VM, so it may not work incredibly well. AWS also offers a free tier and has ARM64 instances.

tsturzl avatar Dec 01 '21 17:12 tsturzl

@tsturzl I think a tier based solution with ARM tier 2 is a great solution ... but that still leaves the need for devs to test occasionally and debug issue reports.

I also share the same experience with qemu in Github Actions ... it ends up being a /very/ slow pipeline (we used it through docker buildx) ... I would think just building youki may even exceed the maximum run time (which is what we hit first).

We could use the AWS free tier ARM instance running a GHA self-hosted runner; then no emulation is needed anywhere and things can stay in GHA.

abalmos avatar Dec 02 '21 21:12 abalmos

@abalmos I'd agree that's probably a better solution here. We should probably create a ticket for this, and should probably discuss who creates the AWS resource. I'm wondering if @utam0k should create it since he's an actual containers org member.

tsturzl avatar Dec 02 '21 21:12 tsturzl

@tsturzl @abalmos I was looking at github actions of crun and found this. Maybe we can use it. How about this? https://github.com/containers/crun/blob/main/.github/workflows/test.yaml https://github.com/uraimo/run-on-arch-action

utam0k avatar Dec 02 '21 23:12 utam0k

I think run-on-arch-action is doing what @abalmos was describing with Docker. Would youki's tests run in a Docker container? I don't think the integration tests will run correctly like this. Compilation is really slow in an emulator, so I wonder how well this is working for crun. Maybe it's worth a try?

tsturzl avatar Dec 03 '21 00:12 tsturzl