Ideas - Feel free to post ideas.

Open utam0k opened this issue 3 years ago • 92 comments

Feel free to post ideas.

utam0k avatar May 17 '21 13:05 utam0k

Hey, this seems like a really cool project, especially as I am interested in both Rust and OS-related topics. I was going through the code, and even though it seems quite understandable, it would be great to have a high-level guide or even comments giving an overview. I saw that the design and implementation section is TBD in the README, and even though I don't have much knowledge of container runtimes, I would like to help write docs or a guide, as it would help me understand it as well. Do you have anything specific in mind for such a guide, or could you open an issue so we can discuss it there? Thanks!

YJDoc2 avatar May 17 '21 14:05 YJDoc2

@YJDoc2 Thanks for the comment! Currently, I have no idea about this yet. If you have any good ideas I'd love to hear them.

> I saw that the design and implementation section is TBD in readme, and even though I don't have much knowledge of container runtime, I would like to help writing docs or guide as it'd help me to understand it as well.

I think it might be a good idea to start with commenting on the code.

utam0k avatar May 17 '21 14:05 utam0k

I think that as the implementation grows, the documentation will become complicated, so rather than adding it to the README, having a dedicated place for it will be better. One way could be adding a repo wiki (like https://github.com/dthain/basekernel/wiki). Another option I can think of is an mdBook like the one for the Rust language, perhaps hosted on GitHub Pages for this repo.

I will try to start commenting and make a PR. Can you point me to any reference links for this? Also, in case of any doubts I'll message in this thread or on Twitter, if you're fine with that.

YJDoc2 avatar May 17 '21 15:05 YJDoc2

It's nice!

> I will try to start commenting and make a PR.

You would use https://docs.rs, right?

> Also, in case of any doubts I'll message in thread or on twitter if you're fine with it.

Of course! I welcome questions from you on Twitter and elsewhere.

utam0k avatar May 17 '21 16:05 utam0k

Hey, as far as I know, docs.rs automatically builds and hosts the HTML documentation from the doc comments in the source when a crate is uploaded to crates.io. I was talking about adding in-source comments to explain structs, fields, functions, etc., using doc comments that follow the conventions in https://doc.rust-lang.org/book/ch14-02-publishing-to-crates-io.html and https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md#appendix-a-full-conventions-text If you had something else in mind, let me know, because I haven't worked specifically with docs.rs before, even though I have written documentation comments for some of my projects.
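As an illustration of the doc-comment convention being discussed, here is a minimal sketch. The `MemoryLimits` type and its fields are hypothetical and are not taken from youki's source; they only show where `///` comments go so that `cargo doc` (and docs.rs) can pick them up.

```rust
/// Resource limits applied to a container's memory cgroup.
///
/// Hypothetical example type: the summary line comes first,
/// followed by a blank `///` line and any longer explanation.
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryLimits {
    /// Hard limit on memory usage, in bytes.
    pub limit_bytes: u64,
    /// Optional soft limit (reservation), in bytes.
    pub reservation_bytes: Option<u64>,
}

impl MemoryLimits {
    /// Creates limits with only the hard limit set.
    pub fn new(limit_bytes: u64) -> Self {
        Self {
            limit_bytes,
            reservation_bytes: None,
        }
    }
}

fn main() {
    let limits = MemoryLimits::new(512 * 1024 * 1024);
    println!("{:?}", limits);
}
```

Running `cargo doc --open` on a crate commented this way renders each `///` block next to the item it documents.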

YJDoc2 avatar May 17 '21 17:05 YJDoc2

@YJDoc2 I feel that docs.rs also automatically generates the descriptions for structs, functions, etc. For example: https://docs.rs/futures/0.3.15/futures/io/struct.AllowStdIo.html This is the source whose comments generate that documentation: https://docs.rs/futures-util/0.3.15/src/futures_util/io/allow_std.rs.html#43

At any rate, there are currently no comments in the code, so adding them would be very meaningful and appreciated.

utam0k avatar May 18 '21 00:05 utam0k

I created https://github.com/utam0k/youki/issues/14

utam0k avatar May 19 '21 00:05 utam0k

I was thinking it might be possible to use async/await to concurrently handle cgroup controller configuration. This might be a pretty minor improvement to performance since these IO operations are relatively small, but might be worth looking into.

tsturzl avatar May 20 '21 23:05 tsturzl

@tsturzl That's a very good idea. I was actually wondering if I could use some of that. According to railcar's article, it seems that creating a cgroup waits for a kernel lock. I'm thinking that using async/await for cgroups might actually improve performance a bit. Are you interested in this?

utam0k avatar May 22 '21 01:05 utam0k

@utam0k I've been reading through some of the runc source while writing the memory controller. It seems like order of writes to certain files within a controller matters in some situations. I believe it might have to do with validation between 2 values in the kernel. See: https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/fs/memory.go#L89

I'd be happy to take a look into it, maybe once the cgroups controllers are all implemented so I don't disrupt anyone's work. I think a good starting point might be to just ensure the order of writes within each controller. That way each controller can be configured concurrently, while writes within a controller still happen in a fixed order to avoid validation issues.
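The idea of "ordered within a controller, concurrent across controllers" can be sketched with plain threads from the standard library. This is only an illustration under assumptions: `ControllerWrites`, `apply_controller`, and `apply_all` are hypothetical names, the files are written under a temporary directory rather than a real cgroup filesystem, and the specific values are made up.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;

/// One controller's settings as (file name, value) pairs,
/// listed in the order they must be written.
struct ControllerWrites {
    name: &'static str,
    writes: Vec<(&'static str, String)>,
}

/// Writes a single controller's files strictly in the listed order.
fn apply_controller(root: &Path, ctrl: &ControllerWrites) -> io::Result<()> {
    let dir = root.join(ctrl.name);
    fs::create_dir_all(&dir)?;
    for (file, value) in &ctrl.writes {
        fs::write(dir.join(file), value)?;
    }
    Ok(())
}

/// Applies all controllers concurrently, one scoped thread per controller.
/// Returns the first error encountered, if any.
fn apply_all(root: &Path, controllers: &[ControllerWrites]) -> io::Result<()> {
    thread::scope(|s| {
        let handles: Vec<_> = controllers
            .iter()
            .map(|c| s.spawn(move || apply_controller(root, c)))
            .collect();
        let mut result = Ok(());
        for h in handles {
            let r = h.join().expect("controller thread panicked");
            if result.is_ok() {
                result = r;
            }
        }
        result
    })
}

fn main() -> io::Result<()> {
    // Assumption: a scratch directory stands in for the cgroup root.
    let root = std::env::temp_dir().join("cgroup-ordering-sketch");
    let controllers = vec![
        ControllerWrites {
            name: "memory",
            writes: vec![
                ("memory.limit_in_bytes", "1048576".to_string()),
                ("memory.memsw.limit_in_bytes", "2097152".to_string()),
            ],
        },
        ControllerWrites {
            name: "cpu",
            writes: vec![("cpu.shares", "512".to_string())],
        },
    ];
    apply_all(&root, &controllers)?;
    println!("applied {} controllers", controllers.len());
    Ok(())
}
```

Whether one thread per controller pays off for such small writes is exactly the open question raised above, so this would need profiling before being adopted.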

tsturzl avatar May 22 '21 04:05 tsturzl

@tsturzl Your ability to read the runc code so quickly is amazing! I'm still getting used to it and can only read what I need little by little. Being able to use async/await is a great advantage of youki being implemented in Rust, and I would love to incorporate it. Can you create an issue about it? I'll assign it to you. And when it's done, I'd love to invite you to be a member of this repository and work with you.

utam0k avatar May 22 '21 06:05 utam0k

IPv6 support. See also #24

stappersg avatar May 24 '21 20:05 stappersg

Should Youki come up with some kind of central objectives or further out design goals? Perhaps it's too soon to tell, but some easier objectives could be just an emphasis on safety and speed. Eventually it might be nice to have some kind of goal beyond just having a workable runtime, since it seems like we're surprisingly close to having a totally functional runtime. Something to think about and discuss maybe?

tsturzl avatar May 25 '21 20:05 tsturzl

@tsturzl I think it's a great time to think about some big final goals. When I first started making it, I started with the same thoughts as those in the railcar release blog. So basically, I think there is a big point in implementing it in Rust, just as there is in the Linux kernel. It's a big advantage that Rust can get past the language-level difficulties of Go and C, which are the languages of the current typical implementations. https://blogs.oracle.com/developers/building-a-container-runtime-in-rust

I would also like to take on the challenge of performance. However, if you have any other good goals I would love to hear them. I think youki right now is a pretty good place to try things out, as it is not yet in the practical stage. I think it is of course possible to offer it as a crate as well as a crun. If you have any interesting challenges, I'd love to hear about them.

utam0k avatar May 26 '21 00:05 utam0k

As one major policy, we would like to consider preparing resources in advance where that is possible, instead of preparing them at container creation time. In particular, I would like to consider whether this can be done with cgroups. For example, we could provide a sub-command for pre-preparation, or prepare some resources at the time of initial creation, and then use the pre-prepared resources whenever they exist.

utam0k avatar May 26 '21 01:05 utam0k

@utam0k I saw this in the blog post for railcar, but I'm really curious how this is done. I wonder if updating a cgroup is less costly than creating a new one, so maybe always keeping a cgroup on standby ready to have the last portion of configuration done on the next container launch. Or perhaps some notion of caching some resources for reuse. It'll be interesting to profile some of these approaches.

tsturzl avatar May 26 '21 04:05 tsturzl

@tsturzl For example, how about starting to create cgroups asynchronously at the beginning of the create command as a first step?

utam0k avatar May 26 '21 04:05 utam0k

@utam0k It could be interesting to make all of youki's IO operations async, and then we could possibly kick off cgroups and the rest of the startup concurrently. With async runtimes like tokio it's possible to do m:n threading, where blocking operations are pushed off to another thread in a thread pool. The question for something like that is whether thread startup defeats the purpose entirely, since the pool won't exist for long or see much reuse of threads, but I like the idea of not blocking any of the work whenever possible.
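To make the "kick off cgroups and the rest of the startup concurrently" idea concrete, here is a minimal sketch using a plain `std::thread` instead of an async runtime. Everything in it is an assumption for illustration: `create_cgroup` and `prepare_rest` are hypothetical stand-ins that only simulate work, and nothing touches the real cgroup filesystem.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Stand-in for cgroup creation; a real runtime would create
/// directories in the cgroup filesystem here.
fn create_cgroup(name: &str) -> io::Result<String> {
    thread::sleep(Duration::from_millis(20)); // simulate waiting on the kernel
    Ok(format!("/sys/fs/cgroup/{name}"))
}

/// Stand-in for the rest of the `create` work that does not
/// depend on the cgroup being ready.
fn prepare_rest() -> io::Result<String> {
    thread::sleep(Duration::from_millis(20)); // simulate other startup work
    Ok("rootfs ready".to_string())
}

/// Starts cgroup creation in the background, does the remaining
/// work on the current thread, then waits for the cgroup result.
fn create_container(name: &str) -> io::Result<(String, String)> {
    let owned = name.to_string();
    let cgroup_task = thread::spawn(move || create_cgroup(&owned));
    let rest = prepare_rest()?;
    let cgroup_path = cgroup_task.join().expect("cgroup thread panicked")?;
    Ok((cgroup_path, rest))
}

fn main() -> io::Result<()> {
    let (cgroup_path, rest) = create_container("demo")?;
    println!("{cgroup_path}: {rest}");
    Ok(())
}
```

The sketch spawns exactly one short-lived thread, which is the case where the concern about thread startup cost applies most directly, so it would be worth measuring against the sequential version.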

tsturzl avatar May 26 '21 05:05 tsturzl

@tsturzl For now, I think it would be better to apply async/await in the current processing order, and then separate the steps of creating and applying cgroups. I think the project in this issue will be an interesting development. If this can be achieved, I think it will be a great feature of youki. https://github.com/utam0k/youki/issues/17

utam0k avatar May 26 '21 05:05 utam0k

@tsturzl Hi! I found the clone3 system call. Its ability to place the new process directly into a cgroup is a very interesting feature and I would love to use it. I'd like to hear your opinions and level of interest. https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.7-clone3-new-cgroup

utam0k avatar Jun 11 '21 11:06 utam0k

@utam0k This seems really useful! I'm under the impression that youki currently forks itself twice, once to essentially create the namespace and once more to act as the init process and handle some of the container startup. I'm not fully read up on how this is all done, but it seems like this could potentially save us from forking out the child process, so we would only fork out the init process. Am I correct on this? If we were to implement this, would we want to support older kernel versions and have the features selected at build time? It seems like with this and some of the things I've discussed with async file operations, we are starting to look towards newer kernel features a lot. It almost begs the question of whether we should consider Youki to be on the bleeding edge and just require a modern kernel to run it, or whether we should build out support for old kernels in addition to some of these new features. It might make sense to focus on supporting the latest and greatest kernel features, since it might take a while for Youki to see any kind of adoption, and by the time that happens it might be more commonplace for people to be running kernel versions that support Youki. Development efforts could be slowed by trying to keep things backwards compatible with old kernel versions.

tsturzl avatar Jun 11 '21 16:06 tsturzl

Frankly, the more I look at supporting async file operations, the more io_uring seems to be the better option. Currently mio supports epoll, but epoll alone doesn't provide the feature set needed for async file operations. So mio, and thus tokio, currently doesn't support any means of doing async file IO. Linux actually has 2 different AIO implementations: one referred to as POSIX AIO, which is apparently not well implemented, and libAIO, which is Linux specific. I believe the latter supports some notion of passing a function pointer into the kernel for a callback. Currently a project implements this for mio: https://github.com/asomers/mio-aio This project is pretty small though, and the tokio community seems to be building up to supporting io-uring. They already have a low-level crate for this, and have a proposal for using it as an optional API for async IO in the future.

So it seems like newer kernel features really seem useful to us. It would also put us out ahead of both runc and crun in terms of efficiency. Maybe this is a discussion worth having, what kernel versions do we want to support?

tsturzl avatar Jun 11 '21 16:06 tsturzl

Getting back to the topic though. We use the nix crate to handle our forks, so perhaps we can make a PR to the nix crate to support this there? The support already seems to be there in libc, since libc is pretty much raw bindings to the system's libs.

tsturzl avatar Jun 11 '21 17:06 tsturzl

@tsturzl Fortunately, there is no demand yet to use youki with older kernels, so I want to support newer Linux kernels as much as possible. Let's leave older kernels to runc. How about targeting the widely used Ubuntu 20.04 as a standard here? That is Linux kernel 5.4. io_uring has been around since 5.1, so let's use it aggressively. clone3 with cgroup support is newer than that since it is from 5.7, but it seems to have a lot of advantages for youki. So I'd like to find a way to support both it and the current fork style.
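One way to support both styles would be to check the running kernel version and choose a code path at runtime. The following is a rough sketch under assumptions: `parse_kernel_release`, `KernelFeatures`, and `detect_features` are hypothetical names, and the thresholds are simply the versions mentioned in this thread (5.1 for io_uring, 5.7 for clone3 with cgroup support). A version check is crude, since distributions backport features and features can be disabled, so probing the syscall itself may be more reliable.

```rust
use std::fs;

/// Parses the leading "major.minor" from a kernel release string
/// such as "5.4.0-74-generic".
fn parse_kernel_release(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.trim().split(|c: char| !c.is_ascii_digit());
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// Features a runtime might switch on, based only on the version number.
#[derive(Debug, PartialEq)]
struct KernelFeatures {
    io_uring: bool,
    clone3_into_cgroup: bool,
}

/// Maps a release string to the feature set assumed to be available.
fn detect_features(release: &str) -> Option<KernelFeatures> {
    let version = parse_kernel_release(release)?;
    Some(KernelFeatures {
        // Tuples compare lexicographically: major first, then minor.
        io_uring: version >= (5, 1),
        clone3_into_cgroup: version >= (5, 7),
    })
}

fn main() {
    // On Linux the running kernel's release string is exposed here.
    let release = fs::read_to_string("/proc/sys/kernel/osrelease")
        .unwrap_or_else(|_| "unknown".to_string());
    match detect_features(&release) {
        Some(features) => println!("{}: {:?}", release.trim(), features),
        None => println!("could not parse kernel release {:?}", release.trim()),
    }
}
```

With the Ubuntu 20.04 kernel discussed here, this would report io_uring as available and the clone3 cgroup path as unavailable, so the existing fork style would be used as the fallback.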

I believe that actively using the latest kernel features is one of the features of youki that could be written in the README.

I would like to make contributions to nix as much as possible.

utam0k avatar Jun 12 '21 00:06 utam0k

@Furisto, please let me know if you have any thoughts on this.

utam0k avatar Jun 12 '21 01:06 utam0k

@utam0k I was actually going to suggest the same! Just track the kernel version of the latest Ubuntu LTS. I think tracking kernel improvements would be useful and a good angle for youki.

tsturzl avatar Jun 12 '21 01:06 tsturzl

@utam0k I've been hacking around with io_uring tonight, and while I think I have a pretty clear path forward here, I think working on it while we're trying to get cgroups finished up is going to be contentious and result in a lot of conflicts that I'll probably spend more time than I'd like resolving. I think my effort now might be best spent trying to push the ball forward on cgroups v2. I think @Furisto did a great job laying the groundwork, but it hasn't yet garnered a lot of attention from contributors.

tsturzl avatar Jun 12 '21 04:06 tsturzl

@tsturzl That's great! Talk to him about joint work and try it. However, I'm excited about this feature and can't wait to see it.

utam0k avatar Jun 12 '21 05:06 utam0k

> @utam0k This seems really useful! I'm under the impression that youki forks itself twice currently, once to essentially create the namespace and another to act as the init process and handle some of the container startup. I'm not fully read up on how this is all done, but it seems like this could potentially save us from forking out the child process and only forking out the init process. Am I correct on this?

You can save one fork by creating the namespaces directly when cloning the init process. See clone(2), e.g. CLONE_NEWPID. The corresponding flags are not exposed via fork. The advantage here is speed and just two processes instead of three. I do the very same clone/fork thing here.
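As a small illustration of the flags involved, the sketch below builds the bitmask that would be passed to clone(2) so the namespaces are created together with the new process. The constants are the values from the Linux headers; the `clone_flags_for` helper and its string names (borrowed from the namespace types in the OCI runtime spec) are hypothetical. The sketch deliberately stops short of calling clone, since that needs the libc or nix crate and suitable privileges.

```rust
// Namespace flags as defined in the Linux <linux/sched.h> header.
const CLONE_NEWNS: u64 = 0x0002_0000; // mount namespace
const CLONE_NEWCGROUP: u64 = 0x0200_0000;
const CLONE_NEWUTS: u64 = 0x0400_0000;
const CLONE_NEWIPC: u64 = 0x0800_0000;
const CLONE_NEWUSER: u64 = 0x1000_0000;
const CLONE_NEWPID: u64 = 0x2000_0000;
const CLONE_NEWNET: u64 = 0x4000_0000;

/// Combines the requested namespaces into one flag mask for clone(2).
/// Returns `None` if a namespace name is not recognised.
fn clone_flags_for(namespaces: &[&str]) -> Option<u64> {
    let mut flags: u64 = 0;
    for ns in namespaces {
        flags |= match *ns {
            "mount" => CLONE_NEWNS,
            "cgroup" => CLONE_NEWCGROUP,
            "uts" => CLONE_NEWUTS,
            "ipc" => CLONE_NEWIPC,
            "user" => CLONE_NEWUSER,
            "pid" => CLONE_NEWPID,
            "network" => CLONE_NEWNET,
            _ => return None,
        };
    }
    Some(flags)
}

fn main() {
    let requested = ["pid", "mount", "uts", "ipc", "network"];
    match clone_flags_for(&requested) {
        Some(flags) => println!("clone flags: {:#x}", flags),
        None => println!("unknown namespace requested"),
    }
}
```

Passing a mask like this when cloning the init process is what removes the need for a separate intermediate process whose only job is to enter the new namespaces.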

flxo avatar Jun 12 '21 12:06 flxo

@utam0k I agree that we should try to make use of the latest kernel features. People that are interested in using youki over more established alternatives are likely to be early adopters in general, so maintaining compatibility with older kernels is less of an issue for us.

Furisto avatar Jun 12 '21 22:06 Furisto