
Library

crackcomm opened this issue • 18 comments

Can OCI containers be spawned using railcar as a rust library?

crackcomm avatar Jul 03 '17 19:07 crackcomm

No, it isn't configured to be built as a library, unfortunately. It could be converted to work that way.

vishvananda avatar Jul 04 '17 03:07 vishvananda

If you want to go this way @vishvananda (which I think would be awesome), I would recommend against the libcontainer model.

After talking to some people who wanted to use runc as a library, the best way to match everyone's use cases is to make the API take the config.json structure directly -- and to remove the spec restrictions on root.path (which means callers can use whatever rootfs path they like without writing the file to disk).

Don't make the same mistakes we did; only expose OCI APIs. :wink:

cyphar avatar Jul 06 '17 04:07 cyphar
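
A minimal sketch of the spec-first API cyphar is suggesting, assuming a hypothetical library crate with serde; the `Spec`/`run` names are illustrative, not railcar's actual API:

```rust
// Hypothetical library surface: the caller hands over the parsed
// config.json structure directly -- no bundle directory on disk, and
// no restriction that root.path live inside one.
use std::path::PathBuf;

/// Mirrors the OCI runtime-spec config.json (heavily trimmed).
#[derive(serde::Deserialize)]
pub struct Spec {
    pub root: Root,
    pub process: Process,
}

#[derive(serde::Deserialize)]
pub struct Root {
    /// Any rootfs path the caller likes.
    pub path: PathBuf,
}

#[derive(serde::Deserialize)]
pub struct Process {
    pub args: Vec<String>,
}

pub struct Container; // pid, cgroup handles, etc. would live here

/// Create and start a container straight from the spec structure.
pub fn run(_id: &str, _spec: &Spec) -> std::io::Result<Container> {
    // namespace setup, pivot_root into spec.root.path, exec -- elided
    unimplemented!()
}
```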

I agree. The conversion from the OCI spec to libcontainer in runc is ugly :)

vishvananda avatar Jul 06 '17 04:07 vishvananda

Yup, but the root.path thing is more important than you might think on first reading. The root.path restrictions (and the whole "bundle" concept) make the API far more stateful than most people would like.

cyphar avatar Jul 06 '17 04:07 cyphar

could you explain what the root.path issue is, @cyphar?

just coming in here from a different angle. We don't want to write anything to disk, so is it possible to feed the config.json through stdin?

aep avatar Jul 18 '17 11:07 aep
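
What aep is asking for might look like this, reusing the hypothetical `Spec`/`run` sketch above and assuming serde_json; the spec never touches disk:

```rust
use std::io::Read;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read the whole config.json from stdin instead of a bundle dir.
    let mut raw = String::new();
    std::io::stdin().read_to_string(&mut raw)?;

    // Parse straight into the spec structure and hand it to the
    // hypothetical library entry point -- nothing is written to disk.
    let spec: Spec = serde_json::from_str(&raw)?;
    let _container = run("my-container", &spec)?;
    Ok(())
}
```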

I am not totally sure what the issue is here, but I would like to point out another issue in ACI runtimes. One I specifically like is dgr, but collecting files under root.path adds up to 3 seconds on top of a 300 millisecond average startup time.

Another issue I found with container runtimes like runc was mounting images from the /ipfs file system. The important issue here is file permissions, but it is indeed an interesting concept.

I do not have enough experience with file systems, but I am sure it is possible to create a copy-on-write file system for a container with minimal overhead and a minimal amount of code. This would be a solution for everyone.

(this comment is off-topic)

crackcomm avatar Jul 18 '17 12:07 crackcomm
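
One common way to get the low-overhead copy-on-write root crackcomm describes is an overlayfs mount; a sketch with the nix crate, where all paths are illustrative:

```rust
use nix::mount::{mount, MsFlags};

fn mount_cow_rootfs() -> nix::Result<()> {
    // Read-only image layer, a writable upper layer, and the scratch
    // workdir that overlayfs requires (all paths are illustrative).
    let opts = "lowerdir=/images/base,upperdir=/containers/c1/upper,workdir=/containers/c1/work";

    // Equivalent to:
    //   mount -t overlay overlay -o <opts> /containers/c1/rootfs
    mount(
        Some("overlay"),
        "/containers/c1/rootfs",
        Some("overlay"),
        MsFlags::empty(),
        Some(opts),
    )
}
```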

@crackcomm i'm not sure i fully understand what you're saying. you're looking into using ipfs for distribution? Does this thing have fuse? (that's what we do). Or just bind mount the hosts ipfs dir to the container root.

aep avatar Jul 18 '17 12:07 aep

The issue was file permissions and that's why it didn't work.

I am not sure railcar is the place for this additional layer. IPFS itself doesn't have much to do with mounted file permissions; as far as I know, they are all hardcoded to one value.

The issue is about creating the trees: file systems for the containers, copy-on-write with permission overrides.

(this comment is off-topic)

crackcomm avatar Jul 18 '17 12:07 crackcomm

@crackcomm ah, and that's why you want to use railcar as a lib instead of calling it from your permission-overwrite thing? Not sure if it makes much of a difference.

have a look at systemd/casync and ostree btw. they're the right communities to discuss image distribution problems with i think.

aep avatar Jul 18 '17 12:07 aep

@aep it just came up in discussion, thanks for those two.

crackcomm avatar Jul 18 '17 12:07 crackcomm

@aep The root.path issue is that there is a restriction in the OCI runtime-spec which states that root.path must be inside the "bundle" that contains the configuration file. That means you have to do a mkdir, write the config, and then do a bind-mount to root.path. These restrictions make it harder to use an OCI runtime as a library. We hit this issue in runc, which is why I wrote my recommendation.

@crackcomm I would also recommend looking at the OCI image-spec. Currently it doesn't define distribution, but this is an issue that I am going to be working on. While @aep's mention of ostree and casync is cool, I feel that those implementations are not going to be very useful if we intend to standardise the image storage and distribution schemas. It's quite disappointing that the systemd folks completely ignored the existence of the OCI when creating their own NIH distribution/storage scheme.

cyphar avatar Jul 18 '17 12:07 cyphar
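
The statefulness cyphar describes translates into roughly this dance on the caller's side; a sketch using std::fs and the nix crate, with all paths illustrative:

```rust
use nix::mount::{mount, MsFlags};
use std::fs;

fn prepare_bundle(
    bundle: &str,
    rootfs_src: &str,
    config_json: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    // 1. mkdir the bundle and the rootfs directory inside it.
    let root = format!("{}/rootfs", bundle);
    fs::create_dir_all(&root)?;

    // 2. Write config.json into the bundle -- state on disk that a
    //    pure-library API would not need.
    fs::write(format!("{}/config.json", bundle), config_json)?;

    // 3. Bind-mount the real rootfs onto the root.path inside the
    //    bundle, because the spec says root.path must live under it.
    mount(
        Some(rootfs_src),
        root.as_str(),
        None::<&str>,
        MsFlags::MS_BIND,
        None::<&str>,
    )?;
    Ok(())
}
```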

@cyphar thanks for the explanation. Not sure if it's intended, but railcar lets me specify a path outside the bundle for root, which will be on something like cafs for us. Now all I'd need is to feed the config through stdin and store the state somewhere else. Although I just realized we might make our lives easier by moving the bundle dir to tmpfs.

does the OCI image-spec actually specify distribution? that would be odd, since storage and content distribution are a complex and diverse topic. We for example are literally running on toasters. There's no way we could do something like actually unpacking to a disk.

aep avatar Jul 18 '17 12:07 aep
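
Moving the bundle dir to tmpfs, as aep suggests, is one more mount before the bundle dance above; a sketch with the nix crate (mount point and size are illustrative):

```rust
use nix::mount::{mount, MsFlags};

fn bundle_dir_on_tmpfs() -> nix::Result<()> {
    // Back the bundle directory with RAM so config.json and the
    // runtime state never touch persistent storage.
    mount(
        Some("tmpfs"),
        "/run/railcar/bundles",
        Some("tmpfs"),
        MsFlags::MS_NOSUID | MsFlags::MS_NODEV,
        Some("size=16m"),
    )
}
```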

Not sure if it's intended, but railcar lets me specify a path outside the bundle for root, which will be on something like cafs for us.

Heh, that's not really in keeping with the spec, which defines root.path to be a path inside the bundle. Though you could argue it's an extension and thus not breaking spec compatibility.

does the OCI image-spec actually specify distribution?

Not yet. I'm working on an extension to do so (https://github.com/cyphar/parcel) based on ACI, because it's a very large problem for the interoperability of image-spec implementations.

that would be odd, since storage and content distribution is a complex and diverse topic.

Yup, which is why my proposal just adds a way of fetching relevant metadata to get blob URIs that you can download using more traditional means. I'm not in the business of redefining HTTP/FTP/BitTorrent. :wink:

We for example are literally running on toasters. There's no way we could do something like actually unpacking to a disk.

That sounds quite cool. But if you're going to use an NFS-like setup (which is what it sounds like you're describing), there's no reason you cannot have a mount for an image that fetches the blobs as necessary. At the moment, blobs are huge layers that are inefficient, but in the future we are looking at using a more IPFS-like blob structure where you have versioned individual files (or rather, chunks of files). At that point the distribution aspect becomes more like IPFS for your use case, but looks like normal HTTP for everyone else.

At least, that's what I (and some other people in OCI) are currently thinking about.

cyphar avatar Jul 18 '17 13:07 cyphar
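
The chunked, content-addressed storage cyphar alludes to comes down to keying blobs by digest; a toy illustration (not the actual OCI proposal) using the sha2 crate:

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;

/// Toy content-addressed store: chunks keyed by SHA-256 digest, so
/// identical chunks shared across images are stored once and can be
/// fetched lazily by digest over plain HTTP.
#[derive(Default)]
struct ChunkStore {
    chunks: HashMap<String, Vec<u8>>,
}

impl ChunkStore {
    /// Insert a chunk and return its digest -- the only handle a
    /// client needs in order to fetch it later.
    fn put(&mut self, data: &[u8]) -> String {
        let hex: String = Sha256::digest(data)
            .as_slice()
            .iter()
            .map(|b| format!("{:02x}", b))
            .collect();
        let digest = format!("sha256:{}", hex);
        self.chunks
            .entry(digest.clone())
            .or_insert_with(|| data.to_vec());
        digest
    }

    fn get(&self, digest: &str) -> Option<&[u8]> {
        self.chunks.get(digest).map(|v| v.as_slice())
    }
}
```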

Mounting IPFS images is impossible because permissions were discarded as part of the unixfs implementation. I think it can be solved pretty easily. To this day I cannot see a better way to host images, even though my attempts to run them have failed and uploading is still pretty slow compared to Docker.

crackcomm avatar Jul 18 '17 17:07 crackcomm

But if you're going to use an NFS-like setup

FUSE-mounting a compressed in-memory store. I guess you could call that NFS-like in some way.

or rather chunks of files

cool. anything tangible yet i could look at? Efficient storage is a huge deal for us. Wouldn't mind it being standard, but not sure if the absurd amount of efficiency we need is worth bothering for anyone else.

aep avatar Jul 18 '17 20:07 aep

@aep One of the maintainers has been working on something like this, but there's still a lot of discussion that needs to happen in the spec (right now that design is entirely about the storage and not about distribution, but that change will make IPFS-like or NFS-like fetching more useful than linear archives).

As for efficiency, trust me, the efficiency improvements you get from ditching linear archives in the storage schema are absolutely insane. The only reason they are present in the current image-spec is for backwards-compatibility with Docker's images (so that they can be losslessly translated).

cyphar avatar Jul 19 '17 00:07 cyphar

@crackcomm When referring to IPFS I was talking about the way the lazy fetching works, not the current implementation of IPFS. We don't plan to actually use IPFS for images (but I hope that in future you could use IPFS as the distribution scheme for images).

cyphar avatar Jul 19 '17 00:07 cyphar

@cyphar sent you an email to continue discussing that.

aep avatar Jul 20 '17 20:07 aep