
File removal in ACI

Open derekchiang opened this issue 10 years ago • 6 comments

This issue should perhaps be put on appc/spec, but I'm temporarily putting it here to reduce noise.

As discussed OOB, we were planning to use pathWhitelist to represent file removal. However, a whitelist is much clumsier than a blacklist for removing files. For instance, if I simply want to remove a single file hello.txt, I would essentially have to add to the whitelist all files and directories that are not hello.txt. This process is not only error-prone, but could also result in a large manifest.
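To make the size difference concrete, here is a minimal sketch. The paths and the helper `whitelist_to_remove` are hypothetical and only illustrate the argument; they are not part of appc/spec or rkt.

```python
def whitelist_to_remove(all_paths, removed):
    """Return the whitelist needed to drop `removed` when only a whitelist exists.

    Hypothetical helper: every path that should survive has to be listed.
    """
    return sorted(p for p in all_paths if p not in removed)


# Made-up image contents, for illustration only.
image_paths = {"/bin/app", "/etc/app.conf", "/hello.txt", "/usr/lib/libfoo.so"}

# Removing one file via a whitelist means enumerating everything else...
whitelist = whitelist_to_remove(image_paths, {"/hello.txt"})

# ...while a blacklist only needs the one entry.
blacklist = ["/hello.txt"]

print(len(whitelist), len(blacklist))  # 3 1
```

The whitelist grows with the size of the image, while the blacklist grows only with the number of removed files.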

I propose two solutions:

  1. The simplest solution is to add a pathBlacklist. Intuitively, the blacklist would specify all files that we want to exclude from the final rendered image.

  2. We could also borrow ideas from overlayfs:

    whiteouts and opaque directories
    --------------------------------
    
    In order to support rm and rmdir without changing the lower
    filesystem, an overlay filesystem needs to record in the upper filesystem
    that files have been removed.  This is done using whiteouts and opaque
    directories (non-directories are always opaque).
    
    A whiteout is created as a character device with 0/0 device number.
    When a whiteout is found in the upper level of a merged directory, any
    matching name in the lower level is ignored, and the whiteout itself
    is also hidden.
    
    A directory is made opaque by setting the xattr "trusted.overlay.opaque"
    to "y".  Where the upper filesystem contains an opaque directory, any
    directory in the lower filesystem with the same name is ignored.
    

    So in this approach, an ACI that removes files would have the "whiteouts" and "opaque directories" described above in its rootfs. When rendered, it would remove the corresponding files in the lower layer.
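A rough sketch of how rendering with whiteouts and opaque directories might work, assuming layers are modelled as plain `{path: content}` dicts. The `WHITEOUT` sentinel, the `render` function and the `opaque_dirs` parameter are all hypothetical stand-ins (for the 0/0 character device and the xattr respectively), not how rkt or overlayfs actually implement this.

```python
WHITEOUT = object()  # hypothetical stand-in for a 0/0 character device


def render(lower, upper, opaque_dirs=()):
    """Merge two layers given as {path: content} dicts (illustrative only).

    `opaque_dirs` lists upper-layer directories that would carry the
    "trusted.overlay.opaque" xattr in a real overlayfs.
    """
    merged = dict(lower)

    def hide(name):
        # Drop `name` itself and anything beneath it from the lower layer.
        prefix = name.rstrip("/") + "/"
        for path in list(merged):
            if path == name or path.startswith(prefix):
                del merged[path]

    # An opaque directory hides the same-named directory in the lower layer.
    for directory in opaque_dirs:
        hide(directory)

    # A whiteout hides the matching lower name; the whiteout is hidden too.
    for path, content in upper.items():
        if content is WHITEOUT:
            hide(path)

    # Everything else in the upper layer is added or overrides the lower one.
    for path, content in upper.items():
        if content is not WHITEOUT:
            merged[path] = content
    return merged


lower = {"/hello.txt": "hi", "/etc/app.conf": "port=80",
         "/cache/a": "1", "/cache/b": "2"}
upper = {"/hello.txt": WHITEOUT, "/cache/c": "3"}

rendered = render(lower, upper, opaque_dirs=["/cache"])
print(sorted(rendered))  # ['/cache/c', '/etc/app.conf']
```

Note that nothing here touches the manifest: all removal state lives in the upper layer itself.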

Personally, I like the second approach, as it keeps all the "state" in rootfs, reducing the manifest to the simple metadata file that it was supposed to be. It's also worth noting that the two approaches can coexist.

What are your thoughts? @klizhentas @jonboulle @jzelinskie

derekchiang avatar Jul 03 '15 07:07 derekchiang

I guess the downside to the second approach is that it is coupled pretty tightly to overlayfs.

Are humans going to be reading these manifests by hand?

jzelinskie avatar Jul 03 '15 17:07 jzelinskie

Are humans going to be reading these manifests by hand?

I think that's something we already do :-) Can't speak for everyone, but I tend to read/write manifests by hand all the time.

The advantage of the whiteout solution is that it's more scalable. Imagine you've deleted lots of files: the manifest would be huge and unreadable. In addition, tar will probably give you a list of files anyway, so putting them in the manifest is redundant.

klizhentas avatar Jul 03 '15 17:07 klizhentas

On the other hand, a 'character device with 0/0' may be too OS-specific; there should probably be some additional metadata specifying the sentinel.

klizhentas avatar Jul 03 '15 17:07 klizhentas

@jzelinskie I think the second approach is just conceptually related to overlayfs; the actual implementation would not have to depend on overlayfs at all.

@klizhentas Agreed. But it's not just when you delete lots of files. Even if you delete only a single file foo, with only a whitelist, you'd have to specify all files that are not foo, which is just very counter-intuitive.

Glad that @klizhentas brought up the issue with the second approach being OS-specific. Now that I think about it, the second approach is also not backward-compatible, because some ACIs might already contain whiteouts and opaque directories and maybe the original authors actually want them to be there. The blacklist approach, on the other hand, is totally backward-compatible. I guess at this point I'm leaning more towards the first approach.

Would love to hear @jonboulle's opinion on this. Once we've settled on a solution, I'd be happy to move this issue to appc/spec and provide an implementation in rkt.

derekchiang avatar Jul 03 '15 22:07 derekchiang

As discussed OOB, we were planning to use pathWhitelist to represent file removal.

Hmm, I'm not sure I remember this discussion the same way - weren't we discussing squashing resultant images by default?

We have actually discussed this a lot in the past. tl;dr:

  • there's no really satisfactory sentinel approach; almost everything is either too OS- or FS-specific
  • whitelist vs. blacklist was essentially a coin flip since there are decent arguments and use cases both ways

We ended up deciding that we would be happy to make it an either/or for blacklist vs. whitelist; there's an issue in appc/spec discussing this: https://github.com/appc/spec/issues/323
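One possible reading of "either/or", sketched under assumptions: a manifest may carry pathWhitelist or pathBlacklist but not both. The function `filter_paths` is hypothetical, pathBlacklist is only the field name proposed in this thread, and treating an entry as covering everything beneath it is an assumption made for illustration rather than something taken from the spec.

```python
def filter_paths(paths, manifest):
    """Apply an either/or whitelist or blacklist to a list of paths.

    Hypothetical sketch: rejects manifests that set both fields.
    """
    whitelist = manifest.get("pathWhitelist") or []
    blacklist = manifest.get("pathBlacklist") or []
    if whitelist and blacklist:
        raise ValueError("pathWhitelist and pathBlacklist are mutually exclusive")

    def covered(path, entries):
        # Assumption: an entry matches itself and anything beneath it.
        return any(path == e or path.startswith(e.rstrip("/") + "/")
                   for e in entries)

    if whitelist:
        return sorted(p for p in paths if covered(p, whitelist))
    if blacklist:
        return sorted(p for p in paths if not covered(p, blacklist))
    return sorted(paths)


paths = ["/bin/app", "/etc/app.conf", "/hello.txt"]
print(filter_paths(paths, {"pathBlacklist": ["/hello.txt"]}))
# ['/bin/app', '/etc/app.conf']
```

Because a manifest with neither field leaves the image untouched, existing ACIs would render exactly as before under this reading.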

Feel free to file a PR :-)

jonboulle avatar Jul 06 '15 22:07 jonboulle

great, seems that choosing blacklists/whitelists will work

klizhentas avatar Jul 06 '15 22:07 klizhentas