
Idea: unprivileged apk generation from static files

Open imjasonh opened this issue 1 year ago • 5 comments

Rough sketch, feedback welcome:

package:
  name: example
  version: v1.2.3
  epoch: 0
  description: this is an example

contents:
- file: /etc/config/blah.txt
  contents: |
    this is the contents of my file

- dir: /etc/config/empty-dir

- file: /etc/scripts/blah.sh
  permissions: 0o777

(This was inspired by apko's PathMutation, and could probably be aligned with it even more closely.)

If a melange YAML contains only contents and no pipeline, it can be "executed" without privilege, since it can just emit the APK directly from the specified contents without invoking anything. If a config contains both contents and pipeline, the contents are layered on after the pipeline executes (or we can just disallow the combination).

The code to take an fs.FS and put it into an APK is already heavily used; this would just be another way to call that code.
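To make the "emit directly from the specified contents" idea concrete, here is a minimal sketch in Python rather than melange's Go. It is not melange's code: the function name `build_data_archive` and the entry shape (dicts with `file`, `dir`, `contents`, `permissions`) are assumptions modeled on the YAML above. It only builds a gzipped tar of the files in memory; a real APK also carries package metadata and usually a signature, which are left out here.

```python
import io
import tarfile


def build_data_archive(entries):
    """Pack contents entries into a gzipped tar, entirely in memory.

    `entries` is a hypothetical list of dicts shaped like the YAML sketch:
    {"file": path, "contents": str, "permissions": int} or {"dir": path}.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for entry in entries:
            if "dir" in entry:
                info = tarfile.TarInfo(entry["dir"].lstrip("/"))
                info.type = tarfile.DIRTYPE
                info.mode = entry.get("permissions", 0o755)
                tar.addfile(info)
            else:
                data = entry.get("contents", "").encode("utf-8")
                info = tarfile.TarInfo(entry["file"].lstrip("/"))
                info.size = len(data)
                info.mode = entry.get("permissions", 0o644)
                tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Nothing here shells out or touches the host filesystem, which is the property that would let this path run without privilege.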

Disadvantages

This is probably only ergonomic to use for small config files. I wouldn't want to express an entire filesystem this way, and binary files are pretty much ruled out.

Users may be surprised by the interaction of YAML's whitespace handling and the output file, which might also have whitespace expectations. For example, if you want to specify a YAML config in nested YAML (🤢):

contents:
- file: example.yaml
  contents: |
    foo: bar
    bar:
      - baz

...how sure are you that this is going to be generated and parsed correctly?

Open Questions

  • should a contents that conflicts with the output of a pipeline fail, or overwrite?
  • should the contents -> APK functionality be a separate config / command / tool from melange build?
  • (probably not, but) does it make sense for this to be functionality of apko as well? A just-in-time APK generation feature could make for a good UX, though it undermines the provenance guarantees of apko, which probably makes it not worth doing.
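For the first open question, the two behaviors could be modeled as an explicit policy. This is only a sketch: `layer_contents` and the `on_conflict` parameter are hypothetical names, and files are represented as plain path-to-bytes mappings instead of a real filesystem.

```python
def layer_contents(pipeline_files, static_files, on_conflict="fail"):
    """Layer static files over pipeline output; both map path -> bytes.

    on_conflict is a hypothetical knob: "fail" rejects any overlapping
    path, "overwrite" lets the static contents win.
    """
    if on_conflict not in ("fail", "overwrite"):
        raise ValueError(f"unknown policy: {on_conflict!r}")
    conflicts = sorted(set(pipeline_files) & set(static_files))
    if conflicts and on_conflict == "fail":
        raise FileExistsError(
            "contents conflict with pipeline output: " + ", ".join(conflicts)
        )
    merged = dict(pipeline_files)
    merged.update(static_files)  # static contents are layered on last
    return merged
```

Defaulting to "fail" is the conservative choice in this sketch, since a silent overwrite would be hard to notice in a built package.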

If I get time I might try to hack up a prototype and see what sucks about it.

imjasonh avatar Mar 06 '23 17:03 imjasonh

I like this! Definitely helps with the use case I had trying to modify an nginx image to run on different port to work on Cloud run.

The ignition spec (https://coreos.github.io/ignition/configuration-v3_4/) has a bit of prior art if you did want to support something like adding a binary, though that wasn't the original use case.

Those look a little like:

files:
- path: /etc/nginx/nginx.conf
  contents:
    source: https://example.com/my-nginx.conf
    verification: sha256-aaabbddd..

Just supporting literals covers a bunch of use cases though 🙏🏻
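If remote sources were supported, the verification step might look roughly like the sketch below. It assumes the flattened "algorithm-hexdigest" string used in the snippet above; `verify_contents` is a hypothetical helper, not an Ignition or melange API, and fetching the source is left out.

```python
import hashlib


def verify_contents(data, verification):
    """Check bytes against an '<algorithm>-<hex digest>' string.

    Only sha256 and sha512 are accepted in this sketch.
    """
    algorithm, sep, expected = verification.partition("-")
    if not sep or algorithm not in ("sha256", "sha512"):
        raise ValueError(f"unsupported verification string: {verification!r}")
    actual = hashlib.new(algorithm, data).hexdigest()
    return actual == expected.lower()
```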

nsmith5 avatar Mar 06 '23 17:03 nsmith5

Would this be part of melange build, or a separate applet?

kaniini avatar Mar 06 '23 18:03 kaniini

> Would this be part of melange build, or a separate applet?

Good question! I don't know, and I'd love for someone to have a strong opinion so I don't have to 😆

If we imagine there being a use case for "execute these steps then add in these static files" then it probably belongs in melange build where the YAML has both pipeline: and contents:. If we don't think we want to support that, it gets harder to say.

I do think that if it has a separate subcommand, it probably makes sense to have a separate config too; otherwise there could be two ways to interpret the same YAML, and that feels confusing.


As a possible example, we have 7 files in Wolfi that write files with cat+EOF:

The examples marked with * have subsequent steps that run after the "just write this file" step, which might depend on the contents of those files. This might indicate that we want some kind of uses: just-write-this-file that melange can interpret and execute without invoking anything or involving bwrap/docker. That feels like it gets messy though... 🤔

I might hack on this as a separate subcommand and config, and see if it makes sense to bring into melange build if it's successful. The use cases I've had for it are more oriented around making an APK containing some config for some other package to consume (e.g., nginx and nginx-config), and less about combining or interleaving static file generation into a build pipeline.

imjasonh avatar Mar 06 '23 21:03 imjasonh

I guess the question is: if it's not a separate command (I'm not sure it should be), how do we determine if the build process needs to run arbitrary code?

Is partial functionality (and erroring with a pre-flight check) acceptable and expected in a scenario where code execution is not possible?

I think I am fine with having an error like "build requires a guest environment, no guest environment capability found" when runs statements are encountered in the pipeline. We can scan the whole pipeline end to end as a pre-flight check.
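A pre-flight scan along these lines could be sketched as follows. The names `needs_guest`, `preflight`, and `MissingCapabilityError` are hypothetical, steps are assumed to be plain dicts parsed from the YAML, and nested `pipeline:` lists are assumed to be allowed. As a conservative assumption, a `uses:` step is treated as needing a guest because the sketch does not resolve what it expands to.

```python
class MissingCapabilityError(Exception):
    """Raised when a build needs to run code but no guest is available."""


def needs_guest(steps):
    """Return True if any step, at any nesting depth, must execute code."""
    for step in steps:
        # Assumption: an unresolved `uses` step may contain `runs`.
        if "runs" in step or "uses" in step:
            return True
        if needs_guest(step.get("pipeline") or []):
            return True
    return False


def preflight(steps, has_guest):
    """Fail before doing any work if the pipeline cannot be executed."""
    if needs_guest(steps) and not has_guest:
        raise MissingCapabilityError(
            "build requires a guest environment, "
            "no guest environment capability found"
        )
```

Running the check before any step executes means the user sees the error up front instead of after a partial build.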

But at the same time, it may be confusing that a user gets hit with this type of error because they do not have the ability to run code in a guest.

kaniini avatar Mar 06 '23 21:03 kaniini

What I would suggest is something like:

pipeline:
  - mkdir:
      path: ...
      permissions: 0o755
  - copy:
      from: ...
      to: ...
  - write:
      to: ...
      contents: |
        ...

Then if a runs operation is encountered, it could require the Run capability. If that capability is missing, then the pre-flight fails.
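As a rough illustration of how such declarative steps could be interpreted without a guest, here is a sketch that applies mkdir, copy, and write directly to a root directory. Everything in it is an assumption for illustration: the `run_native_steps` name, the `capabilities` set, and the choice to resolve both `from` and `to` relative to the same root. It does not guard against `..` in paths, which a real implementation would need to do.

```python
import os
import shutil


def _resolve(root, path):
    # Treat absolute package paths as relative to the build root.
    return os.path.join(root, path.lstrip("/"))


def run_native_steps(steps, root, capabilities=frozenset()):
    """Apply mkdir/copy/write steps under `root` without running any code."""
    for step in steps:
        if "mkdir" in step:
            args = step["mkdir"]
            target = _resolve(root, args["path"])
            os.makedirs(target, exist_ok=True)
            os.chmod(target, args.get("permissions", 0o755))
        elif "write" in step:
            args = step["write"]
            target = _resolve(root, args["to"])
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "w", encoding="utf-8") as f:
                f.write(args["contents"])
        elif "copy" in step:
            args = step["copy"]
            shutil.copyfile(_resolve(root, args["from"]), _resolve(root, args["to"]))
        elif "runs" in step:
            if "run" not in capabilities:
                raise PermissionError("step requires the Run capability")
            raise NotImplementedError("guest execution is outside this sketch")
        else:
            raise ValueError(f"unknown step: {step!r}")
```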

kaniini avatar Mar 06 '23 21:03 kaniini