image-spec icon indicating copy to clipboard operation
image-spec copied to clipboard

Add ArgsEscaped field to image config

Open jterry75 opened this issue 3 years ago • 19 comments

This change officially adds ArgsEscaped to the image config. This field has already been used by Docker for several years, so adding it here allows images that depend on its behavior to work with other runtimes.

For certain Windows images created via Docker they may contain ArgsEscaped==true if the ENTRYPOINT or CMD is in the shell form and contains spaces, path characters, etc.

An example is the following:

FROM mcr.microsoft.com/windows/servercore:ltsc2019

SHELL ["powershell", "-Command"]
ENTRYPOINT "C:\Path To\MyExe.exe" -arg1 "-arg2 value2"

Will be encoded by Docker as:

"Config": {
  ...
  "ArgsEscaped": true,
  ...
  "Entrypoint": [
    "powershell.exe -Command \"C:\\My Data\\MyExe.exe\" -arg1 \"-arg2 value2\""
  ],
  ...
}

You can see that Entrypoint becomes a single element array with the entire argument list already escaped. To avoid a double escape problem Windows supports passing via OCI as a complete CommandLine.

@kevpar - Please let me know if you do not want me to carry this as your change or don't wan't your signature on the commit. I wanted to give you credit since you originally opened #829.

jterry75 avatar Jan 27 '22 18:01 jterry75

Ref: (https://github.com/containerd/containerd/pull/6479)

jterry75 avatar Jan 27 '22 18:01 jterry75

My thoughts most closely align with kevpar's comment:

I like the idea of adding ArgsEscaped to OCI and immediately marking it as deprecated, and also figuring out the right long term solution (such as CommandLine).

I'd like to see us document what this is when people see it in an image, without actually recommending that anyone do this. Perhaps change the phrasing to indicate it's here only for historical context.

sudo-bmitch avatar Feb 24 '22 19:02 sudo-bmitch

@jterry75 @kevpar what is the status with this?

katiewasnothere avatar Apr 14 '22 21:04 katiewasnothere

@katiewasnothere - I think its in limbo... Given the feedback we have I think I'll update the PR to signify exactly what was said. "ArgsEscaped" is a legacy compatibility with Docker and describe what it means. But mark it in the docs as deprecated and see if we can carry it from there. Sound good?

jterry75 avatar Apr 15 '22 22:04 jterry75

@jterry75 Sounds good! @kevpar, @dcantah, and I discussed this and we're fine with the change so long as it's marked deprecated, as you mentioned.

katiewasnothere avatar Apr 15 '22 22:04 katiewasnothere

Sweeeeeeeeeeet

jterry75 avatar Apr 15 '22 22:04 jterry75

@jterry75 This looks good to me. Maybe we could try to clarify the exact behavior of what ArgsEscaped implies in Moby, but as we discussed it is quite complicated (and we don't want anyone to take a dependency on it if they don't have to). So I think this is fine. Maybe we should just add a note like "Exact behavior of ArgsEscaped is complex and subject to implementation details in Moby project"?

kevpar avatar Jun 15 '22 17:06 kevpar

@jterry75 This looks good to me. Maybe we could try to clarify the exact behavior of what ArgsEscaped implies in Moby, but as we discussed it is quite complicated (and we don't want anyone to take a dependency on it if they don't have to). So I think this is fine. Maybe we should just add a note like "Exact behavior of ArgsEscaped is complex and subject to implementation details in Moby project"?

Do you want me to write that in the doc of the ArgsEscaped param or in the compat matrix part?

jterry75 avatar Jul 07 '22 18:07 jterry75

@jterry75 This looks good to me. Maybe we could try to clarify the exact behavior of what ArgsEscaped implies in Moby, but as we discussed it is quite complicated (and we don't want anyone to take a dependency on it if they don't have to). So I think this is fine. Maybe we should just add a note like "Exact behavior of ArgsEscaped is complex and subject to implementation details in Moby project"?

Do you want me to write that in the doc of the ArgsEscaped param or in the compat matrix part?

Probably the doc.

kevpar avatar Jul 07 '22 19:07 kevpar

@kevpar - Done. Ty

jterry75 avatar Aug 24 '22 22:08 jterry75

Thanks for updating! I don't have a Windows system available to build an image with this. Is there a public example (repository:tag) I can use to test against some code?

sudo-bmitch avatar Aug 25 '22 00:08 sudo-bmitch

Thanks for updating! I don't have a Windows system available to build an image with this. Is there a public example (repository:tag) I can use to test against some code?

I created a simple repro on nanoserver which should make use of the argsEscaped field at cplatpublic.azurecr.io/args-escaped-test-image-ns:latest

kiashok avatar Aug 26 '22 22:08 kiashok

Something I'm seeing in practice is the documentation here may be inaccurate. It looks like Docker also sets the field on Linux images:

$ regctl image config regclient/regbot:edge-alpine --platform linux/amd64 --format body | jq .
{    
  "architecture": "amd64",
  "config": {
      "User": "appuser",
      "Env": [                                                                                             
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "LUA_PATH=?;?.lua;/lua-user/?;/lua-user/?.lua;/lua-mods/?;/lua-mods/?.lua"
    ],               
    "Cmd": [     
      "regbot", 
      "--help"  
    ],  
    "Labels": {
      "maintainer": "",                                                                                  
      "org.opencontainers.image.authors": "Regclient contributors",             
      "org.opencontainers.image.created": "2022-08-28T06:11:35Z",               
      "org.opencontainers.image.description": "",    
      "org.opencontainers.image.documentation": "https://github.com/regclient/regclient",
      "org.opencontainers.image.licenses": "Apache 2.0",
      "org.opencontainers.image.revision": "38302412a40afc945a0dd175952ed6d1935709f9",
      "org.opencontainers.image.source": "git://github.com/regclient/regclient.git",
      "org.opencontainers.image.title": "regbot",  
      "org.opencontainers.image.url": "https://github.com/regclient/regclient",
      "org.opencontainers.image.vendor": "",
      "org.opencontainers.image.version": "edge"
    },
    "ArgsEscaped": true,
    "OnBuild": null
  },
...
  "moby.buildkit.buildinfo.v1": "eyJmcm9udGVuZCI6ImRvY2tlcmZpbGUudjAiLCJzb3VyY2VzIjpbeyJ0eXBlIjoiZG9ja2VyLWltYWdlIiwicmVmIjoiZG9ja2VyLmlvL2xpYnJhcnkvYWxwaW5lOjMiLCJwaW4iOiJzaGEyNTY6YmM0MTE4MmQ3ZWY1ZmZjNTNhNDBiMD
Q0ZTcyNTE5M2JjMTAxNDJhMTI0M2YzOTVlZTg1MmE4ZDk3MzBmYzJhZCJ9LHsidHlwZSI6ImRvY2tlci1pbWFnZSIsInJlZiI6ImRvY2tlci5pby9saWJyYXJ5L2dvbGFuZzoxLjE5LWFscGluZSIsInBpbiI6InNoYTI1NjowZWIwOGM4OWFiMWIwYzYzOGE5ZmUyNzgwZjdhZTNhY
jE4ZjZlY2RhMmM3NmI5MDhlMDllYjgwNzM5MTIwNDVkIn1dfQ==",
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:994393dc58e7931862558d06e46aa2bb17487044f670f310dffe1d24e4d1eec7",
      "sha256:513eada741f45b3bcd69b78fe48a47e33d58902ca222d6e7207778c6e5e38f95",
      "sha256:d32bd2684453d14075af8b21a87dbb1f59f847e43176f057a843062c79795315",
      "sha256:32e78f316d612f113c5c80ee9043a9d9fcbe2c184ac3c781ce08789c998ee1f8",
      "sha256:37793838d5bc565705ecddbca70db892a3a47b254c2e999f63f5a5d55673221f",
      "sha256:1f87be5261102c7967c22a17b5738955db1112fbe8a3d81b17aabc61f51b71c4",
      "sha256:83a4d07b78157fd904c74184915a052a4a6f5b4e76e31f60ffab2f9449497bcb",
      "sha256:8ab9f5ef641b369f97a3ef977e67580f5076b21fe362372d37b0f93b8bb9e3c8"
    ]
  }
}

sudo-bmitch avatar Aug 28 '22 20:08 sudo-bmitch

@kevpar / @dmcgowan - Any ideas? We certainly never handled ArgsEscaped for Linux case. Is there a type of Linux image that we need to be concerned with here? Also I dont see that it did anything in the example above. Yes its true but Cmd is not escaped. Seems like a image build bug honestly but I dont know enough about the Linux flow here.

jterry75 avatar Aug 29 '22 18:08 jterry75

This is interesting. I'm not sure how ArgsEscaped could be set on a Linux image. Looking at the moby code, it appears the ArgsEscaped value should come from resolveCmdLine, which has differing implementations for unix and windows. You can see that the unix implementation never returns true for the second return value, which is what gets used for ArgsEscaped. I wonder if there is older or other tooling that could set this field inadvertently.

As a test, I tried building a simple image on Linux with the following Dockerfile:

FROM ubuntu:latest
ENTRYPOINT "echo foo"

This produced an image with ArgsEscaped not set. However, a comparable image built on Windows did have it set. This seems in line with expectations that this value is not normally set for Linux images.

Regardless of if the value could be set on a Linux image, from what I can see the only place ArgsEscaped is used to determine behavior at runtime is here, which is a Windows-only file. So I think it is safe to treat ArgsEscaped as Windows-only.

kevpar avatar Sep 20 '22 22:09 kevpar

The images I'm building are from buildkit (and buildx), where it appears to be set on every image that defines a CMD:

https://github.com/moby/buildkit/blob/cdc17fe2beb46178e34eecd1368740b004ad6592/frontend/dockerfile/dockerfile2llb/convert.go#L1266

I don't think buildkit has support for windows yet (and definitely didn't when this was added). Pinging @tonistiigi in case there's any other use cases we should keep in mind.

sudo-bmitch avatar Sep 21 '22 00:09 sudo-bmitch

Don't spend any effort debugging the CI failures, they should be fixed with a rebase.

sudo-bmitch avatar Sep 21 '22 00:09 sudo-bmitch

@sudo-bmitch You can create windows images in buildkit, if not via wcow then with cross-compilation. If the analysis shows that no runtime uses it on linux we can drop it based on the platform check.

tonistiigi avatar Sep 21 '22 00:09 tonistiigi

@sudo-bmitch You can create windows images in buildkit, if not via wcow then with cross-compilation. If the analysis shows that no runtime uses it on linux we can drop it based on the platform check.

@tonistiigi my bad, I was confusing with the native windows build that was holding buildkit back from being the default builder for so long, but that doesn't apply here. As long as you don't see a need for it on Linux, I'm good with getting this merged. We're not so concerned with changing buildkit as we are that we're getting the spec right, and setting this on Linux is undefined behavior that no one should care about.

@jterry75 if you cleanup the nit and rebase, I'll add my LGTM and see if we can find a second maintainer to get this merged.

sudo-bmitch avatar Sep 21 '22 00:09 sudo-bmitch

Sorry folks didnt realize there was a NIT here. Done!

jterry75 avatar Oct 05 '22 17:10 jterry75

In the context of my making https://github.com/moby/buildkit/pull/4723, I'm actually wondering why this is marked as deprecated? Windows does not use argc/argv (that behavior is implemented per-program) as the CreateProcessA system call over there actually takes a single string (https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessa) and that's what binaries receive. IIRC there's a common "libc" implementation that parses and splits that per-application, but some like cmd.exe have legitimate reasons for not splitting/parsing that in the normal way.

All this to say, single-string CommandLine is actually how Windows works (and that doesn't seem likely to change any time soon), so it feels wrong for the spec to only acknowledge that in a deprecated field? If I were designing this today, I'd probably make it a special case inside Cmd or something intead (["WINDOWS-PRE-ESCAPED-COMMANDLINE", "cmd /C /S \"echo foo bar \""]), but since that ship has sailed, it feels a bit too reductive to document this as deprecated and discourage its use and I must be missing something.

tianon avatar Mar 01 '24 19:03 tianon

@tianon it was marked deprecated primarily because the exact way ArgsEscaped is interpreted by Docker has some weird edge cases so it is hard to standardize. IIRC, an example of this is that if ArgsEscaped is set on the image, even the args specified at the container level will be interpreted as a single string.

I agree that there should be proper recognition for Windows command line strings in the image config. My preference would probably be some new fields like EntrypointString and CmdString.

kevpar avatar Mar 01 '24 19:03 kevpar

I have to admit I don't really understand the "logic is complicated" argument :sweat_smile:

The logic for setting it correctly during build sure is, but the runtime logic is pretty straightforward:

args := append(append([]string{}, Entrypoint...), Cmd...)
escaped := []string{}
if ArgsEscaped {
	escaped = append(escaped, args[0])
	args = args[1:]
}
for _, a := range args {
	escaped = append(escaped, golang.org/x/sys/windows.EscapeArg(a)
}
CommandLine := strings.Join(escaped, " ")

(yes, most of this could be optimized with make(), etc since the slice sizes are all known in advance, but this gets the general point across -- combine entrypoint+cmd, escape everything except the first argument if argsescaped)

tianon avatar Mar 01 '24 19:03 tianon

From what I can tell in https://github.com/containerd/containerd/pull/6479#discussion_r813320420, it's only complicated because there's a wrapper in use that turns Args into CommandLine, but has to also accept CommandLine as-is (as evidenced by that code even working), so it'd be much more straightforward if it didn't rely on the wrapper to set CommandLine from Args (and did the escaping directly in that block in both cases :sweat_smile:)

tianon avatar Mar 01 '24 19:03 tianon