nerdctl icon indicating copy to clipboard operation
nerdctl copied to clipboard

Add Windows Hostprocess flag

Open jsturtevant opened this issue 3 years ago • 5 comments

Fixes: #518

Still needs:

  • [x] docs
  • [x] a bit more thought on how to specific user
  • [x] test

jsturtevant avatar Feb 08 '22 18:02 jsturtevant

The test failed with

   Command:  C:\Windows\system32\config\systemprofile\go\bin\nerdctl.exe --namespace=nerdctl-test run --rm --host-process -u "NT AUTHORITY\Local Service" gcr.io/k8s-staging-e2e-test-images/busybox:1.29-2 whoami
        ExitCode: 1
        Error:    exit status 1
        Stdout:   
        Stderr:   time="2022-02-10T18:18:12Z" level=fatal msg="failed to create user process token: failed to logon user: The user name or password is incorrect.: unknown"
        Failures:

I am guessing the containerd process isn't running with high enough privileges in the CI environment as this passes locally with containerd installed with a similar script. @dcantah do you know if there is another built in user we could use here?

jsturtevant avatar Feb 11 '22 16:02 jsturtevant

needs rebase

AkihiroSuda avatar Feb 17 '22 03:02 AkihiroSuda

Added a logging statement for the who the hostprocess container to see what it thinks it is running as and it is coming back as

run_windows_test.go:35: whoami nt authority\system

@dcantah I would think that since it is system it should be able to start Local Service?

jsturtevant avatar Feb 18 '22 16:02 jsturtevant

Added a logging statement for the who the hostprocess container to see what it thinks it is running as and it is coming back as

run_windows_test.go:35: whoami nt authority\system

@dcantah I would think that since it is system it should be able to start Local Service?

Yea this is really odd.. The only thing I can think of is there's something blocking logging in users in this environment. The cri-integration tests in Containerd have tests that login as different service accounts https://github.com/containerd/containerd/blob/5162238c7da04d6bdd771d0aa43f8e34c91ea2ca/integration/windows_hostprocess_test.go#L38

dcantah avatar Feb 18 '22 19:02 dcantah

What's current status?

AkihiroSuda avatar Jul 07 '22 05:07 AkihiroSuda

I refreshed the PR. It is still failing when running in CI though it works locally, I thought it might be a Cirrus vs github actions thing but doesn't look like it.

I will need to dig deeper, as it is working locally when I test it.

jsturtevant avatar Nov 10 '22 23:11 jsturtevant

debugging a different problem and got some insight from @marosset, It might be because I am not passing the user name info through on the config. going to try that and see I can get this working. better late than never 🙈

jsturtevant avatar Jan 05 '23 18:01 jsturtevant

debugging a different problem and got some insight from @marosset, It might be because I am not passing the user name info through on the config. going to try that and see I can get this working. better late than never 🙈

I think that will fix the issue

marosset avatar Jan 05 '23 18:01 marosset

Nice. We didn't allow a blank username for CRI either, right? That's probably the right way.. defaulting to SYSTEM is kinda scary

dcantah avatar Jan 05 '23 19:01 dcantah

Hmm.. if the username string is empty we should error a bit earlier https://github.com/microsoft/hcsshim/blob/6547959343d65e0cf431dd9d589f45ee2b98172c/internal/jobcontainers/logon.go#L109. Maybe it's filling in container-user/container-admin from the image? 🤷‍♂️

dcantah avatar Jan 05 '23 19:01 dcantah

That wasn't the case, It is being passed by the main nerdctl options.

I compared the oci spec to one I have working via the cri server and nothing jumps out. Here is the oci spec that is being generated with nerdctl:

{
    "ociVersion": "1.0.2-dev",
    "process": {
        "user": {
            "uid": 0,
            "gid": 0,
            "username": "nt authority\\local system"
        },
        "args": [
            "cmd.exe",
            "/s",
            "/c",
            "cmd"
        ],
        "env": [
            "PATH=C:\\dig\\;C:\\bin;C:\\curl;C:\\Windows\\System32;C:\\Windows;C:\\Program Files\\PowerShell;",
            "ProgramFiles=C:\\Program Files",
            "LOCALAPPDATA=C:\\Users\\ContainerAdministrator\\AppData\\Local",
            "PSModuleAnalysisCachePath=C:\\Users\\ContainerAdministrator\\AppData\\Local\\Microsoft\\Windows\\PowerShell\\docker\\ModuleAnalysisCache",
            "PSCORE=C:\\Program Files\\PowerShell\\pwsh.exe"
        ],
        "cwd": ""
    },
    "root": {
        "path": ""
    },
    "hostname": "1624c258dfdc",
    "hooks": {
        "createRuntime": [
            .... cut this out, hooks are supported in windows so shouldn't have any affect
        ]
    },
    "annotations": {
        "io.containerd.image.config.stop-signal": "SIGTERM",
        "microsoft.com/hostprocess-container": "true",
        "nerdctl/extraHosts": "null",
        "nerdctl/hostname": "1624c258dfdc",
        "nerdctl/name": "busybox-1624c",
        "nerdctl/namespace": "default",
        "nerdctl/networks": "[\"nat\"]",
        "nerdctl/platform": "windows/amd64",
        "nerdctl/state-dir": "C:\\ProgramData\\nerdctl\\052055e3\\containers\\default\\1624c258dfdc478e27fdd3f4cb6a0874486f7e1abd2cef426aaeec2c97b908c2"
    },
    "windows": {
        "layerFolders": null,
        "ignoreFlushesDuringBoot": true,
        "network": {
            "allowUnqualifiedDNSQuery": true
        }
    }
}

and the spec generated with the cri call:

{
    "ociVersion": "1.0.2-dev",
    "process": {
        "user": {
            "uid": 0,
            "gid": 0,
            "username": "NT AUTHORITY\\SYSTEM"
        },
        "args": [
            "ping",
            "-t",
            "localhost"
        ],
        "env": [
            "PATH=C:\\Windows\\system32;C:\\Windows;"
        ],
        "cwd": "C:\\"
    },
    "annotations": {
        "io.kubernetes.cri.container-name": "busybox",
        "io.kubernetes.cri.container-type": "container",
        "io.kubernetes.cri.image-name": "k8s.gcr.io/pause@sha256:85cfebc79dccc7a0e56680778a614b29a0b1c2ae98d4b1efc746764c04d5656c",
        "io.kubernetes.cri.sandbox-id": "0164ef1c9d9a72e904fe45f12e5698e06338ec913798798f2030773ffa0b1b09",
        "io.kubernetes.cri.sandbox-name": "sandbox-hp",
        "io.kubernetes.cri.sandbox-namespace": "default",
        "io.kubernetes.cri.sandbox-uid": "alskasdfasdf",
        "microsoft.com/hostprocess-container": "true"
    },
    "windows": {
        "layerFolders": null,
        "network": {}
    }
}

jsturtevant avatar Jan 07 '23 00:01 jsturtevant

Can we make a dummy test that prints out the current user that the CI is running as? It feels like a permission issue

dcantah avatar Jan 07 '23 00:01 dcantah

Already did https://github.com/containerd/nerdctl/pull/795/commits/f3af71a5937eb9cadf36799538fc9a9318955bb7 but thought it doesn't seem to have output what I thought it should. Let me try something else.

configuration complete! Printing configuration... Service:

Status : Running Name : containerd DisplayName : containerd

Id : 6212 Handles : 148 CPU : 0.140625 SI : 0 Name : containerd

jsturtevant avatar Jan 07 '23 00:01 jsturtevant

btw, the test that uses hostprocess containers without the user specified passes in CI

jsturtevant avatar Jan 07 '23 00:01 jsturtevant

I am also able to reproduce this locally where my process is running as a service under nt authority\system

jsturtevant avatar Jan 07 '23 00:01 jsturtevant

I added a bunch of tests with different inputs, some of them passed. It appears to be the way the paramaters are passed to the binary in the test framework. I will clean it up and I think we will have a few tests passing 🤞

jsturtevant avatar Jan 07 '23 01:01 jsturtevant

The windows tests are passing. Ends up the NT AUTHORITY must be upper case and I had some extra quotes when they were not needed in the test framework. I will clean up the commits and we should be good for a review.

jsturtevant avatar Jan 09 '23 18:01 jsturtevant

@AkihiroSuda this is finally ready for review. I did move the Windows CI from Cirrus to Github actions during my debugging, I can either drop that commit or move it to a separate PR but I think I prefer GitHub UI for debugging and it aligns better with rest of the tests.

jsturtevant avatar Jan 09 '23 21:01 jsturtevant

If Cirrus supports nested virt it might be worth staying on it, otherwise we won't be able to test hyper-v isolation on GHA (unless they made progress on offering this..)

dcantah avatar Jan 09 '23 21:01 dcantah

I had a couple of suggestions for the usage but overall LGTM Thanks @jsturtevant

marosset avatar Jan 09 '23 23:01 marosset

If Cirrus supports nested virt it might be worth staying on it, otherwise we won't be able to test hyper-v isolation on GHA (unless they made progress on offering this..)

great point, It does look like the compute_engine_instance (https://cirrus-ci.org/guide/custom-vms/) does support nested_virtualization: nested_virtualization: true # optional. Whether to enable Intel VT-x. Defaults to false.

I will drop this commit.

jsturtevant avatar Jan 10 '23 17:01 jsturtevant

It does look like the compute_engine_instance (https://cirrus-ci.org/guide/custom-vms/) does support nested_virtualization: nested_virtualization: true # optional. Whether to enable Intel VT-x. Defaults to false.

Sweet!

dcantah avatar Jan 11 '23 00:01 dcantah