Unable to use Buildkit with Windows containers
I'm using the BuildKit version that comes bundled with Docker for Windows 18.06.1 and am having trouble running it with Windows containers. In the log below you can see a very simple build succeed without BuildKit and then fail once I enable it. The localized error message "Det går inte att hitta filen" roughly translates to "Unable to find the file". I've had success running BuildKit on the same system when running Linux containers. A minimal project that reproduces the error can be found here: test.zip.
PS C:\test> docker version
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:21:34 2018
OS/Arch: windows/amd64
Experimental: false
Server:
Engine:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.24)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:36:40 2018
OS/Arch: windows/amd64
Experimental: true
PS C:\test> ls
Directory: C:\test
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 2018-09-11 15:38 74 Dockerfile
-a---- 2018-09-11 15:39 23 test.txt
PS C:\test> type .\Dockerfile
FROM microsoft/nanoserver:1803
COPY test.txt /test.txt
RUN type test.txt
PS C:\test> $Env:DOCKER_BUILDKIT=0
PS C:\test> docker build -t test .
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM microsoft/nanoserver:1803
---> 693ff1719e39
Step 2/3 : COPY test.txt /test.txt
---> 3cb8bc9e5e2e
Step 3/3 : RUN type test.txt
---> Running in 376f873629fd
This is a test message!Removing intermediate container 376f873629fd
---> 0cce47564a2d
Successfully built 0cce47564a2d
Successfully tagged test:latest
PS C:\test> $Env:DOCKER_BUILDKIT=1
PS C:\test> docker build -t test .
[+] Building 0.2s (2/2) FINISHED
=> local://dockerfile (Dockerfile) 0.1s
=> => transferring dockerfile: 31B 0.0s
=> local://context (.dockerignore) 0.1s
=> => transferring context: 2B 0.0s
failed to read dockerfile: open C:\ProgramData\Docker\tmp\buildkit-mount977689469\Dockerfile: Det går inte att hitta filen.
BuildKit is not supported for Windows containers in Docker 18.06/18.09.
Any plans to support it?
If there is no Windows container support yet, I think the error message needs to be updated to set expectations.
@quangkieu it looks to be described on documentation: https://docs.docker.com/build/buildkit/#getting-started
Only supported for building Linux containers
@olljanat I meant the error message from the build process.
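Until the error message is improved, the expectation discussed above can be approximated on the client side. The following is only a sketch: `buildkit_setting` and `build_env` are hypothetical helper names, and it assumes (as stated in this thread) that BuildKit in Docker 18.06/18.09 only builds Linux containers. The daemon's container OS is read with `docker info --format "{{.OSType}}"`.

```python
import os
import subprocess


def buildkit_setting(daemon_os_type: str) -> str:
    """Return the DOCKER_BUILDKIT value to use for a given daemon OS type."""
    # Assumption from the thread: only Linux containers can be built with BuildKit.
    return "1" if daemon_os_type.strip().lower() == "linux" else "0"


def detect_daemon_os_type() -> str:
    """Ask the Docker daemon which container OS it runs ('linux' or 'windows')."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.OSType}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def build_env(daemon_os_type: str) -> dict:
    """Copy the current environment and set DOCKER_BUILDKIT accordingly."""
    env = dict(os.environ)
    env["DOCKER_BUILDKIT"] = buildkit_setting(daemon_os_type)
    return env
```

A wrapper script could then call `subprocess.run(["docker", "build", "-t", "test", "."], env=build_env(detect_daemon_os_type()))`, so that switching Docker for Windows between Linux and Windows containers does not require changing the environment by hand.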
When is buildkit support coming for windows?
Maybe a better question is what needs to be done/what are the outstanding dependencies?
Has anyone tried using buildctl on Windows via the instructions at https://github.com/moby/buildkit#exploring-dockerfiles with the buildkit daemon running in a container? Looks like that might be an alternative until docker build works properly on Windows?
@Iristyle if you read that doc more carefully, it also says that the buildkitd daemon is only available for Linux currently.
@Barsonax I'm a bit worried that we will never see Windows container support, because there are no Microsoft people contributing to this project. Hopefully I'm wrong.
@olljanat well, I'm using LCOW, which hosts a real Linux kernel - so it's a bit of a grey area (and one that a lot of the Docker folks don't seem to know much about in practical terms). I played around a little and was getting closer to having rootless running per the instructions at https://github.com/moby/buildkit/blob/master/docs/rootless.md#about---oci-worker-no-process-sandbox, noting that --privileged is not supported on Windows at all.
I'll update if I'm able to get it going or hit a dead end.
@Iristyle that is probably possible, but this issue is about real Windows containers, so let's try to keep on topic.
Since last time I looked into this, containerd gained support for Windows 10 1809/Windows Server 2019, so it's possible no MS involvement in buildkit is needed, if it can get everything it needs for the low-level part via its containerd backend.
Edit: A quick look at the build system for buildkit suggests that you need a running buildkit (either locally, or running inside Docker) to build buildkit. I'm somewhat flummoxed by this.
@TBBle hmm. Yeah, there is some info about containerd support at https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/containerd so maybe it is possible.
Then someone can probably try to build buildkitd.exe for Windows to see where it fails. I also guess that the latest Docker binaries with containerd support are needed (more info about that in https://github.com/moby/moby/pull/38541).
Ah, thank you. moby/moby#38541 is the PR reference I was looking for earlier.
Poking through, containerd doesn't seem to publish Windows binaries in their releases despite having the new Windows V2 runtime in their 1.3.0 release, and their AppVeyor build pipeline doesn't capture artifacts.
The required hcsshim project does publish artifacts from their AppVeyor pipeline, even though they don't include them in their releases.
Both have recent-enough releases to meet the criteria laid out in moby/moby#38541 but they both also have active work on master which might make a difference.
containerd currently vendors a specific commit of hcsshim (Microsoft/hcsshim@d2849cbdb9dfe5f513292a9610ca2eb734cdd1e7), binaries for which can be fetched from AppVeyor. For containerd 1.3.2 (Microsoft/hcsshim@9e921883ac929bbe515b39793ece99ce3a9d7706) the binaries are also on AppVeyor but will expire in late February. Both of these vendored versions are older than the current hcsshim release, 0.8.7, whose artifacts are also on AppVeyor.
In the end, it's not clear to me if this ecosystem is yet in a state to start trying to get BuildKit working, and containerd/containerd#1920 (which has not been updated since the switch to the Windows V2 API) gives me a reasonable level of doubt.
Quick correction: Containerd does have nightly builds for Windows, they're at https://github.com/containerd/containerd/actions?query=workflow%3ANightly
So with a bit of hacking I got containerd working on my Windows 10 Desktop (mostly blocked by a bug recently introduced into containerd master Edit: Fix pending in containerd/containerd#3929).
I then did a bunch more hacking on BuildKit, including fixing a couple of bugs, and commenting out a lot of stuff.
Buildkitd ran, and tried to build me a package, but failed because it didn't copy the Dockerfile over.
PS C:\Users\paulh\Documents\BuildKit\simpleDocker> buildctl.exe --debug build --frontend=dockerfile.v0 --local context=. --local dockerfile=.
[+] Building 0.0s (0/0)
time="2020-01-05T07:47:33+11:00" level=debug msg="serving grpc connection"
[+] Building 0.1s (2/2) FINISHED
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 983B 0.0s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
error: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: open C:\Users\paulh\AppData\Local\Temp\buildkit-mount017874163\Dockerfile: The system cannot find the file specified.
failed to solve
github.com/moby/buildkit/client.(*Client).solve.func2
C:/Users/paulh/go/src/github.com/moby/buildkit/client/solve.go:203
github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1
C:/Users/paulh/go/src/github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:57
runtime.goexit
c:/go/src/runtime/asm_amd64.s:1357
I assume this is because I commented out too much, and somehow excluded the code that actually copies things into the snapshots, as both created snapshots were empty despite reporting having transferred stuff. The Dockerfile itself did no transfers from the host OS; it's [MS's trivial Python example](https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/master/windows-container-samples/python/Dockerfile).
PS C:\Users\paulh\Documents\BuildKit\simpleDocker> buildctl.exe --debug du
ID RECLAIMABLE SIZE LAST ACCESSED
x86vuhy70whikjae56p5wsfmo* true 0B
m733jropkh4azwwgoknhowicq* true 0B
Reclaimable: 0B
Total: 0B
PS C:\Users\paulh\Documents\BuildKit\simpleDocker> buildctl.exe --debug prune
ID RECLAIMABLE SIZE LAST ACCESSED
m733jropkh4azwwgoknhowicq* true 0B
x86vuhy70whikjae56p5wsfmo* true 0B
Total: 0B
With #1314, and some more hacking on things, I've gotten to the point where my next failure is coming from inside containerd, or the connection to it.
PS C:\Users\paulh\Documents\BuildKit\supersimpleDocker> buildctl --debug build --frontend=dockerfile.v0 --local context=. --local dockerfile=.
time="2020-01-06T08:03:16+11:00" level=debug msg="serving grpc connection"
[+] Building 4.7s (4/5)
[+] Building 4.7s (5/5) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 588B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for mcr.microsoft.com/windows/servercore:1909 0.2s
=> CACHED [1/2] FROM mcr.microsoft.com/windows/servercore:1909@sha256:12327ccba5d74921479cc95b56e9422278ac3565740c2a46 0.0s
=> => resolve mcr.microsoft.com/windows/servercore:1909@sha256:12327ccba5d74921479cc95b56e9422278ac3565740c2a46359bf0a 0.0s
=> ERROR [2/2] RUN echo Write-Host -ForegroundColor Red Hello > wr.ps1 4.4s
------
> [2/2] RUN echo Write-Host -ForegroundColor Red Hello > wr.ps1:
------
error: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to build LLB: executor failed running [powershell -command echo Write-Host -ForegroundColor Red Hello > wr.ps1]: failure waiting for process: rpc error: code = Unknown desc = ttrpc: closed: unknown
failed to solve
github.com/moby/buildkit/client.(*Client).solve.func2
C:/Users/paulh/go/src/github.com/moby/buildkit/client/solve.go:203
github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1
C:/Users/paulh/go/src/github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:57
runtime.goexit
c:/go/src/runtime/asm_amd64.s:1357
I've pushed one commit that needs more work (breaks the auto tests) plus my hacks onto https://github.com/TBBle/buildkit/tree/hacks_ahoy, in case anyone else wants to play with this.
For reference, I was working with source from containerd/containerd#3929, to fix a blocking bug, and Microsoft/hcsshim#749, to let me build without gcc. For hcsshim, had I not been instrumenting the source, I could have used the nightly binary build of the containerd shim, and I'm planning to suggest/submit that their releases include pushing a container for the container managed /opt feature, which would avoid hunting down binaries and adding them to the $PATH. (Edit: Microsoft/hcsshim#750)
The failure I hit in my previous run turned out to be a bug in hcsshim, for which I have posted a fix at microsoft/hcsshim#752.
So now I am able to build a trivial Dockerfile. So trivial it's pointless, except that it worked.
FROM mcr.microsoft.com/windows/servercore:1909
LABEL Description="Built with BuildKit!"
SHELL ["powershell", "-command"]
RUN echo Write-Host -ForegroundColor Red Hello > wr.ps1
CMD ["powershell", ".\\wr.ps1"]
I don't know yet if my containers do not have networking set up properly due to my Buildkit spec-generation hacks, or some other aspect of my setup unrelated to Buildkit.
As well as networking issues, filesystem commands do not function on Windows due to an assertion about idmapping support.
I was worried about API issues, so I had vendored containerd master into buildkit, and hcsshim master into containerd. However, I suspect that this wasn't necessary, and I'll back those out next time I look at this.
I've rebased https://github.com/TBBle/buildkit/tree/hacks_ahoy to the current version of #1314, so it should be relatively easy for anyone who wants to try this out, and perhaps try and turn some of my hacks into further valuable commits.
@TBBle cool to see someone tackling this. Does your fork handle the alternative <pathOfDockerfile>.dockerignore path for .dockerignore files? That is pretty much the only thing I miss for the moment.
It probably doesn't, but only because all the file-copy APIs in BuildKit fail an assertion on Windows related to permissions support.
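For readers unfamiliar with the feature being asked about, the lookup can be sketched roughly as follows. This is an illustration only, not BuildKit's actual code: `pick_ignore_file` is a hypothetical helper, and it assumes that an ignore file named after the Dockerfile (for example `Dockerfile.dockerignore` next to `Dockerfile`) takes precedence over a `.dockerignore` in the root of the build context.

```python
from pathlib import Path
from typing import Optional


def pick_ignore_file(context_dir: str, dockerfile: str) -> Optional[Path]:
    """Return the ignore file that applies to a build, or None if there is none."""
    # Assumed convention: <path of Dockerfile>.dockerignore wins if it exists.
    specific = Path(dockerfile + ".dockerignore")
    if specific.is_file():
        return specific
    # Otherwise fall back to .dockerignore at the root of the build context.
    fallback = Path(context_dir) / ".dockerignore"
    return fallback if fallback.is_file() else None
```

With a context containing `build/win.Dockerfile` and `build/win.Dockerfile.dockerignore`, `pick_ignore_file(".", "build/win.Dockerfile")` would return the Dockerfile-specific file even if a top-level `.dockerignore` also exists.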
I really should get back to this, it got jammed up behind questions about containerd 1.2 support, and then other stuff came up.
There is an issue logged on Microsoft Windows Containers repo https://github.com/microsoft/Windows-Containers/issues/34
Now I'm looking at this again, I realise I previously only tested building into the buildkit cache.
Outputting also does not work:
- image, oci, and docker outputs all failed calculating diff pairs due to something not being implemented in containerd. Not sure if this is actually a missing feature, or we just need to use a different containerd API on Windows, like in the mounting. Edit: Looks like a containerd missing feature: https://github.com/containerd/containerd/issues/4394
- tar and local outputs just capture the sandbox.vhdx for the top layer (an internal detail of the HCS) rather than the contents of the image, as one would expect. Probably related to assumptions around the mount behaviour, which I'm already working around in the container-mounting support.
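A quick way to spot the second symptom in an exported archive is to look at its entry names. The sketch below is an assumption-laden illustration: `looks_like_raw_sandbox` is a hypothetical helper, and it only checks whether every entry is the `sandbox.vhdx` file mentioned above rather than real image contents.

```python
import posixpath
from typing import Iterable


def looks_like_raw_sandbox(entry_names: Iterable[str]) -> bool:
    """True if an export contains nothing but the HCS sandbox.vhdx file."""
    # Normalise separators and drop "./" prefixes so different tar writers compare equal.
    names = [posixpath.normpath(n.replace("\\", "/")).lstrip("/") for n in entry_names]
    # Ignore entries that only describe the archive root.
    names = [n for n in names if n != "."]
    return bool(names) and all(n.lower() == "sandbox.vhdx" for n in names)
```

For a tar output, the names could come from `tarfile.open(path).getnames()`; for a local output, from listing the destination directory.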
I got image, oci, and docker outputs working in containerd in https://github.com/containerd/containerd/pull/4399, so I can now run the (trivial) images I build. So then back to working out how to do non-trivial things in the build script, next week. With a bit of luck I'm now free of any further containerd issues or unimplemented features.
FROM mcr.microsoft.com/windows/servercore:2004
LABEL Description="Built with BuildKit!"
SHELL ["powershell", "-command"]
ENTRYPOINT ["powershell"]
RUN echo "Write-Host -ForegroundColor DarkGreen Hello World" > C:/wr.ps1
CMD ["-command", "C:/wr.ps1"]
buildctl build --frontend dockerfile.v0 --local context=. --local dockerfile=. --output type=image,name=supersimpledocker,oci-mediatypes=true
ctr --namespace buildkit run --rm --tty supersimpledocker tm1
Small progress report. I now have networking functional for the containerd worker under Windows. It's a minor hassle to set up using BuildKit and containerd directly (as you have to source and configure a CNI plugin yourself, and the Windows CNI landscape is... rough), but Docker provides its own managed network stack to use with BuildKit, so once someone implements the Docker side of the Buildkit integration, it won't be any more hassle than networking under any other setup.
No containerd changes this time, as containerd happily uses whatever CNI setup you pass it.
I now have the below functioning, see #1585 for details.
FROM mcr.microsoft.com/windows/servercore:2004
LABEL Description="Python" Vendor="Python Software Foundation" Version="3.7.3"
RUN powershell.exe -Command \
$ErrorActionPreference = 'Stop'; \
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12; \
wget https://www.python.org/ftp/python/3.7.3/python-3.7.3.exe -OutFile c:\python-3.7.3.exe ; \
Start-Process c:\python-3.7.3.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -Wait ; \
Remove-Item c:\python-3.7.3.exe -Force
It occurs to me... I'm only testing with the containerd backend. Is there any interest in the runc executor working (using runhcs)? I feel like there's a movement away from using runhcs, and I'm not totally sure that this would avoid the use of containerd anyway, as things like the layer differ go through it; I haven't looked at what the runc executor does in this case.
@TBBle Ideally both would work like on Linux, but one is not a requirement for the other. It seems to me that a worker that doesn't depend on containerd would be even simpler to get working. We should still reuse as much containerd code as possible and avoid duplication. For the differ, this is what the Linux side does as well: it still uses the containerd differ, but it uses the library vendored into buildkit directly instead of the gRPC API to the containerd daemon.
@TBBle we should also probably prioritize getting some CI running. It is quite hard for all of the current maintainers to actually test any of these changes. It is fine if most of the current test suite doesn't pass. We can start with some basics like the example you had above. I'm not quite sure how well the CI workers support WCOW. Eventually, we probably want to switch from Travis to GitHub Actions, but we have some build-cache logic that can't be transferred very easily, so it will take time. If GitHub Actions supports what is needed for this, we could initially do something special there for Windows only.
The main blocker (my last remaining hack) for bringing this up in CI is refactoring GenerateSpec to not add any Linux elements to the spec, as that triggers LCOW mode.
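The idea behind that refactoring can be illustrated with a small sketch. It is not the real GenerateSpec, which is Go code operating on the OCI runtime-spec structures; here the spec is modelled as a plain dictionary shaped like an OCI `config.json`, `spec_for_platform` is a hypothetical name, and the premise that a populated `linux` section triggers LCOW mode is taken from the comment above.

```python
import copy


def spec_for_platform(spec: dict, target_os: str) -> dict:
    """Return a copy of an OCI-style spec without Linux-only settings for Windows targets."""
    out = copy.deepcopy(spec)  # never mutate the caller's spec
    if target_os == "windows":
        # Per the thread, any Linux elements in the spec switch the runtime to LCOW,
        # so a Windows container spec must not carry them at all.
        out.pop("linux", None)
    return out


example = {
    "process": {"args": ["powershell", "-command", "echo hi"]},
    "linux": {"namespaces": [{"type": "mount"}]},
}
windows_spec = spec_for_platform(example, "windows")
```

In the real code the cleaner fix is presumably to never add the Linux fields in the first place, rather than stripping them afterwards as this sketch does.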
That's my next task anyway, since that's the last change in my "hacks_ahoy" branch. Once that's in-place, I plan to start trying out the various tests on CI and see which pass. There's still an unmeasured pile of work to make the in-build filesystem support work (I know it currently fails due to rejecting attempts to set permissions), but hopefully I can identify a subset of the tests that can pass.
A problem with using the vendored containerd for client-side diffing in the runc executor is that the vendored containerd is 1.3, which doesn't support diffing windows-layers. That code is only in a PR I have open against containerd master. I'm hoping it'll land in time for containerd 1.4 to be branched, although the beta series has already started and I don't know how much risk containerd will wear between betas.
I see BuildKit has a filesystem-only differ for windows-layers used on non-Windows platforms; I'm not sure whether it is a viable alternative to the hcs-based tar streaming used on Windows in the meantime, as I haven't looked closely at what differences it might have, c.f. https://github.com/containerd/containerd/pull/4399#issuecomment-660283335
@TBBle The vendored containerd does not need to be a stable release. We mostly vendor master to get the latest fixes. For the differ, I doubt the current windows-layers thing is usable. It is just for handling the different tar format (Windows has parent Hives/Files directories). Opened an issue to support it natively in https://github.com/containerd/containerd/issues/2469 as well so we don't need a hack. It would be nice if we could do the opposite as well (build Linux layers on Windows) but that is not a priority atm of course.
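To make the tar-format difference concrete, here is a minimal sketch of the path mapping involved. It assumes only what is described above, namely that a Windows layer tar nests filesystem content under a top-level `Files` directory and registry data under `Hives`; `to_plain_path` and the entry names in the usage note are hypothetical, and this is not the code BuildKit or containerd actually uses.

```python
from typing import Optional


def to_plain_path(layer_entry: str) -> Optional[str]:
    """Map a Windows-layer tar entry to its path inside the image filesystem.

    Returns None for entries that are not filesystem content (e.g. under Hives/).
    """
    parts = layer_entry.replace("\\", "/").strip("/").split("/", 1)
    if parts[0] == "Files":
        # "Files" itself maps to the filesystem root, represented here as "".
        return parts[1] if len(parts) > 1 else ""
    return None
```

For example, `to_plain_path("Files/Windows/System32/cmd.exe")` gives `"Windows/System32/cmd.exe"`, while an entry such as `"Hives/System_Delta"` gives `None` because it is registry data rather than a file in the image.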