ue4-docker
Stuck on => [builder 13/20] RUN ./Engine/Build/BatchFiles/RunUAT.sh BuildGraph -target="Make Installed Build Linux" -script=Engine/Build/InstalledEngineBuild.xml -set:HostPlatformOnly=true
Output of the ue4-docker info command:
Me@My-MBP ~ % ue4-docker info
/Library/Python/3.9/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
warnings.warn(
ue4-docker version: 0.0.111 (latest available version is 0.0.111)
Operating system: macOS 14.2.1 (Kernel Version 23.2.0)
Docker daemon version: 24.0.7
NVIDIA Docker supported: No
Maximum image size: No limit detected
Available disk space: Unknown (typically means the Docker daemon is running in a Moby VM, e.g. Docker Desktop)
Total system memory: 128 GiB physical, 1 GiB virtual
CPU: 16 physical, 16 logical (arm)
Additional details:
- Are you accessing the network through a proxy server? No
- Full Command:
ue4-docker build custom:UE532v0.0.1 -repo=https://github.com/dev-fredericfox/UnrealEngine_release.git -branch=main -username=dev-fredericfox -password=ghp_REDACTED --exclude templates --monitor
The RUN ./Engine/Build/BatchFiles/RunUAT.sh BuildGraph -target="Make Installed Build Linux" step stops making progress at a seemingly random point, usually between 200/3994 and 600/3994, at no consistent location and with no error message. The process keeps running but never advances.
Example 01 Stuck at 234:
[+] Building 1070.9s (18/33) docker:desktop-linux
=> [internal] load .dockerignore 0.0s
=> => transferring context: 53B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 8.11kB 0.0s
=> [internal] load metadata for docker.io/adamrehn/ue4-build-prerequisites:opengl-ubuntu22.04 0.0s
=> [internal] load metadata for docker.io/adamrehn/ue4-source:wyrdue532v0.0.1-opengl-ubuntu22.04 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 307B 0.0s
=> CACHED [stage-1 1/8] FROM docker.io/adamrehn/ue4-build-prerequisites:opengl-ubuntu22.04 0.0s
=> [builder 1/20] FROM docker.io/adamrehn/ue4-source:wyrdue532v0.0.1-opengl-ubuntu22.04 0.0s
=> CACHED [builder 2/20] COPY set-changelist.py /tmp/set-changelist.py 0.0s
=> CACHED [builder 3/20] RUN python3 /tmp/set-changelist.py /home/ue4/UnrealEngine/Engine/Build/Build.version $CHANGELIST && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer to d 0.0s
=> CACHED [builder 4/20] RUN rm -rf /home/ue4/UnrealEngine/.git && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer to disk.' && echo 'Note that for large filesystem layers this 0.0s
=> CACHED [builder 5/20] COPY enable-opengl.py /tmp/enable-opengl.py 0.0s
=> CACHED [builder 6/20] RUN python3 /tmp/enable-opengl.py /home/ue4/UnrealEngine/Engine/Config/BaseEngine.ini && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer to disk.' && ec 0.0s
=> CACHED [builder 7/20] COPY patch-filters-xml.py /tmp/patch-filters-xml.py 0.0s
=> CACHED [builder 8/20] RUN python3 /tmp/patch-filters-xml.py /home/ue4/UnrealEngine/Engine/Build/InstalledEngineFilters.xml && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer 0.0s
=> CACHED [builder 9/20] COPY patch-build-graph.py /tmp/patch-build-graph.py 0.0s
=> CACHED [builder 10/20] RUN python3 /tmp/patch-build-graph.py /home/ue4/UnrealEngine/Engine/Build/InstalledEngineBuild.xml /home/ue4/UnrealEngine/Engine/Build/Build.version && echo '' && echo 'RUN directive comple 0.0s
=> CACHED [builder 11/20] RUN ./Engine/Build/BatchFiles/Linux/Build.sh ShaderCompileWorker Linux Development -SkipBuild -buildubt && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem lay 0.0s
=> CACHED [builder 12/20] WORKDIR /home/ue4/UnrealEngine 0.0s
=> [builder 13/20] RUN ./Engine/Build/BatchFiles/RunUAT.sh BuildGraph -target="Make Installed Build Linux" -script=Engine/Build/InstalledEngineBuild.xml -set:HostPlatformOnly=true -set:WithDDC=tru 1070.9s
=> => # [229/3994] Compile Module.Chaos.3.cpp
=> => # [230/3994] Link (lld) libUnrealEditor-TextureBuildUtilities.so
=> => # [231/3994] Compile Module.Chaos.10.cpp
=> => # [232/3994] Compile Module.AppFramework.3.cpp
=> => # [233/3994] Compile Module.OpenColorIOWrapper.cpp
=> => # [234/3994] Link (lld) libUnrealEditor-OpenColorIOWrapper.so
Example 02 Stuck at 476:
[+] Building 1592.6s (18/33) docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 8.11kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 53B 0.0s
=> [internal] load metadata for docker.io/adamrehn/ue4-build-prerequisites:opengl-ubuntu22.04 0.0s
=> [internal] load metadata for docker.io/adamrehn/ue4-source:wyrdue532v0.0.1-opengl-ubuntu22.04 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 307B 0.0s
=> CACHED [stage-1 1/8] FROM docker.io/adamrehn/ue4-build-prerequisites:opengl-ubuntu22.04 0.0s
=> [builder 1/20] FROM docker.io/adamrehn/ue4-source:wyrdue532v0.0.1-opengl-ubuntu22.04 0.0s
=> CACHED [builder 2/20] COPY set-changelist.py /tmp/set-changelist.py 0.0s
=> CACHED [builder 3/20] RUN python3 /tmp/set-changelist.py /home/ue4/UnrealEngine/Engine/Build/Build.version $CHANGELIST && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer to d 0.0s
=> CACHED [builder 4/20] RUN rm -rf /home/ue4/UnrealEngine/.git && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer to disk.' && echo 'Note that for large filesystem layers this 0.0s
=> CACHED [builder 5/20] COPY enable-opengl.py /tmp/enable-opengl.py 0.0s
=> CACHED [builder 6/20] RUN python3 /tmp/enable-opengl.py /home/ue4/UnrealEngine/Engine/Config/BaseEngine.ini && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer to disk.' && ec 0.0s
=> CACHED [builder 7/20] COPY patch-filters-xml.py /tmp/patch-filters-xml.py 0.0s
=> CACHED [builder 8/20] RUN python3 /tmp/patch-filters-xml.py /home/ue4/UnrealEngine/Engine/Build/InstalledEngineFilters.xml && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem layer 0.0s
=> CACHED [builder 9/20] COPY patch-build-graph.py /tmp/patch-build-graph.py 0.0s
=> CACHED [builder 10/20] RUN python3 /tmp/patch-build-graph.py /home/ue4/UnrealEngine/Engine/Build/InstalledEngineBuild.xml /home/ue4/UnrealEngine/Engine/Build/Build.version && echo '' && echo 'RUN directive comple 0.0s
=> CACHED [builder 11/20] RUN ./Engine/Build/BatchFiles/Linux/Build.sh ShaderCompileWorker Linux Development -SkipBuild -buildubt && echo '' && echo 'RUN directive complete. Docker will now commit the filesystem lay 0.0s
=> CACHED [builder 12/20] WORKDIR /home/ue4/UnrealEngine 0.0s
=> [builder 13/20] RUN ./Engine/Build/BatchFiles/RunUAT.sh BuildGraph -target="Make Installed Build Linux" -script=Engine/Build/InstalledEngineBuild.xml -set:HostPlatformOnly=true -set:WithDDC=tru 1592.6s
=> => # [471/3994] Compile Module.Engine.18.cpp
=> => # [472/3994] Compile Module.Engine.59.cpp
=> => # [473/3994] Compile Module.Engine.12.cpp
=> => # [474/3994] Compile Module.Engine.15.cpp
=> => # [475/3994] Compile Module.Engine.20.cpp
=> => # [476/3994] Compile Module.Engine.65.cpp
How much RAM/CPUs is allocated to Docker VM?
50gb/16 CPUs/880gb disk
One of the issues seems to be related to multithreading, although this is really not my area of expertise. When I reduce the Docker VM to 1 CPU, the build no longer gets stuck in the compile phase (although it then takes several days). However, the output is now clipped, so I am not sure how to keep debugging.
Output when running with only 1 CPU:
=> [builder 13/20] RUN ./Engine/Build/BatchFiles/RunUAT.sh BuildGraph -target="Make Installed Build Linux" -script=Engine/Build/InstalledEngin 235804.3s
=> => # LogShaderCompilers: Display: TBasePassPSFNoLightMapPolicySkylight - 5.17% of total time (compiled 38 times, average 77.80 sec,
=> => # max 236.41 sec, min 44.25 sec)
=> => # LogShaderCompilers: Display: TBasePassPSFPrecomputedVolumetricLightmapLightingPolicySkylight - 4.47% of total time (compiled 32 times, average 79.90 se
=> => # c, max 239.03 sec, min 51.85 sec)
=> => # Log
=> => # [output clipped, log limit 2MiB reached]
Given that you have plenty of RAM, it might possibly be easier to spin up a Linux VM and run ue4-docker inside it.
When running with only one CPU (which is also the default when building under Hyper-V isolation on Windows), you may be bitten by an issue in the UE build system: the build management system keeps a whole CPU core busy checking for progress from the shader compiler processes, so the shader compiler processes themselves get little to no CPU time and don't progress, leading to an unexpected tens-of-hours build.
I only found this in UE4, I sent them a bug report, but I don't recall them accepting my fix (a sleep in the loop checking for Shader Compiler progress) or otherwise addressing it; a single-core development environment is not supported after all. In a multi-core environment, I couldn't demonstrate build-time improvement from my fix either, which surprised me.
But that sounds like what you hit here in your single-CPU attempt. So maybe try with two cores, see if that avoids the compile hang and also the shader compiler issue.
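To illustrate the kind of fix mentioned above (a sleep in the loop that checks for shader compiler progress), here is a minimal generic sketch. It is not Unreal's code, which is C++; the function and parameter names are hypothetical, and it only shows why a polling loop with no sleep can monopolise a core that the workers also need.

```python
import time


def wait_for_workers(all_done, poll_interval=0.05, sleep=time.sleep):
    """Poll all_done() until it returns True; return the number of polls.

    With poll_interval=0 this degenerates into a busy-wait that keeps a
    core fully occupied. A small sleep yields the CPU between checks so
    that worker processes sharing the core can make progress.
    """
    polls = 0
    while True:
        polls += 1
        if all_done():
            return polls
        if poll_interval > 0:
            sleep(poll_interval)


if __name__ == "__main__":
    # Hypothetical stand-in for "workers finished": true after 0.5 seconds.
    deadline = time.monotonic() + 0.5
    polls = wait_for_workers(lambda: time.monotonic() >= deadline)
    print(f"finished after {polls} polls")
```

On a single core, the sleeping variant leaves nearly all CPU time to the workers, whereas the busy-waiting variant competes with them for every time slice.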
I see. I tested with 2 CPU cores and sadly it gets stuck fairly early in the process.
Sometimes when I cancel the process after being stuck I notice this error message (not always). Could be related?
That error is Python seeing a Control-C in a thread, presumably because you're hitting Control-C to cancel the process. I don't believe it's related.
@slonopotamus
Given that you have plenty of RAM, it might possibly be easier to spin up a Linux VM and run ue4-docker inside it.
Trying that right now, but first tests show that even in a UTM VM (Debian 12 Rosetta Virtualization) it still gets stuck. I will try an emulation later, but the performance is going to be horrendous.
Emulation seems to be broken as well, or at least it's unreasonable to use. It has been "stuck" without progress on step [builder 11/20] for the past 8 hours, fans on full blast. Are people running this package primarily on Windows? Why does this seem to affect only me?
We either run natively on Windows or on Linux.
But only on amd64, or does it also work on Linux ARM?
I think I found a "sort of" workaround for now.
When step 13 fails, I start a container with docker run -it and run the commands of step 13 manually from inside it. When the compile freezes I kill the tasks, and since the build is incremental, I just relaunch it. Looks to be working for now.
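The kill-and-relaunch routine above could be automated with a small watchdog. The following is a sketch under assumptions: the function name, the timeout, and the retry limit are hypothetical, and it treats "no output for a while" as a stall, which may misfire on long silent compile steps.

```python
import subprocess
import sys
import threading
import time


def _pump(stream, last_output):
    # Echo each line and record the time it arrived.
    for line in stream:
        last_output[0] = time.monotonic()
        sys.stdout.write(line)


def run_with_stall_watchdog(cmd, stall_timeout, max_attempts=5):
    """Run cmd; if it prints nothing for stall_timeout seconds, kill it
    and start it again. Returns the number of attempts that were needed.

    A non-zero exit that is not caused by the watchdog is treated as a
    real build failure and raised instead of retried.
    """
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        last_output = [time.monotonic()]
        reader = threading.Thread(
            target=_pump, args=(proc.stdout, last_output), daemon=True
        )
        reader.start()

        stalled = False
        while proc.poll() is None:
            if time.monotonic() - last_output[0] > stall_timeout:
                stalled = True
                # Note: this kills only the direct child; a real build may
                # need its whole process group terminated.
                proc.kill()
                break
            time.sleep(0.1)
        proc.wait()
        reader.join(timeout=5)

        if not stalled:
            if proc.returncode != 0:
                raise subprocess.CalledProcessError(proc.returncode, cmd)
            return attempt
    raise RuntimeError(f"command stalled {max_attempts} times in a row")
```

Inside the container, cmd would be the RunUAT.sh invocation from step 13, with a generous stall_timeout. Relaunching is only useful because the build picks up where it left off.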
Only question is: How do I commit this stage manually to the layer to proceed to step 14? I could do a docker commit but AFAIK this creates a new container, how will the ue4-docker script know to look for the container with the manually committed changes?
Any input appreciated!
The only way you could use docker commit and then continue the image build from there would be to change the Dockerfile to have a FROM for the image that the commit creates at that point. It seems like a lot of hassle.
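As a rough sketch of what that hassle would involve: docker commit turns the manually built container into an image, and a trimmed Dockerfile then starts FROM that image and carries on with the remaining steps. The helper names and the image tag below are hypothetical, and the remaining instructions would have to be copied by hand from the Dockerfile that ue4-docker generated; this is a manual workaround, not a ue4-docker feature.

```python
def commit_command(container_id, image_tag):
    """Build the argv for `docker commit CONTAINER IMAGE`."""
    return ["docker", "commit", container_id, image_tag]


def resume_dockerfile(image_tag, remaining_instructions, stage_name="builder"):
    """Return Dockerfile text that resumes a build from a committed image.

    stage_name defaults to "builder" because the build log labels the
    stage that way; later stages that copy from it refer to that name.
    """
    lines = [f"FROM {image_tag} AS {stage_name}"]
    lines.extend(remaining_instructions)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical container ID and tag, for illustration only.
    print(" ".join(commit_command("abc123", "local/ue-step13:manual")))
    print(resume_dockerfile(
        "local/ue-step13:manual",
        ["# ...instructions for steps 14 to 20 copied from the original..."],
    ))
```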
Can you use docker exec to inspect the hung container build stage with top or similar? (I honestly don't remember if you can do that...) I kind-of suspect this is an Unreal-level bug, some kind of shared resource or busy-wait that's deadlocking. If your CPU load is causing your fans to run, then it thinks it's doing something, and as I mentioned earlier, I know of at least one busy-wait that used to exist in the system, and may still do.
(Actually, you can use top from outside the container to inspect the processes inside it, but I believe the defaults hide processes in different PID namespaces...)
Oh, right, you can reproduce this in a docker run, so you can definitely docker exec in and use top to inspect that state.
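A minimal sketch of that inspection, using ps instead of top because its output is easier to parse. It assumes ps from procps is available in the container; the function names and the CPU threshold are hypothetical. Note that procps reports %CPU as an average over the lifetime of each process, so top remains the better tool for an instantaneous view.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -eo pid,pcpu,comm` output into (pid, cpu_percent, command)."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, pcpu, comm = parts
        rows.append((int(pid), float(pcpu), comm.strip()))
    return rows


def busiest(rows, min_cpu=50.0):
    """Return rows at or above min_cpu percent, highest CPU first."""
    return sorted((r for r in rows if r[1] >= min_cpu), key=lambda r: -r[1])


def snapshot(container):
    """Run ps inside a running container via docker exec and parse it."""
    result = subprocess.run(
        ["docker", "exec", container, "ps", "-eo", "pid,pcpu,comm"],
        check=True, capture_output=True, text=True,
    )
    return parse_ps(result.stdout)
```

Calling busiest(snapshot(name)) a few times while the build is hung would show whether the same processes sit at high CPU without the build counter advancing.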
My guess is that you've got all your cores busy-waiting, and no actual build processes are advancing. The fact that the two-core run gets stuck early suggests that this is the case: the build accumulates more busy-waiters over time, until they happen to fill all the available cores simultaneously. If that turns out to be the case, it may be possible to renice the busy-waiters from outside the container, in order to get the build to resume progress. That'll be a little fiddly, but less so than trying to inject the manually-built container into the build workflow.
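The renice step could look roughly like the sketch below. The helper names are hypothetical, and it assumes you already have the PIDs as seen from wherever renice runs (for example from docker top on the machine that hosts the containers). With Docker Desktop the processes live inside its Linux VM, so the command would have to be run there rather than on macOS itself.

```python
import subprocess


def renice_command(pids, niceness=19):
    """Build the argv for `renice -n NICENESS -p PID...`.

    A niceness of 19 is the lowest scheduling priority, so the suspected
    busy-waiters yield to processes doing real work.
    """
    if not pids:
        raise ValueError("no PIDs given")
    if not -20 <= niceness <= 19:
        raise ValueError("niceness must be between -20 and 19")
    return ["renice", "-n", str(niceness), "-p", *[str(int(p)) for p in pids]]


def renice(pids, niceness=19):
    """Run renice; raises CalledProcessError if it fails."""
    subprocess.run(renice_command(pids, niceness), check=True)
```

Lowering priority only helps if the busy-waiters are separate processes from the compilers and linkers; if the spinning happens inside the same process that does the work, renice slows both.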