continuous-integration
                                
Unexpected failures on Windows runners
Initially noticed on https://github.com/bazelbuild/rules_rust/pull/879#issuecomment-894896993
Did something change about the Windows runners recently? As you can see in that comment, build 4029 succeeded but build 4046 failed, even though no changes had been made to that branch in between.
cc @hlopko for visibility
Hi,
yes, I did an upgrade of our VMs and containers over the weekend - for Windows I summarize the changes here: https://github.com/bazelbuild/bazel/issues/13816
We've seen a couple of issues with the new image and are currently fixing broken tests etc. across CI.
That failure is strange. We didn’t change anything about Visual Studio except the version of the Windows SDK in the new images. :/
Oh… maybe the failing test is accidentally using a "link" executable from MSYS while you're expecting link.exe from Visual Studio?
@meteorcloudy Could we remove the mingw32 compiler, gcc, etc. from MSYS2? Why do we need / install them when we want to build with Visual Studio on Windows? (Not sure that that’s where the /usr/bin/link is coming from, but it’s my guess)
Oh no, link.exe actually comes from https://packages.msys2.org/package/coreutils?repo=msys&variant=x86_64 … :/
Yeah, /usr/bin/link should have existed even before the VM upgrade, so something else must have caused the build to find the Unix link instead of the one from the VC++ build tools.
This is how rust finds the link.exe binary: https://github.com/rust-lang/rust/blob/eaf6f463599df1f18da94a6965e216ea15795417/compiler/rustc_codegen_ssa/src/back/link.rs#L851
I'm not very familiar with Rust and cannot figure out what went wrong. @UebelAndre, can you help take a look?
I may have some time toward the end of the day today to take a closer look but otherwise don't have too much free time this week so can't commit to too much support 😞
Basically this API doesn't work correctly: https://docs.rs/cc/1.0.29/cc/windows_registry/fn.find_tool.html
I wonder if the registry is somehow messed up?
rules_rust takes the linker from the C++ toolchain; we don't use the rustc defaults. This is where this happens: https://github.com/bazelbuild/rules_rust/blob/main/rust/private/rustc.bzl#L229.
I'm hopeless when it comes to Windows, but to me it seems the error is not directly Rust related. @meteorcloudy, does the invocation look reasonable to you?
"link.exe" "/NOLOGO" "C:\\temp\\rustdoctest1qTZce\\rust_out.rust_out.7rcbfp3g-cgu.0.rcgu.o" "C:\\temp\\rustdoctest1qTZce\\rust_out.33dyzt1ekirinwy8.rcgu.o" "/LIBPATH:C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "C:\\b\\yzf2u4jt\\execroot\\rules_rust\\bazel-out\\x64_windows-fastbuild\\bin\\external\\examples\\fibonacci\\libfibonacci--608111072.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libstd-3d786a338e3fbd3c.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libpanic_unwind-c7722f94ca812e0f.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libstd_detect-f6ac1aae8e3d5b95.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_demangle-8244d5c29082f380.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libhashbrown-c29ed8b388a545d6.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_std_workspace_alloc-daec0207219073db.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libunwind-e1164c8529217a2a.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcfg_if-78991d36592dccee.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liblibc-3e2bb97c5be118b7.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liballoc-d5bd6400adb9fa95.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_std_workspace_core-07dcecfd1f459221.rlib" "C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcore-f0c150dc0abba70a.rlib" 
"C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcompiler_builtins-0f3806ca1d72c7be.rlib" "kernel32.lib" "ws2_32.lib" "advapi32.lib" "userenv.lib" "kernel32.lib" "msvcrt.lib" "/NXCOMPAT" "/LIBPATH:C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "/OUT:C:\\temp\\rustdoctest1qTZce\\rust_out" "/OPT:REF,NOICF" "/DEBUG" "/NATVIS:C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\etc\\intrinsic.natvis" "/NATVIS:C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\etc\\liballoc.natvis" "/NATVIS:C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\etc\\libcore.natvis" "/NATVIS:C:\\b\\yzf2u4jt\\external\\rust_windows_x86_64\\lib\\rustlib\\etc\\libstd.natvis"
We also take linker flags from the C++ toolchain, but we construct flags for libraries to link ourselves here: https://github.com/bazelbuild/rules_rust/blob/main/rust/private/rustc.bzl#L879. Is that logic correct?
What would the error look like if there were undefined symbols or cyclic dependencies between libraries? Could we be hitting that problem? Could this be related to https://github.com/bazelbuild/rules_rust/issues/637? One way to debug this further would be to replace rust_binary with a cc_binary:
- don't use rust_binary/rust_test for the test, use rust_library
- add a cc_binary that depends on the rust_library
- set crate_root in the rust_library to point to the main.rs
- in main.rs, replace fn main() {...} with
#[no_mangle]
extern "C" fn main() {...}
If ^ works when C++ constructs the linking action, we can diff the command line and see further where things went wrong.
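For the "diff the command line" step, a small helper along these lines might be enough. This is only a sketch: the first command is a shortened form of the failing invocation quoted above, while the second one (including the `C:\VS\link.exe` path and the `/OPT:REF` flag) is invented for illustration and is not what a real cc_binary action emits.

```python
import shlex


def diff_link_args(first_cmd, second_cmd):
    """Return (args only in first_cmd, args only in second_cmd), in order.

    shlex.split in its default POSIX mode strips the double quotes and
    turns the doubled backslashes from the log output into single ones.
    """
    first = shlex.split(first_cmd)
    second = shlex.split(second_cmd)
    only_first = [arg for arg in first if arg not in second]
    only_second = [arg for arg in second if arg not in first]
    return only_first, only_second


# Shortened version of the rustdoc link invocation from this thread.
rust_cmd = r'"link.exe" "/NOLOGO" "main.o" "kernel32.lib" "/OPT:REF,NOICF"'
# Hypothetical C++ link command line, made up for this example.
cc_cmd = r'"C:\\VS\\link.exe" "/NOLOGO" "main.o" "kernel32.lib" "/OPT:REF"'

only_rust, only_cc = diff_link_args(rust_cmd, cc_cmd)
print("only in rustc invocation:", only_rust)
print("only in cc_binary invocation:", only_cc)
```

The comparison ignores argument order and repeated arguments on purpose, since the first thing to look for here is which linker binary and which flags each side uses.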
Interesting, thanks Marcel!
We can see in the logs that the "outer" Bazel can successfully build Rust files (at least there are a few messages that suggest that, like "(01:17:50) INFO: From Compiling Rust rlib libc (59 files): [...]" and "Compiling Rust bin cargo_build_script_runner (1 files); 1s local, remote-cache").
But then the "inner" Bazel in the (shell) tests fails when building Rust.
My guess is that inside the shell tests the PATH is wrong and puts MSYS2's /usr/bin before the other directories. Since the command line does not call link.exe using an absolute path, it picks up MSYS2's binary instead of the one provided by Visual Studio.
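To illustrate the PATH-ordering guess, here is a minimal sketch of how a bare "link.exe" would resolve under two different PATH orders. The directory names are hypothetical and the file check is faked with a set, so this does not reflect the actual layout of the CI image.

```python
import ntpath


def find_all_on_path(name, path_value, is_file, sep=";"):
    """Return every PATH candidate for `name`, in lookup order.

    `is_file` is injected so the sketch runs without touching the real
    filesystem; pass os.path.isfile to inspect an actual machine.
    """
    hits = []
    for directory in path_value.split(sep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = ntpath.join(directory, name)
        if is_file(candidate):
            hits.append(candidate)
    return hits


# Hypothetical install locations, for illustration only.
MSYS_BIN = r"C:\tools\msys64\usr\bin"
MSVC_BIN = r"C:\VS\VC\Tools\MSVC\bin\Hostx64\x64"
present = {ntpath.join(MSYS_BIN, "link.exe"), ntpath.join(MSVC_BIN, "link.exe")}

msys_first = find_all_on_path("link.exe", MSYS_BIN + ";" + MSVC_BIN, present.__contains__)
msvc_first = find_all_on_path("link.exe", MSVC_BIN + ";" + MSYS_BIN, present.__contains__)

# The first hit is the one a bare "link.exe" would resolve to.
print("MSYS2 first:", msys_first[0])
print("MSVC first: ", msvc_first[0])
```

On a real runner, something like `find_all_on_path("link.exe", os.environ["PATH"], os.path.isfile, os.pathsep)` inside the failing test would show whether the coreutils link shadows the Visual Studio one.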
Oh wow, I've never looked into the implementation of rust_doc_test; I didn't know about the shell script inside. Still, there doesn't seem to be an outer/inner Bazel, just potentially slightly different compilation actions.
@UebelAndre, could you check whether regular rustc actions and rust_doc compilation actions differ in their env or in the paths they use? That could be an explanation.
The difference is that rust_doc is attempting to run a compile action as a test: rustdoc compiles the doc tests and then runs them. Unfortunately, there's no stable way for rustdoc to build tests and run them later, so I think the question becomes: what can be done to recreate the environment used for rustc actions in this test target's execution? In attempting this, I ran into a myriad of issues which might be solvable, but I don't currently know how, or even whether I've encountered all the issues that need fixing to get that working.
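One way to narrow down what needs recreating might be to dump the environment of a regular rustc action and of the doc test and compare the two. The sketch below assumes both have already been captured as dictionaries; the variable values are made up, and keys are upper-cased because Windows treats environment variable names case-insensitively.

```python
def normalize(env):
    """Upper-case the variable names so 'Path' and 'PATH' compare equal."""
    return {key.upper(): value for key, value in env.items()}


def diff_env(action_env, test_env):
    """Map each differing variable to (value in action, value in test)."""
    action = normalize(action_env)
    test = normalize(test_env)
    return {
        key: (action.get(key), test.get(key))
        for key in sorted(set(action) | set(test))
        if action.get(key) != test.get(key)
    }


# Hypothetical captured environments, for illustration only.
action_env = {
    "PATH": r"C:\VS\bin;C:\msys64\usr\bin",
    "LIB": r"C:\VS\lib",
    "TEMP": r"C:\temp",
}
test_env = {
    "Path": r"C:\msys64\usr\bin;C:\VS\bin",
    "TEMP": r"C:\temp",
}

differences = diff_env(action_env, test_env)
for name, (in_action, in_test) in differences.items():
    print(f"{name}: action={in_action!r} test={in_test!r}")
```

A value of None on one side means the variable is missing there, which is the kind of gap the test wrapper would have to fill in.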