[bug] Failed to build ZLUDA on Windows

Open qigel opened this issue 1 year ago • 1 comments

I can't build ZLUDA on Windows. The build fails in llvm-sys.

Environment: Git 2.43.0.windows.1, CMake 3.29.0-rc1, Python 3.12.0, Rust 1.76.0, AMD Software: Adrenalin Edition 24.1.1

Error log:

error: failed to run custom build command for `llvm-sys v150.1.2 (C:\AMD\zluda\ext\llvm-sys.rs)`

Caused by:
  process didn't exit successfully: `C:\AMD\zluda\target\release\build\llvm-sys-23a1a7a99206415b\build-script-build` (exit code: 101)
  --- stderr
  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0114 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Deprecation Warning at CMakeLists.txt:13 (cmake_policy):
    The OLD behavior for policy CMP0116 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Warning at C:/AMD/zluda/ext/llvm-project/third-party/benchmark/CMakeLists.txt:308 (message):
    Using std::regex with exceptions disabled is not fully supported


  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0114 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Deprecation Warning at CMakeLists.txt:13 (cmake_policy):
    The OLD behavior for policy CMP0116 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Warning at C:/AMD/zluda/ext/llvm-project/third-party/benchmark/CMakeLists.txt:308 (message):
    Using std::regex with exceptions disabled is not fully supported


  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0114 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Deprecation Warning at CMakeLists.txt:13 (cmake_policy):
    The OLD behavior for policy CMP0116 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Warning at C:/AMD/zluda/ext/llvm-project/third-party/benchmark/CMakeLists.txt:308 (message):
    Using std::regex with exceptions disabled is not fully supported


  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0114 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Deprecation Warning at CMakeLists.txt:13 (cmake_policy):
    The OLD behavior for policy CMP0116 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Warning at C:/AMD/zluda/ext/llvm-project/third-party/benchmark/CMakeLists.txt:308 (message):
    Using std::regex with exceptions disabled is not fully supported


  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0114 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Deprecation Warning at CMakeLists.txt:13 (cmake_policy):
    The OLD behavior for policy CMP0116 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Warning at C:/AMD/zluda/ext/llvm-project/third-party/benchmark/CMakeLists.txt:308 (message):
    Using std::regex with exceptions disabled is not fully supported


  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0114 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Deprecation Warning at CMakeLists.txt:13 (cmake_policy):
    The OLD behavior for policy CMP0116 will be removed from a future version
    of CMake.

    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.


  CMake Warning at C:/AMD/zluda/ext/llvm-project/third-party/benchmark/CMakeLists.txt:308 (message):
    Using std::regex with exceptions disabled is not fully supported


  thread 'main' panicked at ext\llvm-sys.rs\build.rs:103:10:
  called `Result::unwrap()` on an `Err` value: Os { code: 3, kind: NotFound, message: "The system cannot find the path specified." }
  stack backtrace:
     0:     0x7ff65c6fa142 - std::sys_common::backtrace::_print::impl$0::fmt
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\sys_common\backtrace.rs:44
     1:     0x7ff65c7173ed - core::fmt::rt::Argument::fmt
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\core\src\fmt\rt.rs:142
     2:     0x7ff65c7173ed - core::fmt::write
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\core\src\fmt\mod.rs:1120
     3:     0x7ff65c6f6181 - std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\io\mod.rs:1810
     4:     0x7ff65c6f9f6a - std::sys_common::backtrace::_print
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\sys_common\backtrace.rs:47
     5:     0x7ff65c6f9f6a - std::sys_common::backtrace::print
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\sys_common\backtrace.rs:34
     6:     0x7ff65c6fc6c9 - std::panicking::default_hook::closure$1
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:272
     7:     0x7ff65c6fc385 - std::panicking::default_hook
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:292
     8:     0x7ff65c6fcbf4 - std::panicking::rust_panic_with_hook
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:779
     9:     0x7ff65c6fcac9 - std::panicking::begin_panic_handler::closure$0
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:657
    10:     0x7ff65c6faa49 - std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::closure_env$0,never$>
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\sys_common\backtrace.rs:171
    11:     0x7ff65c6fc792 - std::panicking::begin_panic_handler
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:645
    12:     0x7ff65c71cba7 - core::panicking::panic_fmt
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\core\src\panicking.rs:72
    13:     0x7ff65c71d003 - core::result::unwrap_failed
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\core\src\result.rs:1649
    14:     0x7ff65c63a8f0 - enum2$<core::result::Result<std::process::Output,std::io::error::Error> >::unwrap<std::process::Output,std::io::error::Error>
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce\library\core\src\result.rs:1073
    15:     0x7ff65c63c6fd - build_script_build::emit_compile_and_linking_information<core::iter::adapters::map::Map<core::slice::iter::Iter<ref$<str$> >,build_script_build::main::closure_env$0> >
                                 at C:\AMD\zluda\ext\llvm-sys.rs\build.rs:99
    16:     0x7ff65c63b82a - build_script_build::main
                                 at C:\AMD\zluda\ext\llvm-sys.rs\build.rs:21
    17:     0x7ff65c6420cb - core::ops::function::FnOnce::call_once<void (*)(),tuple$<> >
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce\library\core\src\ops\function.rs:250
    18:     0x7ff65c641d3e - core::hint::black_box
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce\library\core\src\hint.rs:286
    19:     0x7ff65c641d3e - std::sys_common::backtrace::__rust_begin_short_backtrace<void (*)(),tuple$<> >
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce\library\std\src\sys_common\backtrace.rs:155
    20:     0x7ff65c63df11 - std::rt::lang_start::closure$0<tuple$<> >
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce\library\std\src\rt.rs:166
    21:     0x7ff65c6f20d2 - std::rt::lang_start_internal::closure$2
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\rt.rs:148
    22:     0x7ff65c6f20d2 - std::panicking::try::do_call
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:552
    23:     0x7ff65c6f20d2 - std::panicking::try
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panicking.rs:516
    24:     0x7ff65c6f20d2 - std::panic::catch_unwind
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\panic.rs:142
    25:     0x7ff65c6f20d2 - std::rt::lang_start_internal
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library\std\src\rt.rs:148
    26:     0x7ff65c63deea - std::rt::lang_start<tuple$<> >
                                 at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce\library\std\src\rt.rs:165
    27:     0x7ff65c63d0c9 - main
    28:     0x7ff65c71b598 - invoke_main
                                 at d:\a01\_work\6\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
    29:     0x7ff65c71b598 - __scrt_common_main_seh
                                 at d:\a01\_work\6\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    30:     0x7ff9e3637344 - BaseThreadInitThunk
    31:     0x7ff9e38e26b1 - RtlUserThreadStart
error: process didn't exit successfully: `target\debug\xtask.exe --release` (exit code: 101)

It seems the build is missing something, but what?

qigel avatar Feb 16 '24 13:02 qigel

Did you install Ninja?

dayo7116 avatar Feb 22 '24 03:02 dayo7116

I had a problem installing Ninja, but I'll try again with it.

qigel avatar Feb 27 '24 08:02 qigel

Thanks, I installed Ninja 1.11.1 and ZLUDA built successfully. So "Recommended, optional" should be removed from the README next to Ninja.

qigel avatar Feb 27 '24 09:02 qigel

I'm getting the same error. How exactly do you install Ninja? I tried placing ninja.exe in the ZLUDA folder and it didn't do anything. I know CMake needs to be invoked with -G Ninja, but how do I properly add it?

EDIT: I just had to place it in a folder that's in PATH
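
The backtrace in the original report shows `unwrap()` being called on a `Result<std::process::Output, std::io::Error>`, i.e. on the result of spawning an external process. The thread does not show which command build.rs runs at that point, so the following is only a minimal sketch of the general pattern, not ZLUDA's code: `run_tool` is a hypothetical helper that reports a missing executable with a message pointing at PATH instead of panicking.

```rust
use std::io::ErrorKind;
use std::process::{Command, Output};

/// Hypothetical helper (not part of ZLUDA): runs an external tool and turns
/// the "program not found" case into a readable message instead of a panic.
fn run_tool(program: &str, args: &[&str]) -> Result<Output, String> {
    match Command::new(program).args(args).output() {
        Ok(output) => Ok(output),
        // `Command` reports a missing executable as `ErrorKind::NotFound`.
        Err(err) if err.kind() == ErrorKind::NotFound => Err(format!(
            "could not start `{}`: not found; install it and add its folder to PATH",
            program
        )),
        // Any other spawn failure is passed through with its OS message.
        Err(err) => Err(format!("could not start `{}`: {}", program, err)),
    }
}
```

Calling `run_tool("ninja", &["--version"])` before starting a long build would return an `Err` with the PATH hint when ninja.exe is not reachable.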

Now I'm getting a new error, though: src/lib.cpp(7): fatal error C1083: Cannot open include file: 'llvm-c/Core.h': No such file or directory. It seems whitespace in my build path caused it.

xCuri0 avatar Mar 15 '24 18:03 xCuri0

#178 should fix this issue properly. Unfortunately, a build path containing whitespace will still fail, as that is a deeper issue in the interaction between the LLVM build system and the Rust C/C++ compiler crate.
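
Until that is addressed, a build path with whitespace can be detected up front. This is a minimal sketch under the assumption that checking the checkout directory is enough; `path_has_whitespace` is a hypothetical helper and not something the ZLUDA build performs.

```rust
use std::path::Path;

/// Hypothetical pre-build check: returns true if any character in the path
/// is whitespace (space, tab, and so on).
fn path_has_whitespace(path: &Path) -> bool {
    path.to_string_lossy().chars().any(char::is_whitespace)
}
```

For example, `path_has_whitespace(Path::new(r"C:\My Projects\zluda"))` is true, while `C:\AMD\zluda` from the original report passes the check.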

vosen avatar Mar 17 '24 01:03 vosen

I also hit another issue when building it again, about linking with xml2. I fixed that by disabling LLVM_ENABLE_LIBXML2 in ext/llvm-sys.rs/build.rs.
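
How build.rs passes options to CMake is not shown in this thread, so the following is only a sketch of the equivalent command line, assuming CMake is invoked directly: `LLVM_ENABLE_LIBXML2` is LLVM's CMake option, while `cmake_configure` and its directory arguments are hypothetical names used for illustration.

```rust
use std::process::Command;

/// Hypothetical helper: prepares (but does not run) a CMake configure step
/// that uses the Ninja generator and turns off LLVM's libxml2 dependency.
fn cmake_configure(source_dir: &str, build_dir: &str) -> Command {
    let mut cmd = Command::new("cmake");
    cmd.args(&["-S", source_dir, "-B", build_dir, "-G", "Ninja"])
        // Disables the libxml2 dependency that caused the link error.
        .arg("-DLLVM_ENABLE_LIBXML2=OFF");
    cmd
}
```

The returned `Command` can be inspected with `get_args()` or started with `status()`.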

By the way, do you plan on updating the bundled HIP-RT version in the future? That's what I'm trying to do right now (I got it to the point where it gives header errors during HIP runtime compilation).

xCuri0 avatar Mar 17 '24 01:03 xCuri0

I don't plan to update HIP-RT. It's a lot of work for no benefit. A newer HIP-RT will not make ZLUDA-OptiX any better. The biggest problems are either with the ZLUDA project itself (missing support for feature X, missing optimizations) or with the AMDGPU LLVM backend (miscompilations). The current version of ZLUDA-OptiX is tied to the current version and current behavior of the bundled HIP-RT: it doesn't go entirely through public APIs. If you want to update HIP-RT, you probably need to:

  • Regenerate the host function declarations in hiprt-sys (the command is in Makefile.toml) and adjust to the changes.
  • Figure out how the kernel code differs. As mentioned, ZLUDA-OptiX doesn't always respect the public API and does arbitrary hacks for performance and compatibility by reusing some of the private HIP-RT GPU code. The bundled HIP-RT is closed source, but you can dump the GPU sources with comgr env vars (AMD_COMGR_SAVE_TEMPS=1 AMD_COMGR_EMIT_VERBOSE_LOGS=1 AMD_COMGR_REDIRECT_LOGS=stderr). The files of interest are zluda_rt_ptx_impl.cpp and all the .cpp and .hpp files in the same dir.
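
The three comgr variables above can also be set on a child process from Rust rather than in the shell. A minimal sketch, assuming the application is launched through `std::process::Command`; `command_with_comgr_dump` is a hypothetical helper, and only the variable names and values come from the comment above.

```rust
use std::process::Command;

/// The comgr environment variables quoted above, as (name, value) pairs.
const COMGR_DUMP_ENV: [(&str, &str); 3] = [
    ("AMD_COMGR_SAVE_TEMPS", "1"),
    ("AMD_COMGR_EMIT_VERBOSE_LOGS", "1"),
    ("AMD_COMGR_REDIRECT_LOGS", "stderr"),
];

/// Hypothetical helper: prepares (but does not run) a command for `program`
/// with the comgr dump variables added to its environment.
fn command_with_comgr_dump(program: &str) -> Command {
    let mut cmd = Command::new(program);
    cmd.envs(COMGR_DUMP_ENV.iter().copied());
    cmd
}
```

The variables apply only to the spawned process, so the parent environment stays unchanged.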

AFAIK there's been an API break between the bundled HIP-RT and the newest one, so the code changes are going to be non-trivial.

vosen avatar Mar 17 '24 13:03 vosen

Yeah, it does seem to use some non-public code in the GPU kernels from the old HIP-RT version. I don't think anyone else has been able to build it?

I will have to adapt those to the new open-source version (I got everything else seemingly working; the env vars you sent helped fix some issues).

I'm also currently using a very hacky workaround, a custom-built HIP-RT, because hiprtBuildTraceKernels takes a std::vector as an argument, which is complicated to set up in Rust.

Edit: I've gotten to the point of getting hipModuleLoadData: Returned hipErrorInvalidKernelFile after compilation succeeds; this is probably going to be hard to fix. The custom HIP-RT build isn't required anymore thanks to a C++ wrapper for the std::vector argument (I'm sure there's a better way to do this, but this was the easiest).

xCuri0 avatar Mar 17 '24 17:03 xCuri0

If anyone wants to try to continue this, you can check https://github.com/xCuri0/ZLUDA/tree/hiprt-2; use HIP-RT built from the recently released source code. Maybe it works on RDNA (I saw some asm usage of RDNA registers, though removing them in testing didn't make a difference for me).

The hipModuleLoadData: Returned hipErrorInvalidKernelFile error is vague, and fixing it is probably above my level.

I tried porting it to HIP-RT 2.3 because I can't find a download for earlier versions anywhere, and even if I did, it would not include bitcode for Polaris (gfx803), which is what I have.

xCuri0 avatar Mar 19 '24 17:03 xCuri0

@vosen do you know where I can download the old HIP-RT binaries that ZLUDA uses? AMD seems to offer only the latest one.

I know it probably won't work for me (Polaris bitcode is likely not included), but I'm still interested in it.

xCuri0 avatar Mar 23 '24 12:03 xCuri0

No idea, but I still have the package. I've uploaded it here: https://files.catbox.moe/ud40bj.zip

vosen avatar Mar 23 '24 15:03 vosen

@vosen So I tried it, and it actually runtime-compiles and links properly on Polaris / gfx803 after a few small modifications to the kernels.

But like you said, ZLUDA's OptiX support is buggy: ModuleParser::parse_checked gives an error when compiling the 3rd CUDA PTX program (rayhit) when running https://github.com/ac-custom-shaders-patch/acc-bakeryoptix

I was hoping it would work because it was based on one of NVIDIA's samples.

xCuri0 avatar Apr 03 '24 03:04 xCuri0

It seems like this bit of PTX

.const .align 4 .b8 __cudart_i2opi_f[24] = {65, 144, 67, 60, 153, 149, 98, 219, 192, 221, 52, 245, 209, 87, 39, 252, 41, 21, 68, 78, 110, 131, 249, 162};

causes issues for the parser; it errors out endlessly after it (I added some code to print the errors):

 err: User { error: UnrecognizedDirective { start: 6624, end: 6877 } }


 err: User { error: UnrecognizedDirective { start: 6877, end: 6971 } }


 err: User { error: UnrecognizedDirective { start: 6971, end: 7028 } }


 err: User { error: UnrecognizedDirective { start: 7028, end: 7425 } }


 err: User { error: UnrecognizedDirective { start: 7425, end: 7459 } }

This continues until the end of the file.

Maybe I should open a separate issue about this problem with the PTX parser.

Here's the whole PTX if you're interested: https://gist.github.com/xCuri0/36207c7ca7c29e936d971c497c183cfe

xCuri0 avatar Apr 04 '24 15:04 xCuri0

Hmm, this bit of PTX looks extremely normal. The start & end offsets reported in UnrecognizedDirective are in bytes. Yes, feel free to open a separate issue; I don't usually monitor closed issues.

One thing to note: ZLUDA-OptiX assumes that you are running on RDNA or newer (wave32) and not on something different (wave64).

vosen avatar Apr 04 '24 15:04 vosen

I've looked into the PTX; the problem is the unimplemented ldu instruction. In general, the UnrecognizedDirective error you are seeing comes from recovering from a parser error. The recovery is fairly simple and not precise (but it offers a clue as to where the problem is).
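
Since the reported offsets are byte positions into the PTX text, the span can be pulled out of the source to see roughly where recovery kicked in. A minimal sketch, assuming the PTX is available as a Rust string; `locate_span` is a hypothetical helper and not part of ZLUDA's parser.

```rust
/// Hypothetical helper: returns the 1-based line number where the span starts
/// and the text between the two byte offsets. Returns None if the offsets are
/// out of range, reversed, or not on UTF-8 character boundaries.
fn locate_span(source: &str, start: usize, end: usize) -> Option<(usize, &str)> {
    let text = source.get(start..end)?;
    // Count the newlines before the span to get its line number.
    let line = source[..start].bytes().filter(|&b| b == b'\n').count() + 1;
    Some((line, text))
}
```

With the errors printed earlier, `locate_span(&ptx, 6624, 6877)` would return the first span the parser skipped; because the recovery is imprecise, the actual cause may sit slightly before or inside that span.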

vosen avatar Apr 04 '24 16:04 vosen