Cory Bloor

139 comments of Cory Bloor

LLVM has a pair of simple utilities to determine the GPU architecture: [amdgpu-arch](https://github.com/llvm/llvm-project/tree/009048810ac635a7ad6c5f788d537172418b6054/clang/tools/amdgpu-arch) and [nvptx-arch](https://github.com/llvm/llvm-project/tree/009048810ac635a7ad6c5f788d537172418b6054/clang/tools/nvptx-arch). nvptx-arch depends on libcuda, and amdgpu-arch depends on libhsa-runtime64.
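
As a rough sketch of how these tools might be used from a build or test script, the following Python snippet shells out to whichever utility is installed. It assumes the binaries are on PATH and that each tool prints one architecture string per detected GPU, which is how both utilities report their results; the helper name is just for illustration.

```python
import shutil
import subprocess


def detect_gpu_arch():
    """Return the first GPU architecture reported by amdgpu-arch or
    nvptx-arch, or None if neither tool is available.

    Both LLVM utilities print one architecture string per detected GPU
    (e.g. gfx1030 for AMD, sm_86 for NVIDIA).
    """
    for tool in ("amdgpu-arch", "nvptx-arch"):
        if shutil.which(tool) is None:
            continue
        result = subprocess.run([tool], capture_output=True, text=True)
        if result.returncode == 0 and result.stdout.strip():
            # Take the first line in case multiple GPUs are present.
            return result.stdout.splitlines()[0].strip()
    return None


if __name__ == "__main__":
    print(detect_gpu_arch())
```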

@tcgu-amd, most of the math libraries are using C++17. While libraries shouldn't upgrade their standards version needlessly, it would be appropriate for hipSPARSE given that it is using C++17 features...

Hi @slipperyslipped. Your GPU uses the gfx1031 instruction set, but the binaries distributed by AMD are not built for that architecture as it is not officially supported. However, the gfx1030...

I'm not an expert on PyTorch, but the gfx1013 ISA is a superset of the gfx1010 ISA. You can set `export HSA_OVERRIDE_GFX_VERSION=10.1.0` and it will probably work. With that said,...
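
If you would rather apply the override from inside Python instead of the shell, a minimal sketch (assuming a ROCm build of PyTorch) might look like the following; the variable has to be set before `torch` is imported, because the HSA runtime reads it at initialization.

```python
import os

# Must be set before the HSA runtime starts up, i.e. before importing torch.
# 10.1.0 tells the runtime to treat the GPU as gfx1010; adjust the value to
# match the ISA your binaries were actually built for.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.1.0"

import torch  # noqa: E402  (import deliberately placed after the override)

print("ROCm device available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device name:", torch.cuda.get_device_name(0))
```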

Thanks @ulyssesrr. That's a great analysis of the problem. It's perhaps worth noting that the OS-provided rocBLAS package on Debian 13 (Testing/Trixie) and the upcoming Ubuntu 23.10 (Mantic Minotaur) builds...

> GPU is a 7800 XT.
>
> Stack from running a basic PyTorch example under GDB is shown below. I did have to override gfx version to either `11.0.0`...

@NaturalHate, build for gfx1030 and run with `export HSA_OVERRIDE_GFX_VERSION=10.3.0` set in your environment.

Surprisingly, yes. At least, I think so. We're using CMake features for some of these, but I think we do want to better leverage toolchain files. I'm open to differing...

Ideally, the parallelism would be managed through the build system. It is inefficient to have two resource pools (make/ninja and joblib) manage the same set of resources (CPU cores, memory)....
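
To make the oversubscription concern concrete, here is a hypothetical Python sketch of the pattern being described: a per-invocation script that sizes its own joblib pool independently of whatever `-j` level make or ninja is already using, so the two pools multiply. The `build_one_kernel` function and the `NPROC_HINT` variable are illustrative only, not actual library code.

```python
import os

from joblib import Parallel, delayed


def build_one_kernel(name):
    # Placeholder for a CPU-heavy compilation step (illustrative only).
    return f"built {name}"


def main(kernel_names):
    # If the build system (make/ninja) is already running this script in
    # parallel, spawning os.cpu_count() workers here multiplies the two
    # resource pools and oversubscribes CPU cores and memory. A crude
    # mitigation is to let the outer build pass down a worker count
    # instead of guessing; NPROC_HINT is a hypothetical variable name.
    n_jobs = int(os.environ.get("NPROC_HINT", os.cpu_count() or 1))
    return Parallel(n_jobs=n_jobs)(
        delayed(build_one_kernel)(name) for name in kernel_names
    )


if __name__ == "__main__":
    print(main(["gemm_nn", "gemm_nt", "gemm_tn"]))
```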