Cory Bloor

139 comments of Cory Bloor

LLVM has a pair of simple utilities to determine the GPU architecture: [amdgpu-arch](https://github.com/llvm/llvm-project/tree/009048810ac635a7ad6c5f788d537172418b6054/clang/tools/amdgpu-arch) and [nvptx-arch](https://github.com/llvm/llvm-project/tree/009048810ac635a7ad6c5f788d537172418b6054/clang/tools/nvptx-arch). nvptx-arch depends on libcuda, and amdgpu-arch depends on libhsa-runtime64.
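
As a rough sketch of how these tools might be used from a build or test script, the following Python snippet shells out to whichever utility is installed. It assumes the binaries are on PATH and that each tool prints one architecture string per detected GPU, which is how both utilities report their results; the helper name is just for illustration.

```python
import shutil
import subprocess


def detect_gpu_arch():
    """Return the first GPU architecture reported by amdgpu-arch or
    nvptx-arch, or None if neither tool is available.

    Both LLVM utilities print one architecture string per detected GPU
    (e.g. gfx1030 for AMD, sm_86 for NVIDIA).
    """
    for tool in ("amdgpu-arch", "nvptx-arch"):
        if shutil.which(tool) is None:
            continue
        result = subprocess.run([tool], capture_output=True, text=True)
        if result.returncode == 0 and result.stdout.strip():
            # Take the first line in case multiple GPUs are present.
            return result.stdout.splitlines()[0].strip()
    return None


if __name__ == "__main__":
    print(detect_gpu_arch())
```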

@tcgu-amd, most of the math libraries are using C++17. While libraries shouldn't upgrade their standards version needlessly, it would be appropriate for hipSPARSE given that it is using C++17 features...

Hi @slipperyslipped. Your GPU uses the gfx1031 instruction set, but the binaries distributed by AMD are not built for that architecture as it is not officially supported. However, the gfx1030...

I'm not an expert on PyTorch, but the gfx1013 ISA is a superset of the gfx1010 ISA. You can set `export HSA_OVERRIDE_GFX_VERSION=10.1.0` and it will probably work. With that said,...
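
If you would rather apply the override from inside Python instead of the shell, a minimal sketch (assuming a ROCm build of PyTorch) might look like the following; the variable has to be set before `torch` is imported, because the HSA runtime reads it at initialization.

```python
import os

# Must be set before the HSA runtime starts up, i.e. before importing torch.
# 10.1.0 tells the runtime to treat the GPU as gfx1010; adjust the value to
# match the ISA your binaries were actually built for.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.1.0"

import torch  # noqa: E402  (import deliberately placed after the override)

print("ROCm device available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device name:", torch.cuda.get_device_name(0))
```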

Thanks @ulyssesrr. That's a great analysis of the problem. It's perhaps worth noting that the OS-provided rocBLAS package on Debian 13 (Testing/Trixie) and the upcoming Ubuntu 23.10 (Mantic Minotaur) builds...

> GPU is a 7800 XT.
>
> Stack from running a basic PyTorch example under GDB is shown below. I did have to override gfx version to either `11.0.0`...

@NaturalHate, build for gfx1030 and run with `export HSA_OVERRIDE_GFX_VERSION=10.3.0` set in your environment.

Surprisingly, yes. At least, I think so. We're using CMake features for some of these, but I think we do want to better leverage toolchain files. I'm open to differing...

Ideally, the parallelism would be managed through the build system. It is inefficient to have two resource pools (make/ninja and joblib) manage the same set of resources (CPU cores, memory)....
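
To make the oversubscription concern concrete, here is a hypothetical Python sketch of the pattern being described: a per-invocation script that sizes its own joblib pool independently of whatever `-j` level make or ninja is already using, so the two pools multiply. The `build_one_kernel` function and the `NPROC_HINT` variable are illustrative only, not actual library code.

```python
import os

from joblib import Parallel, delayed


def build_one_kernel(name):
    # Placeholder for a CPU-heavy compilation step (illustrative only).
    return f"built {name}"


def main(kernel_names):
    # If the build system (make/ninja) is already running this script in
    # parallel, spawning os.cpu_count() workers here multiplies the two
    # resource pools and oversubscribes CPU cores and memory. A crude
    # mitigation is to let the outer build pass down a worker count
    # instead of guessing; NPROC_HINT is a hypothetical variable name.
    n_jobs = int(os.environ.get("NPROC_HINT", os.cpu_count() or 1))
    return Parallel(n_jobs=n_jobs)(
        delayed(build_one_kernel)(name) for name in kernel_names
    )


if __name__ == "__main__":
    print(main(["gemm_nn", "gemm_nt", "gemm_tn"]))
```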