Engininja2
That sucks. I thought I had figured it out. I'm using a 5700 XT with rocBLAS and PyTorch 2.0.1 recompiled on ROCm 5.4.3, so it's possible it's avoiding something else that...
@ardfork can you try clearing the contents of the exllama_ext cache? In my case it was in ~/.cache/torch_extensions/py311_cpu/exllama_ext. I tried first running exllama without the patch and with the cache...
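If it helps, a minimal sketch of clearing it from Python (the py311_cpu segment varies with your Python version and torch build, so check what's actually under ~/.cache/torch_extensions first):

```python
import shutil
from pathlib import Path

# Assumed cache location from my setup; yours may be e.g. py310_... instead.
cache_dir = Path.home() / ".cache" / "torch_extensions" / "py311_cpu" / "exllama_ext"
if cache_dir.exists():
    shutil.rmtree(cache_dir)  # force ninja to rebuild the extension from scratch
```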
Changing `extra_cuda_cflags` also caused ninja to recompile everything, so I don't think there's a need to change q4_mlp this time. I raised an issue with hipamd for h2rcp(): https://github.com/ROCm-Developer-Tools/clr/issues/8
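For anyone following along, the mechanism is that `torch.utils.cpp_extension.load` writes the flags into the generated ninja build file, so any change to `extra_cuda_cflags` invalidates the build even with untouched sources. A rough sketch (the source list and flag here are illustrative, not exllama's actual build):

```python
from torch.utils.cpp_extension import load

# Editing the extra_cuda_cflags list alone is enough to trigger a full
# recompile, since it changes the ninja build file that load() generates.
exllama_ext = load(
    name="exllama_ext",
    sources=["exllama_ext/exllama_ext.cpp", "exllama_ext/cuda_func/q4_mlp.cu"],
    extra_cuda_cflags=["-O3"],  # change this and ninja rebuilds everything
)
```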
I went and added a newline to q4_mlp.cu anyway, in case someone using exllama downstream is using their own code for loading the extension. I removed hip defaulting to `no_half2`...
What's the output of `ldd torch/lib/libtorch_hip.so | grep hipblas`?
Did you build PyTorch with `USE_FBGEMM=OFF`? Maybe exllama should start linking to hipblas directly. It looks like the only part of torch itself that needs hipblas is FBGEMM, and that's...
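Linking directly should just be a matter of passing linker flags through `extra_ldflags`. A sketch, assuming a stock ROCm install under /opt/rocm (paths and flags are my assumptions, not exllama's current build options):

```python
from torch.utils.cpp_extension import load

# Link hipblas directly instead of relying on symbols that libtorch_hip.so
# may only pull in when FBGEMM is enabled in the PyTorch build.
exllama_ext = load(
    name="exllama_ext",
    sources=["exllama_ext/exllama_ext.cpp"],
    extra_ldflags=["-L/opt/rocm/lib", "-lhipblas"],
)
```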
I think AMD is working on WSL support, but it's not public yet. You either need to use Linux with direct access to the GPU, or the HIP SDK for...