kliuae

Results: 2 issues by kliuae

This PR adds ROCm support for the Punica kernels to enable multi-LoRA on AMD GPUs. Some Punica files are slightly refactored so that the correct C++/hipcc compilers can be invoked when...

rocm

### Problem Description

In ROCm 6.4.0, calling `hipMemRelease` does not appear to release the physical memory allocated on the GPU. Both `hipMemGetInfo` and `rocm-smi` report that the memory is still...

Under Investigation