Anton Smirnov
@amontoison, I've sent you an invite to be able to merge PRs. I currently don't have access to AMD GPUs and therefore am not working on AMDGPU.jl. So feel free to...
Probably because of Julia 1.10's LLVM version, which is 15, while gfx942 was officially added in LLVM 17, IIUC: https://github.com/llvm/llvm-project/commit/9d0572797233857397f3fdc35fffcfb490354f56 You can try a Julia 1.11 early release (which has LLVM...
AMDGPU 0.9 now supports Julia 1.11 and maybe MI300X. Just make sure to launch Julia with the `JULIA_LLVM_ARGS="-opaque-pointers"` environment variable set to use system-wide ROCm device libraries instead of our patched...
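A minimal shell sketch of such a launch (the Julia invocation at the end is illustrative, not from the comment above):

```shell
# Set the flag so the LLVM used by Julia runs with opaque pointers enabled,
# letting AMDGPU.jl pick up system-wide ROCm device libraries.
export JULIA_LLVM_ARGS="-opaque-pointers"
echo "$JULIA_LLVM_ARGS"
# Then start Julia as usual, e.g.:
# julia --project -e 'using AMDGPU'
```

Setting the variable via `export` (rather than inline for a single command) keeps it in effect for any Julia processes started later in the same session.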
We then need Julia 1.12, which has LLVM 17 (1.11 has LLVM 16). I haven't tested it yet, as 1.11 itself is still in beta, but I can take a...
AMDGPU.jl needs to account for changes in Julia 1.12; I haven't done that yet.
It means there's an exception triggered by one of the kernels you run. Sadly, at the moment it doesn't say much (just `GPU Kernel Exception`); I had to comment...
To make it easier, I've pushed a branch `pxl-th/exception` that has proper exception reporting, so you can use it for debugging
Mixing default and non-default streams in `hip*Async` functions seems to cause hangs. Here's a C++ reproducer:

```cpp
#include
#include
#include

__global__ void vectorAdd(int *a, int *b, int numElements) { int i ...
```
Respective issue in HIP: https://github.com/ROCm/HIP/issues/3370#issuecomment-1970744166
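For reference, the problematic pattern looks roughly like this (an untested HIP sketch, not the full reproducer above; the kernel, sizes, and error handling are simplified placeholders):

```cpp
#include <hip/hip_runtime.h>

__global__ void vectorAdd(int *a, int *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) b[i] += a[i];
}

int main() {
    const int n = 1 << 20;
    int *a, *b;
    hipMalloc(&a, n * sizeof(int));
    hipMalloc(&b, n * sizeof(int));

    hipStream_t s;
    hipStreamCreate(&s);

    // Async op on the NULL (default) stream...
    hipMemsetAsync(a, 0, n * sizeof(int), /*stream=*/0);
    // ...interleaved with work on a non-default stream:
    hipLaunchKernelGGL(vectorAdd, dim3((n + 255) / 256), dim3(256), 0, s, a, b, n);

    // With the affected ROCm versions, synchronizing here reportedly hangs.
    hipDeviceSynchronize();

    hipStreamDestroy(s);
    hipFree(a);
    hipFree(b);
    return 0;
}
```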
The license is MIT; feel free to use it any way you like. The weights, I think, are under the original [Apache License 2.0](https://github.com/lukemelas/EfficientNet-PyTorch/blob/master/LICENSE).