Peter Hawkins
(I hope this is fixed in JAX head at this point. Please let me know if it isn't...)
Note that change has already been made at JAX head.
Yes and no. With an updated repro to account for MHLO changes: ``` from iree.compiler import compile_str CODE = """ module { func.func @main(%arg0: tensor, %arg1: tensor) -> tensor {...
Sorry this dropped off my list of things to do. Apologies! I had missed that multiprocess support isn't hooked into ROCm/RCCL (which is what `nccl_unique_id_callback` does). I'd guess that's a...
I suspect the issue is we always run `pytest tests` not `pytest`. But it seems like a reasonable thing to want to do.
What version of CUDA do you have installed?
I'm unable to reproduce this on: a) a V100 cloud machine; b) a T4 cloud machine with CUDA 11.7; c) a desktop 1080 GPU with CUDA 11.4. So there must...
I can reproduce this with CUDA 11.1 but not with CUDA 11.2 or newer. This at least suggests a workaround you can use: update to a newer CUDA release. I'm...
@sudhakarsingh27 Could you please determine whether this is a known cuFFT bug and, if so, whether there are workarounds we might use on the JAX side to get correct behavior...
I assume you mean "Plans with primes larger than 127 in FFT size decomposition or FFT size being a prime number bigger than 4093 do not perform calculations on second...
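For anyone trying to tell whether a given FFT length falls under the condition quoted above, here is a rough sketch in plain Python. The helper names `prime_factors` and `matches_quoted_condition` are made up for illustration and are not part of JAX or cuFFT; the thresholds 127 and 4093 are taken directly from the quoted text, and the check says nothing about whether a particular CUDA release is actually affected.

```python
def prime_factors(n):
    """Return the prime factors of n (with multiplicity) via trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        factors.append(n)
    return factors


def matches_quoted_condition(fft_size):
    """Hypothetical helper: True if fft_size has a prime factor larger than
    127, or is itself a prime larger than 4093 (thresholds from the quote)."""
    factors = prime_factors(fft_size)
    has_large_prime_factor = any(p > 127 for p in factors)
    is_large_prime = factors == [fft_size] and fft_size > 4093
    return has_large_prime_factor or is_large_prime
```

A power-of-two length such as 1024 does not match, while a length like 262 (2 × 131) does, since 131 is a prime larger than 127.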