Yutaro Iiyama
Sorry @thaarres, somehow GitHub didn't send me a notification when you mentioned me above, or I accidentally deleted the email. Anyway, I only saw this issue today. How about...
I included this `-DNO_VIVADO` fix in #344
`hls_math.h` is used for `hls::exp()` in this code, and the function is called to fill a lookup table, with the filling controlled by a static boolean flag. So I think the usage...
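A minimal sketch of the pattern described above, i.e. a lookup table that is filled once, guarded by a static boolean flag. This is not the code from the repository: the names `kTableSize`, `init_exp_table`, and `exp_lookup`, the table size, and the input range [-8, 8) are all hypothetical, and `std::exp` stands in for `hls::exp()` so that the example compiles without `hls_math.h`.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical table size and input range [-8, 8); not taken from the original code.
constexpr std::size_t kTableSize = 1024;
constexpr float kRangeMin = -8.0f;
constexpr float kRangeWidth = 16.0f;

// Fill the table with exp(x) sampled uniformly over the input range.
// std::exp is used here as a stand-in for hls::exp().
void init_exp_table(float table[kTableSize]) {
    for (std::size_t i = 0; i < kTableSize; ++i) {
        float x = kRangeMin + kRangeWidth * static_cast<float>(i) / kTableSize;
        table[i] = std::exp(x);
    }
}

// Look up exp(x); the table is filled only on the first call,
// which is what the static boolean flag controls.
float exp_lookup(float x) {
    static bool initialized = false;
    static float table[kTableSize];
    if (!initialized) {
        init_exp_table(table);
        initialized = true;
    }
    // Map x to a table index and clamp it to the valid range.
    int idx = static_cast<int>((x - kRangeMin) * kTableSize / kRangeWidth);
    if (idx < 0) idx = 0;
    if (idx >= static_cast<int>(kTableSize)) idx = static_cast<int>(kTableSize) - 1;
    return table[idx];
}
```

With this structure the exponential is evaluated only while the table is being filled; later calls reduce to an index computation and a memory read.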
Thanks for picking up this issue! Yes, with the example above it's always the third and fourth lines (GPUs 2 & 3) that have the issue. On the other hand,...
Also, I realized I had forgotten to give the most basic information: I'm seeing this effect in
- JAX 0.3.14 + CUDA 11.4.0
- JAX 0.3.15 + CUDA 11.6.2....
@nouiz I'm sorry for the long silence. I was trying nccl-tests as suggested by @sudhakarsingh27 and was trying to understand what I was seeing. First, to answer your question: the...
No, with `CUDA_VISIBLE_DEVICES=0,1,2,3` I get a gradient of 1e-45 for GPUs 2 and 3. The motherboard is a [SuperMicro X12DPG-OA6](https://www.supermicro.com/en/products/motherboard/x12dpg-oa6). With the recent addition of two GPUs, the topology has changed...
Hmm OK, so it may be a problem deep inside the GPU driver and the communication between the PCIe buses. I myself use JAX exclusively, but others use PyTorch, and...