Sudhakar Singh

70 comments by Sudhakar Singh

@Numeri is this resolved?

@qixuanf was this resolved?

Closing since there has been no response. (Feel free to reopen if the issue isn't resolved on your end.)

Reran the repro code as follows (on Google Colab). It seems that `loop` and `lax_f_scan` perform similarly. (Although there is a perf difference between CPU and GPU, but...

@yiiyama maybe you could also try debugging with [nccl-tests](https://github.com/NVIDIA/nccl-tests)

@dionhaefner Was this resolved? Do you still need help?

@rwightman can we consider this resolved?