Peter Boyle
Source: reproducer_IPC_bandwidth.cpp Problem: zeMemOpenIpcHandle and zeMemGetIpcHandle do not return a handle that can be copied by value between processes and used in a distinct process using either MPI or any...
Not so much an issue as a comment/recommendation for future evolution: there are significant (10x) gains possible under Intel Omni-Path, and a study is linked at https://arxiv.org/abs/1711.04883. Hope you find it useful.
https://wandbox.org/permlink/tzssJza6R9XnqANw https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80652 Getting Travis failures under gcc-5 for Test_simd, now that I have added more comprehensive testing to the CI test suite. The limitations of Travis runtime limits & weak cores...
We need a way to control the precision (how many bits) and the choice of scientific vs. decimal notation for all serialised floating-point numbers.
Reminder to self to fix this.
Benchmark_dwf_fp32 detects incorrect results under ROCM 5.7. It works fine under ROCM 5.3, but the recent move by ORNL from a 5.3 default to a 5.7 default breaks Grid. I suspect a compiler bug and...
Hi there, compared to torch.BMM which as far as I can tell has constraints on the arrays A/B/C being contiguous and same size for each entry in the batch, OneMKL...
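For the request above about controlling precision and notation in serialised floating-point numbers, here is a minimal sketch of what such a control could look like using only standard C++ stream manipulators. The names `FloatNotation`, `FloatFormat`, and `formatFloat` are hypothetical and made up for illustration; they are not part of any existing library or of the project the issue refers to. The sketch also counts decimal digits rather than bits, which is an assumption about what the request intends.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical settings object: one place to choose notation and digit count.
enum class FloatNotation { Decimal, Scientific };

struct FloatFormat {
  int digits = 6;                                   // digits after the decimal point
  FloatNotation notation = FloatNotation::Decimal;  // fixed-point vs. exponent form
};

// Render a single value according to the chosen format.
inline std::string formatFloat(double value, const FloatFormat &fmt) {
  std::ostringstream os;
  if (fmt.notation == FloatNotation::Scientific) {
    os << std::scientific;  // e.g. 1.500e+00
  } else {
    os << std::fixed;       // e.g. 1.500
  }
  os << std::setprecision(fmt.digits) << value;
  return os.str();
}
```

A writer that routes every floating-point value through one such function would make the setting global to the serialiser; for example, `formatFloat(1234.5, {2, FloatNotation::Scientific})` gives the exponent form with two digits after the point, while `formatFloat(0.125, {3, FloatNotation::Decimal})` gives the fixed-point form.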