Jithun Nair
cc @sunway513 @rraminen @jithunnair-amd
@jeffra @tjruwase
Noting here for context: this comes from PR https://github.com/microsoft/DeepSpeed/pull/2047. Some relevant discussion on why this was added: https://github.com/microsoft/DeepSpeed/pull/2047#discussion_r1401157111. It looks like some CPU vectorized instructions are used in https://github.com/microsoft/DeepSpeed/blob/master/csrc/includes/simd.h#L19 AVX256...
@malfet @mthrok It's not at all clear to me whether these CI failures need any action from our end, since the failures are CUDA-related. Can you please let us...
@mthrok I'm taking a look at refactoring the CMake files to address the review comments
@mthrok @malfet Can you please review?
@mthrok This PR introduces a dependency on `libomp.so` which is available with a standard ROCm installation in `/opt/rocm/llvm/lib/libomp.so`. However, since it's not packaged with the torchaudio wheel currently, and is...
cc @atalman @seemethere
@BLOrange-AMD Actually, even this PR might be unnecessary based on the latest flow due to [this PR](https://github.com/pytorch/pytorch.github.io/pull/1232), which should automatically extract the ROCm version from the release matrix. Once your...
@BLOrange-AMD This automatically filed PR should do the job: https://github.com/pytorch/pytorch.github.io/pull/1296