Peter Boyle
E.g. clang++ should work fine as a host compiler; I've used it without problems on Ampere.
Also -- feel free to contribute a config-command and directory under Systems/GraceHopper or similar. Could save others some hassle.
I'm puzzled by that, because I thought that nvcc does two things: -- preprocess to host and device code -- run ccbin on the host sequences -- run the device...
NEONv8) AC_DEFINE([NEONV8],[1],[ARMv8 NEON]) SIMD_FLAGS='-march=armv8-a';; So if Grid is targeting NEON as its SIMD, I'm passing -march=armv8-a to the compiler. This is likely missing from the host compiler when you configure...
You could almost certainly remove Grid from this problem, as the challenge appears to be getting Eigen to work with nvcc and your host compiler with -DEIGEN_DONT_VECTORIZE set. This...
Can you please i) recompile with configure flags including --enable-debug; ii) rerun the same volume on a single MPI rank, using a cold start if necessary; iii) rerun it under...
Hi -- I didn't merge this request because I was worried about the comments about a non-understood parameter. But I'm working on a feature branch that unifies the accelerator and non-accelerator...
The Torch BMM that looks to be mapped to oneDNN and other libraries here is: https://pytorch.org/docs/stable/generated/torch.bmm.html. Unless I'm mistaken, this has the contiguous packed tensor constraint. ``` input and mat2...
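To make the packed-layout point concrete, here is a minimal pure-Python sketch of a batched matrix multiply over flat, densely packed row-major buffers. It is only a reference illustration, not how PyTorch or oneDNN implement it; the function name `bmm_packed` and its argument order are hypothetical. The shape rule it follows (a batch of n×m matrices times a batch of m×p matrices gives a batch of n×p matrices) is the one in the torch.bmm documentation linked above.

```python
def bmm_packed(a, b, batch, n, m, p):
    """Reference batched matmul on flat, contiguous row-major buffers.

    a holds batch*n*m values, b holds batch*m*p values.
    Returns a flat list of batch*n*p values.
    (Hypothetical helper for illustration only.)
    """
    # "Packed" here means no padding or strides between matrices:
    # matrix t of `a` starts exactly at offset t*n*m.
    if len(a) != batch * n * m or len(b) != batch * m * p:
        raise ValueError("buffers must be densely packed with the stated shapes")

    out = [0.0] * (batch * n * p)
    for t in range(batch):
        a0, b0, o0 = t * n * m, t * m * p, t * n * p
        for i in range(n):
            for j in range(p):
                acc = 0.0
                for k in range(m):
                    acc += a[a0 + i * m + k] * b[b0 + k * p + j]
                out[o0 + i * p + j] = acc
    return out
```

Because the offsets are computed purely from the shapes, any layout with gaps or non-unit strides between matrices would first have to be copied into this packed form, which is the constraint the comment refers to.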
Can I query what is the consensus view of the status of this issue? Is there a known good MPI installed in some module on Aurora? It seems the original...