Peter Boyle
That was enough for me to eyeball at least one error.
More later - I'll try and patch develop.
Sorry - reviewed again and the code looks right. Darn it...
Can you give the complete call tree that is failing? (In reply to chulwoo1, Thursday, 19 August 2021 at 17:48.)
I've run on Spock and it is doing well on Benchmark_ITT and Benchmark_dwf_fp32. I've added the systems/Spock directory with compile and run scripts. Getting 1.3 TF/s on an MI100.
I also get 4 TF/s on a whole Spock node, 4x MI100.
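For a rough sense of how the node figure compares with the single-GPU one, here is a minimal sketch of the arithmetic. It assumes both numbers come from the same benchmark and precision, which the comments above do not state; the function name `scaling_efficiency` is just for illustration.

```python
# Figures quoted in the comments above. Assumption: both come from the
# same benchmark and precision (not confirmed in the thread).
single_gpu_tflops = 1.3   # one MI100
node_tflops = 4.0         # whole Spock node
gpus_per_node = 4         # 4x MI100 per node


def scaling_efficiency(node_rate, single_rate, n_devices):
    """Fraction of ideal linear scaling achieved by the whole node."""
    return node_rate / (single_rate * n_devices)


per_gpu = node_tflops / gpus_per_node
efficiency = scaling_efficiency(node_tflops, single_gpu_tflops, gpus_per_node)

print(f"per-GPU rate on full node: {per_gpu:.2f} TF/s")
print(f"scaling efficiency vs. single GPU: {efficiency:.0%}")
```

Under that assumption the node delivers about 1 TF/s per GPU, so somewhat below linear scaling from the single-GPU number.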
Will take a look; normally I expect this region to be live for the whole run time of the application - especially given the huge page issues etc...
I'm using:
```
../configure \
  --prefix /ccs/home/paboyle/prefix \
  --enable-precision=double \
  --enable-unified=yes \
  --enable-accelerator=cuda \
  --enable-summit \
  --enable-comms=mpi \
  --enable-simd=GPU \
  CXX=nvcc \
  CXXFLAGS="-ccbin mpicxx -gencode arch=compute_70,code=sm_70 -I/ccs/home/paboyle/prefix/include/ -std=c++11" \
  LDFLAGS=-L/ccs/home/paboyle/prefix/lib/
```
on...
Not hours - I'm used to ~20 minutes with make -j which isn't great but not intolerable.
Wow... and I thought POWER9 was bad... Are you using parallel make?