Ye Luo
> @ye-luo Do you mean batched driver with CPU build on Perlmutter? I thought you were using the offload build on Polaris. Actually no. So please use batched driver with...
@djstaros the energy drops are very suspicious. Could you test two runs with the batched driver in a CPU build, using the latest QMCPACK develop branch on Perlmutter? In one run, add...
`crowd_serialize_walkers=yes` forces the batched driver to call single walker APIs internally. On CPU, the performance difference should be minimal. What I suspect here is a bug in the multi-walker specialized...
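For setting up the two comparison runs, here is a minimal sketch of one way to produce the second input from the first. It assumes the option is given as a `<parameter name="crowd_serialize_walkers">` entry inside each `<qmc>` block; the `BASE_INPUT` snippet, its `timestep` value, and the helper name `set_driver_parameter` are hypothetical and only for illustration, so please adapt them to the actual reproducer input.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed input; a real QMCPACK input has many more sections.
BASE_INPUT = """<simulation>
  <qmc method="dmc">
    <parameter name="timestep">0.01</parameter>
  </qmc>
</simulation>"""


def set_driver_parameter(xml_text, name, value):
    """Return a copy of xml_text with <parameter name=...> set in every <qmc> block.

    An existing parameter of that name is updated; otherwise one is appended.
    """
    root = ET.fromstring(xml_text)
    for qmc in root.iter("qmc"):
        for param in qmc.findall("parameter"):
            if param.get("name") == name:
                param.text = value
                break
        else:
            # No parameter with this name in the block yet, so add it.
            ET.SubElement(qmc, "parameter", {"name": name}).text = value
    return ET.tostring(root, encoding="unicode")


# Run 1 uses BASE_INPUT unchanged; run 2 uses the serialized variant.
serialized_input = set_driver_parameter(BASE_INPUT, "crowd_serialize_walkers", "yes")
```

Keeping everything else identical between the two inputs means any difference in the energies can be attributed to the multi-walker code path rather than to other settings.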
> Yes, agreed. The multi-walker implementation needs looking into. Beyond that, I propose we need a PR to set the default to "yes" for cpu runs. > > If there...
This looks to me like it was caused by the broken T-move that was fixed in https://github.com/QMCPACK/qmcpack/pull/4902. @djstaros, will you be able to rerun the reproducer using `/soft/applications/qmcpack/develop-20240425` on Polaris?
@prckent agreed. For the moment, getting the calculation on the GPU and sorting out data movement take precedence.
Test this please
@kgasperich please have a final pass to see if there is anything you would like to change before merging.