qmcpack icon indicating copy to clipboard operation
qmcpack copied to clipboard

Added new timers to help with optimization scaling.

Open jptowns opened this issue 3 years ago • 3 comments

Proposed changes

Added new timers to better understand where time is spent in optimization.

What type(s) of changes does this code introduce?

  • Refactoring (no functional changes, no api changes)

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

RHEL8 intel workstation

Checklist

  • Yes This PR is up to date with current the current state of 'develop'
  • Yes Code added or changed in the PR has been clang-formatted -No. This PR adds tests to cover any new code, or to catch a bug that is being fixed -No. Documentation has been added (if appropriate)

jptowns avatar Sep 14 '22 22:09 jptowns

It looks like there are three issues being addressed in the PR:

  1. There are timers in the batched version that are not present in the legacy version (QMCCostFunction)
  2. There are timers missing in some driver paths (one_shift_run) (in both legacy and batched versions)
  3. Inverting the overlap matrix needs a timer (in both the legacy and batched versions)

The issue with the fix to (1) is that it is done differently in the legacy version (buildmat_timer in QMCFixedSampleLinearOptimize) vs. the batched version (fill_timer in QMCCostFunctionBatched) Related is the question of whether it's better to put the timer in the caller or the callee. (In the case it only covers a single call from the caller. If it covers multiple statements, then it has to go in the caller). If the timer is in the callee, then it eases the issue in (2) where the timer calls get missed in some paths. It would work well for the eigenvalue solve if the timer is moved to LinearMethod. (getNonLinearRescale probably doesn't take much time, so it would be okay if it were outside the eigenvalue timer). This doesn't work so well for the invert timer, since the code is fairly general and used elsewhere - that would need to stay in the caller.

markdewing avatar Sep 15 '22 21:09 markdewing

Mark, I agree with your assessment that there are a lot of inconsistencies. But I am not sure how to proceed. Are you suggesting that anything short of making batched/non-batched timers consistent is not worthwhile? Guidance from someone who knows how it "should work" would be welcome here.

jptowns avatar Sep 28 '22 15:09 jptowns

The fillOverlapHamiltonianMatrices timer should be moved to QMCCostFunction to match the batched version. That would fix the one case where the same timer is implemented differently between legacy and batched code.

markdewing avatar Sep 28 '22 16:09 markdewing

Test this please

markdewing avatar Oct 24 '22 16:10 markdewing