qmcpack icon indicating copy to clipboard operation
qmcpack copied to clipboard

Implement SPOSetT<T> template class hierarchy

Open williamfgc opened this issue 2 years ago • 22 comments

Proposed changes

Introduce a shim template class to asses the effort to change SPOSet to SPOSet<T>. SPOSetT<T> currently does not have consumers nor tests. Testing compilation in CI and define concrete types for the template classes. The challenge is to adapt SPOSetT<T> to SPOSet current consumers. Related to #4099

What type(s) of changes does this code introduce?

Delete the items that do not apply

  • Refactoring (no functional changes, no api changes)

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

sulfur

Checklist

  • Yes. This PR is up to date with current the current state of 'develop'
  • Yes. Code added or changed in the PR has been clang-formatted
  • No. This PR adds tests to cover any new code, or to catch a bug that is being fixed
  • No. Documentation has been added (if appropriate)

williamfgc avatar Jul 24 '23 19:07 williamfgc

Changed to the option suggested by @ye-luo to only concretize class type.

williamfgc avatar Jul 24 '23 20:07 williamfgc

Thinking of how to avoid adding new files. Can we make SPOSetT<T> as an actual shim layer derived from SPOSet. Once a derived class is ported. It changes base from SPOSet to SPOSetT<T>. In this way, we don't add new files except SPOSetT<T>.

ye-luo avatar Aug 24 '23 16:08 ye-luo

@ye-luo thanks for chiming in. We just thought having a separate SPOSetT hierarchy would help us with CI (at least build errors we introduce are caught) and for us to work in parallel. Once we have that we can start integrating with current SPOSet consumers without rewriting those and once done remove SPOSet and rename SPOSetT->SPOSet<T>. Thoughts?

williamfgc avatar Aug 24 '23 18:08 williamfgc

@ye-luo thanks for chiming in. We just thought having a separate SPOSetT hierarchy would help us with CI (at least build errors we introduce are caught) and for us to work in parallel. Once we have that we can start integrating with current SPOSet consumers without rewriting those and once done remove SPOSet and rename SPOSetT->SPOSet. Thoughts?

probably I didn't fully understand the current status of this PR. I saw multiple files (SPOSetT and derived classes) got added and was thinking of how to make the change mergeable without the need of porting every SPOSet derived classes before merging. By making SPOSet a base of SPOSetT probably allows us to keep ported and not yet ported classes to co-exist without any duplication. All the high level classes still just see SPOSet.

ye-luo avatar Aug 24 '23 18:08 ye-luo

Test this please

williamfgc avatar Oct 26 '23 01:10 williamfgc

Test this please

williamfgc avatar Oct 26 '23 01:10 williamfgc

Test this please

williamfgc avatar Oct 26 '23 02:10 williamfgc

Is there any chance of breaking this into separate PR's that are more reviewable? (That is, are the changes to individual classes separable, or they all linked?) Is there a change in build time as a result of the extra templates?

markdewing avatar Oct 26 '23 14:10 markdewing

Is there any chance of breaking this into separate PR's that are more reviewable? (That is, are the changes to individual classes separable, or they all linked?)

Changing SPOSet to SPOSetT<T> meant modify all derived classes and builders. Beyond that we tried to use macros to contain this change to a minimum (but still large :crying_cat_face:) number of files.

quantumsteve avatar Oct 26 '23 15:10 quantumsteve

The changes in the Numerics and Particle directories clearly can't depend on SPOSet. And a lot of the changes in QMCDrivers are just minor changes like include files.

markdewing avatar Oct 26 '23 15:10 markdewing

It looks like the "new" files have been run through clang-format that is more restrictive than the default one, or been modified in an editor that does some formatting. What I've noticed so far:

  • Copyright header gets wrapped to 80 columns
  • Comments in general get wrapped to 80 columns
  • Space gets added between comment and text

I think it would be better if the copyright headers had a consistent format in every file.

Unfortunately, these aren't new files - they're renamed and changed versions of existing files. The other changes wouldn't be a problem in a small PR, but for one this large, every change is making it harder for git to detect moved files (It uses some threshold for the number of pairs of changes to check, or a percentage of file similarity).

markdewing avatar Oct 26 '23 17:10 markdewing

cmake -G Ninja -DQMC_MPI=OFF ../

develop

$ time ninja
[646/646] Linking CXX executable bin/ppconvert

real	1m55.924s
user	64m36.053s
sys	5m4.636s

ref-add-SPOSetT

$ time ninja
[652/652] Linking CXX executable bin/ppconvert

real	2m5.956s
user	70m52.674s
sys	5m15.397s

quantumsteve avatar Oct 26 '23 17:10 quantumsteve

@markdewing thanks for the catch. The latest commit should have addressed clang-format related issues. I will double-check. I used clang-format-16. Thanks!

williamfgc avatar Oct 26 '23 23:10 williamfgc

Test this please

williamfgc avatar Oct 27 '23 01:10 williamfgc

The changes in the Numerics and Particle directories clearly can't depend on SPOSet. And a lot of the changes in QMCDrivers are just minor changes like include files.

A lot of these changes are at the SPOSet<T> consumer level (e.g. builders) and likely 'cause GitHub Actions CI was breaking as at some point we ran it in every commit the current branch has. It's probably the smallest PR we can get with such refactoring.

williamfgc avatar Oct 27 '23 13:10 williamfgc

Test this please

williamfgc avatar Oct 27 '23 14:10 williamfgc

What's going on with the MCWalkerConfiguration further bleeding into the batched drivers and mw is undesirable. I've gone out of my way to keep this from happening. We want to drop that class when the legacy drivers and estimators go away. It should never be a source for type information for non legacy code.

The explicit definition of MCPWalker from Walker instead of pulling in Walker_t from elsewhere was intended to decouple classes that were tightly bound via Walker type aliases. Then they can be built and tested without creating cascading rebuilds. There seems to be a clash now between QMCTraits and ParticleTraits which is making a bit of a mess that is just getting fixed by adding includes that shouldn't really be included.

Similar issues seem behind the many forward declarations that were successfully decoupling headers that have been reverted by this PR.

I'm not sure if we should put this in and then work to refactor away the new includes and MCPWalker type aliases or just stage this in such a way we don't add all these design regressions.

PDoakORNL avatar Oct 27 '23 23:10 PDoakORNL

Test this please

williamfgc avatar Nov 08 '23 01:11 williamfgc

full/mixed precision. This is what I would like to leave aside. by looking at all the changes. I'm surprised to see ParticleSet being template as ValueType.

This was done to satisfy the requirement to not specialize on a single function basis, but rather class level since ParticleSet is directly consumed by SPOSet and derived classes. If we start specializing it will look like lots of trailing #ifdef for handling mixed precision, and templates for real/complex. We think this is a good-enough compromise to move forward and not complicate current tasks given timelines. We didn't want to make complicated design changes (other than what we were asked) as we are relying on the current CI and many parts are not necessarily tested.

Merging this PR as it is will be a challenge for post-ECP develops as there is very limited understanding of this PR and some modifications are not necessary and needs undo. So I still would like to explore a way to make piece wise changing possible.

The PR itself is about undoing things, and there is plenty of pending things (e.g. we are still depending of #ifdef for template class definitions). I'm afraid any other attempt to make this piece wise won't add value and we we will end up with something very similar to what we have (or worse, combining current non-generic #ifdef patterns with template generic patterns). This is a good opportunity to identify pieces towards major re-architecture redesign, refactoring and add more testing. This PR can be summarized as SPOSet -> SPOSet <T> + the most straight-forward/shortest-path way to satisfy CI requirements. My concern is on trying to spend time to maximize impact given that resources are limited.

williamfgc avatar Nov 08 '23 12:11 williamfgc

Test this please

williamfgc avatar Nov 09 '23 01:11 williamfgc

Test this please

williamfgc avatar Nov 09 '23 02:11 williamfgc

@prckent @ye-luo @quantumsteve this is a new checkpoint that CI tests are passing and conflicts are resolved.

williamfgc avatar Nov 09 '23 14:11 williamfgc

Dead PR?

jtkrogel avatar Nov 13 '25 20:11 jtkrogel

As discussed with William and in the Fut. Gen. Comp. Sys. paper, I proposed using std::variant (or a wrapped std::variant, along with some additional functionality) to retrofit this hierarchy in a way that allows for polymorphic use at runtime. However, I'm unsure if this remains a priority.

correaa avatar Nov 13 '25 21:11 correaa

We learned from this PR that a top-down approach isn't friendly to developers. The idea has been picked up using bottom-up approach instead. See https://github.com/QMCPACK/qmcpack/pulls?q=is%3Apr+sposet+template+is%3Aclosed This PR can be closed.

ye-luo avatar Nov 14 '25 00:11 ye-luo

Yes, this can be closed (feels long ago). Some takeaways:

  1. This PR identifies the pieces that need to be modified in a major refactoring effort
  2. The work is what could be delivered within the timeframe of the ECP milestone. Although we had a very limited schedule and a big cliff, we delivered something functional
  3. Predates current LLMs: could be a good real-world application refactoring target using SOTA LLMs.
  4. There is room for improvement for CI and other tests since all tests passed at that time, but the PR was still not trusted.

Thanks!

williamfgc avatar Nov 14 '25 11:11 williamfgc