flexible ordering
Background: for solid harmonics, Psi4 uses gaussian ordering rather than standard=cca ordering like most other open-source QC packages.
- for integration and testing, it's easy to build Psi4 to use
standard: https://github.com/psi4/psi4/blob/master/CMakeLists.txt#L193 as a cmake-configure-time choice - gau2grid can emit cca/gaussian at runtime https://github.com/dgasmith/gau2grid/blob/master/docs/source/c_api.rst?plain=1#L20-L25
- Libint2 (if ever relevant gauxc-side) can also emit standard/gaussian at runtime as of v2.8
- From email discussions it sounds like there's two areas where ordering is relevant
- XC itself, which may be "simply" passthrough from gau2grid
- sn-K where there's hand-tuned code presently fixed at standard
Again, this is a medium-term issue -- nothing that's blocking development.
@loriab I can see a few ways we could fix this:
- Add a shell ordering to
Shell- This would allow for the most flexibility (mixing ordering if one is particularly masochistic), but would likely be detrimental to GPU performance as it's just one more thing to keep track of / take up (precious) register space - Add shell ordering to
BasisSet- This is kind of nice as it localizes the tracking to a single object, but it would makeBasisSetno-longer ABI comparible withstd::vector<Shell>, which is convenient for a number of reasons (e.g. sending to GPU memory). This could be overcome, but it would likely take quite a bit of debugging, as I would need to individually check the NV/AMD backends (and I no longer have access to AMD HW) - Add shell ordering as an optional (defaulted) argument to
eval_exc_vxcand friends. This is the easiest solution, but also the most error prone - however it also comes with the benefit of only enabling shell ordering trickery for supported algorithms.
Leaning toward (3), let me know if you think that would be appropriate for the PSi4 integration.
Honestly, the most likely solution for e.g. sn-K is I'll swap the ordering to CCA in GPU memory and swap back the result before return. That should essentially be free relative to the work (esp for larger system). Depending on how benchmarks go, we can revisit native Gaussian-ordering support in the sn-K kernels (particularly once we refactor the code generator) in a separate issue if needed.
Sorry for the delay -- I think I'd better talk this over with @davpoolechem . It sounds like this might be all in hand.
Happy to report on this!
As of now, any concerns related to mismatched integral ordering are handled directly within the current Psi4-GauXC interface itself. In cases where Psi4 is compiled with psi4_SHGAUSS_ORDERING=gaussian (which is most of them, I presume), the interface constructs a Gaussian-to-CCA permutation matrix that is used to perform all necessary transformations and back-transformations of the density and exchange matrices over the course of the K build process.