Implement P3222 and P3050
Implement P3222R0 ("Add transposed special cases for P2642 layouts"). The corresponding paper PR is https://github.com/ORNL/cpp-proposals-pub/issues/448. Add tests for previously supported cases and the new cases.
Implement P3050R2 ("Optimize linalg::conjugated for noncomplex value types") and add tests. That is, fix conjugated for non-arithmetic, non-(custom complex) types. A type T is "custom complex" if conj(T) is ADL-findable.
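For readers unfamiliar with the "ADL-findable conj" test, here is a minimal C++17 sketch of one way to detect it. This is an illustration, not the code in this PR: the names `conj_sketch`, `has_adl_conj_v`, `conj_or_identity`, and the `demo` types are all hypothetical. It relies on a deleted catch-all overload so that only a `conj` found by argument-dependent lookup can be selected.

```cpp
#include <complex>
#include <type_traits>
#include <utility>

namespace conj_sketch {
namespace detail {

// Deleted catch-all ("poison pill") found by ordinary unqualified lookup.
// A usable conj must come from ADL and be a better match than this one.
template <class T>
T conj(const T&) = delete;

// Primary template: no usable conj was found.
template <class T, class = void>
struct has_adl_conj : std::false_type {};

// Chosen only if the unqualified call conj(t) resolves to a non-deleted function.
template <class T>
struct has_adl_conj<T, std::void_t<decltype(conj(std::declval<const T&>()))>>
    : std::true_type {};

// Hypothetical helper: conjugate when an ADL conj exists, otherwise return x unchanged.
template <class T>
auto conj_or_identity(const T& x) {
  if constexpr (has_adl_conj<T>::value) {
    return conj(x);
  } else {
    return x;
  }
}

}  // namespace detail

template <class T>
inline constexpr bool has_adl_conj_v = detail::has_adl_conj<T>::value;

using detail::conj_or_identity;

}  // namespace conj_sketch

namespace demo {

// Hypothetical user-defined complex type with an ADL-findable conj.
struct Cplx { double re; double im; };
inline Cplx conj(const Cplx& z) { return Cplx{z.re, -z.im}; }

// Hypothetical non-arithmetic, noncomplex number type (no conj at all).
struct Fixed { int raw; };

}  // namespace demo

static_assert(!conj_sketch::has_adl_conj_v<double>);
static_assert(conj_sketch::has_adl_conj_v<std::complex<double>>);
static_assert(conj_sketch::has_adl_conj_v<demo::Cplx>);
static_assert(!conj_sketch::has_adl_conj_v<demo::Fixed>);
```

In this sketch, `demo::Fixed` is the kind of type the paper targets: it is neither arithmetic nor complex, so conjugation can be skipped for it.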
Each of these papers has a separate CMake option, which is documented in CMakeLists.txt. Both options default to OFF (not implemented) for now.
Fixes https://github.com/kokkos/stdBLAS/issues/267 .
Hi @dalg24 ! Thanks for your review!
I would prefer if you did not mix in the refactor/fixes with the implementation of the new feature.
It's actually impossible to pass the repository's automated premerge tests without the fixes, as the build fails.
Each commit is atomic (it builds and passes tests locally) and can be examined separately.
Can't you open another PR with the fixes only?
There are lots of fixes. They are separated into different commits.
The current state of the repo is broken; it fails to build.
@dalg24 Per your request, I've created PR #269 that only fixes the build and Standard conformance issues, without adding new features.
This PR is rebased atop PR #269, because (as mentioned before) this repository's build is currently broken, so it's impossible to pass check-in tests without the build fixes. Please merge PR #269 first.
@crtrott @dalg24 This PR is ready to review. Changes as of today:
- Dependency PR #269 has merged.
- Added separate CMake options to enable each paper's changes.
- CMake options default to OFF (paper's changes not enabled).
Other than a small comment on the scope of the #ifdef and a fix request in the P3050 spec (add a return type to the deleted conj), this looks fine to me.
@crtrott wrote:
This looks good. I am not 100% sure I like the approach for the conjugated thing but its anyway protected so I am good.
Just to clarify: P3050 is an optimization that simplifies the implementation. It needs some way to tell if a number type is not complex. It can be conservative about that test, because it's only an optimization. Implementations can always optimize internally by specializing on known conjugated_accessor patterns.
This is different than P3371R1. As we discussed yesterday, P3371R1 changes Hermitian rank-1 and rank-k updates to constrain Scalar not to have ADL-findable conj, even if the user's Scalar is a "noncomplex" number type. We talked yesterday about changing this in P3371R2 from a Constraint (the user's code fails to build) to a Precondition that imag-if-needed(alpha) equals Scalar{}.
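To make the Constraint versus Precondition distinction concrete, here is a rough C++17 sketch of what a run-time check of "imag-if-needed(alpha) equals Scalar{}" could look like. It is an assumption-laden illustration, not wording from P3371 and not code from this PR: `imag_sketch`, `imag_if_needed`, and `alpha_is_real` are hypothetical names, and the helper only approximates the exposition-only function by returning `imag(x)` when `imag` is ADL-findable and `T{}` otherwise.

```cpp
#include <complex>
#include <type_traits>
#include <utility>

namespace imag_sketch {
namespace detail {

// Deleted catch-all so that only an ADL-found imag can be selected.
template <class T>
T imag(const T&) = delete;

template <class T, class = void>
struct has_adl_imag : std::false_type {};

template <class T>
struct has_adl_imag<T, std::void_t<decltype(imag(std::declval<const T&>()))>>
    : std::true_type {};

// Hypothetical stand-in for imag-if-needed: imag(x) if ADL-findable, else T{}.
template <class T>
auto imag_if_needed(const T& x) {
  if constexpr (has_adl_imag<T>::value) {
    return imag(x);
  } else {
    return T{};
  }
}

}  // namespace detail

// Hypothetical precondition check. Any Scalar type compiles (no Constraint);
// the function reports whether the value has a zero imaginary part.
template <class Scalar>
bool alpha_is_real(const Scalar& alpha) {
  return detail::imag_if_needed(alpha) == Scalar{};
}

}  // namespace imag_sketch
```

With a Precondition of this shape, a call with a complex `alpha` still builds. It is only invalid when the imaginary part is nonzero, which a debug build could check with an assertion.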