libcudacxx
libcudacxx copied to clipboard
Adding `mdspan` reference implementation
- Pulls the mdspan reference implementation from branch "stable" of the kokkos repo, https://github.com/kokkos/mdspan, up to PR 172.
- Uglified internal identifiers and made some naming convention updates.
Generally speaking I am extremely concerned about the implementation complexity here.
I only looked at extends
in detail, because I did implement it previously.
Especially compile times are going to be incredibly difficult here.
Also what I am generally wondering is why we are putting this in namespace stdex. It was merged into the standard after all
Hi @miscco ! We talked a bit about extents
offline. An excellent first step would be to run the benchmarks both with the current version, and with the current version with your extents
implementation substituted in. It would also be interesting to see if your implementation performs well at lower optimization levels. (This would cover the important use case of debug builds not being horribly slow, which is important for game development.)
Also what I am generally wondering is why we are putting this in namespace stdex. It was merged into the standard after all
This is a user library (not a vendor's implementation of the Standard Library). Thus, it's not allowed to put things in namespace std
.
mdspan was merged into C++23. The reference mdspan implementation works with C++14, 17, and 20 as well. For those back-porting use cases, it's important that we don't break applications' builds, in case those applications adopt a later compiler version that provides an mdspan implementation. Thus, we don't want to collide with std::mdspan
.
@miscco I wonder if some of the complexity of the reference mdspan implementation comes from an earlier requirement to back-port to C++11 (with its much more restrictive requirements for constexpr
functions). That requirement has since been dropped. The reference mdspan implementation currently requires at least C++14. I'll file an issue in the reference mdspan implementation to investigate this.
I think the comment about extents could be simpler are fair. Part of the problem was trying to work around a number of issues with empty base optimization in MSVC and Intel compiler. MSVC did get much better though in the last couple years. That said: looking at some assembly it does look like the reference implementation compiles cleaner:
https://godbolt.org/z/Gvsfrn31o
Thinking more about this: there was a lot of work on extents, to essentially make the .extent(i) function optimize really well. Without the complication I think you end up with something which goes through some loop structures to figure out which element to access. I.e. you kinda rebuild the mapping for index to "which element from the dynamic extents array do I need to access" every time or so? But am not 100% sure.
Yeah I believe my _Dynamic_index
function is the issue, I will try to memoize that in a second static array, so there is no runtime cost
Note: with -O3 and GCC 11 it does compile into the same instruction sequence. I just told Mark, I am totally open to entertaining a whole sale replacement on our side. I think we should underpin this with some data down the line, for example we are working on using mdspan in Kokkos which then would make a whole lot of applications use it. If we collect data from that side regarding compile time/binary size/performance and from your applications we should have a good basis to make decisions.
Can we defer any significant refactoring discussion to future work/PR? I don't want to delay @youyu3's PR any more than necessary.
@jrhemstad Don't worry! These investigations should not affect the current release. Discussions continue on the reference mdspan implementation issue https://github.com/kokkos/mdspan/issues/190 .
Hey @youyu3 sorry for the last minute change, but based on some internal conversation, I think we can drop the experimental
namespace for mdspan
.
Hey @youyu3 sorry for the last minute change, but based on some internal conversation, I think we can drop the
experimental
namespace formdspan
.
Okay. Will do. Then I suppose that we should move the files out of the experimental
sub-directory as well?
It's far too much for me to review in detail, but I'm really impressed and happy with all the new unit tests! Thank you also for addressing those code comments we talked about earlier today!
I pulled commits from the span
PR. There are further changes in those files I believe.
Commit history appears to be broken. I'll try to resolve this locally and possibly issue a correction, but AFAICT there seems to be a replay of several commits.