Use gsl::index consistently
We should be using gsl::index consistently throughout enrico.
@RonRahaman
In the Nek drivers, I see some cases where gsl::index might not be the best choice. In some loops, we're ultimately using the loop counter to index a Fortran array. If the loop counter were a size_t or ptrdiff_t, then we'd have a possible narrowing conversion in the Nek API calls, which use integer indices. I've tried to clean that up in #60
Here are some suggestions based on the Core Guidelines and a look at what's happening in ENRICO:
Flagging Signedness Mismatches of Container Indices
The Core Guidelines recommend using signed indices. Likewise. the Core Guidelines assume that gsl::index is implemented as a signed type, ptrdiff_t.
The built-in array uses signed subscripts. The standard-library containers use unsigned subscripts. Thus, no perfect and fully compatible solution is possible (unless and until the standard-library containers change to use signed subscripts someday in the future). Given the known problems with unsigned and signed/unsigned mixtures, better stick to (signed) integers of a sufficient size, which is guaranteed by gsl::index.
Our vendor code represents a mixture of unsigned and signed data types:
- Our current version of GSL Lite implements
gsl::indexassize_t, which is unsigned. - The OpenMC C API expects indices to be
int32_t(which are signed), - The Nek5000 C API will also expect indices to be
int32_t.
The core guidelines recommend not flagging signedness mismatching due to indices. This is a good solution for us.
(To avoid noise) Do not flag on a mixed signed/unsigned comparison where one of the arguments is sizeof or a call to container .size() and the other is ptrdiff_t.
Narrowing Conversions for Container Indices
Declaring indices as gsl::index would require some narrowing conversions when we use vendor code. Recall that:
- The Core Guidlines assumes that
gsl::indexis aptrdiff_t - Our current version of GSL Lite implements
gsl::indexassize_t - The OpenMC and Nek5000 C APIs expect indices to be
int32_t, which are smaller thanptrdiff_tandsize_t.
In light of this, we have a couple of options. Not sure which is best, but I'm slightly in favor of the second one:
- Always declare indices as
gsl::index. When necssary for vendor APIs, usegsl::narrowto cast asint32_t. - Declare indices on a case-by-case basis (either as
gsl::indexorint32_t) to avoid casting.
Variable Types for Container Sizes and Other Sizes
Again, we've got some variability in our code:
- The
size_typefor most standard library containers ends up beingsize_t - xtensor expects shapes to be
size_t - Given that OpenMC uses xtensors and standard library containers, sizes are usually
size_t - The sizes of Nek5000 arrays are Fortran
integer, which is usually a 32-bit. This is smaller thansize_t. - In MPI, comm size, etc., are returned as
int, which are smaller thansize_t
However, I don't imagine this will result in narrowing conversions. The Nek array sizes are effectively read-only, since Nek uses static arrays. And when we're setting up the MPI comm split, we don't really use any of the other container sizes.
So I think we should:
- Declare the Nek sizes as
int32_t - Declare MPI sizes as
int. - Declare everything else as
size_t.
@paulromano and I had some Slack discussions, and I wanted to summarize here. What we settled on was:
- Always declare indices as
gsl::index - Use
gsl::narrowwhen passing indices as arguments to Nek and OpenMC API calls - Always declare sizes as
std::size_t
With respect to the Nek code, this is a course-correction on my part. Previously ( https://github.com/enrico-dev/enrico/pull/60 ), I had declared indices and sizes as int32_t to match the size of the default size of a Fortran integer datatype. In that PR, I had introduced code like this:
for (int32_t i = 0; i < nelt_; ++i) {
local_elem_temperatures[i] = this->temperature_at(i + 1);
}
where nelt_ is also declared as int32_t; and where the temperature_at member ultimately calls a Fortran API routine, nek_get_local_elem_temperature, that expects a Fortran integer as an index. However, this code would be difficult to maintain if we changed nelt_ and the other bounds to larger datatypes. We'd have to change all the declarations of indices.
Instead, I plan to introduce something like:
for (gsl::index i = 0; i < nelt_; ++i) {
local_elem_temperatures[i] = this->temperature_at(gsl::narrow<nek_int_t>(i + 1));
}
where nek_int_t is declared as a configurable macro, to match whatever size integer that Nek is compiled with. This can be a value that we set or discover in CMake.
using nek_int_t = FORTRAN_DEFAULT_INT_KIND;
Thanks for the clarifications! This sounds good to me.