enrico icon indicating copy to clipboard operation
enrico copied to clipboard

Use gsl::index consistently

Open aprilnovak opened this issue 6 years ago • 4 comments

We should be using gsl::index consistently throughout enrico.

@RonRahaman

aprilnovak avatar Aug 07 '19 02:08 aprilnovak

In the Nek drivers, I see some cases where gsl::index might not be the best choice. In some loops, we're ultimately using the loop counter to index a Fortran array. If the loop counter were a size_t or ptrdiff_t, then we'd have a possible narrowing conversion in the Nek API calls, which use integer indices. I've tried to clean that up in #60

RonRahaman avatar Aug 07 '19 03:08 RonRahaman

Here are some suggestions based on the Core Guidelines and a look at what's happening in ENRICO:

Flagging Signedness Mismatches of Container Indices

The Core Guidelines recommend using signed indices. Likewise. the Core Guidelines assume that gsl::index is implemented as a signed type, ptrdiff_t.

The built-in array uses signed subscripts. The standard-library containers use unsigned subscripts. Thus, no perfect and fully compatible solution is possible (unless and until the standard-library containers change to use signed subscripts someday in the future). Given the known problems with unsigned and signed/unsigned mixtures, better stick to (signed) integers of a sufficient size, which is guaranteed by gsl::index.

Our vendor code represents a mixture of unsigned and signed data types:

  • Our current version of GSL Lite implements gsl::index as size_t, which is unsigned.
  • The OpenMC C API expects indices to be int32_t (which are signed),
  • The Nek5000 C API will also expect indices to be int32_t.

The core guidelines recommend not flagging signedness mismatching due to indices. This is a good solution for us.

(To avoid noise) Do not flag on a mixed signed/unsigned comparison where one of the arguments is sizeof or a call to container .size() and the other is ptrdiff_t.

Narrowing Conversions for Container Indices

Declaring indices as gsl::index would require some narrowing conversions when we use vendor code. Recall that:

  • The Core Guidlines assumes that gsl::index is a ptrdiff_t
  • Our current version of GSL Lite implements gsl::index as size_t
  • The OpenMC and Nek5000 C APIs expect indices to be int32_t, which are smaller than ptrdiff_t and size_t.

In light of this, we have a couple of options. Not sure which is best, but I'm slightly in favor of the second one:

  • Always declare indices as gsl::index. When necssary for vendor APIs, use gsl::narrow to cast as int32_t.
  • Declare indices on a case-by-case basis (either as gsl::index or int32_t) to avoid casting.

Variable Types for Container Sizes and Other Sizes

Again, we've got some variability in our code:

  • The size_type for most standard library containers ends up being size_t
  • xtensor expects shapes to be size_t
  • Given that OpenMC uses xtensors and standard library containers, sizes are usually size_t
  • The sizes of Nek5000 arrays are Fortran integer, which is usually a 32-bit. This is smaller than size_t.
  • In MPI, comm size, etc., are returned as int, which are smaller than size_t

However, I don't imagine this will result in narrowing conversions. The Nek array sizes are effectively read-only, since Nek uses static arrays. And when we're setting up the MPI comm split, we don't really use any of the other container sizes.

So I think we should:

  • Declare the Nek sizes as int32_t
  • Declare MPI sizes as int.
  • Declare everything else as size_t.

RonRahaman avatar Aug 08 '19 16:08 RonRahaman

@paulromano and I had some Slack discussions, and I wanted to summarize here. What we settled on was:

  1. Always declare indices as gsl::index
  2. Use gsl::narrow when passing indices as arguments to Nek and OpenMC API calls
  3. Always declare sizes as std::size_t

With respect to the Nek code, this is a course-correction on my part. Previously ( https://github.com/enrico-dev/enrico/pull/60 ), I had declared indices and sizes as int32_t to match the size of the default size of a Fortran integer datatype. In that PR, I had introduced code like this:

  for (int32_t i = 0; i < nelt_; ++i) {
    local_elem_temperatures[i] = this->temperature_at(i + 1);
  }

where nelt_ is also declared as int32_t; and where the temperature_at member ultimately calls a Fortran API routine, nek_get_local_elem_temperature, that expects a Fortran integer as an index. However, this code would be difficult to maintain if we changed nelt_ and the other bounds to larger datatypes. We'd have to change all the declarations of indices.

Instead, I plan to introduce something like:

for (gsl::index i = 0; i < nelt_; ++i) {
    local_elem_temperatures[i] = this->temperature_at(gsl::narrow<nek_int_t>(i + 1));
  }

where nek_int_t is declared as a configurable macro, to match whatever size integer that Nek is compiled with. This can be a value that we set or discover in CMake.

using nek_int_t = FORTRAN_DEFAULT_INT_KIND;

RonRahaman avatar Aug 09 '19 14:08 RonRahaman

Thanks for the clarifications! This sounds good to me.

aprilnovak avatar Aug 11 '19 22:08 aprilnovak