Data validation and invariants
We had some discussions about specifying and validating data structure invariants, so I'll list them here:
Matrix formats
- All formats: values contain no NaN or Inf values
- All sparse formats: column indices are within [0, num_cols) or [0, num_col_blocks) (FBCSR)
- Debatable: All sparse formats: there are no duplicate column indices within a row (duplicates might cause issues with more advanced sparsity-manipulating kernels). The exception is Ell, where duplicates are allowed as long as all but one of the duplicated values are zero
- Coo: The entries are sorted by row index (and the row indices are within [0, num_rows))
- Csr: The row pointers are non-descending, going from 0 to nnz
- Sellp: Slice offsets are non-descending, slice lengths are consistent with slice offsets.
- Permutation: Each index occurs exactly once in the permutation.
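Several of the format invariants above reduce to simple scans over the raw arrays. The following is a minimal sketch of what such checks could look like. All function names here are hypothetical and not part of Ginkgo's API, and the value check assumes real (non-complex) value types.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if no value is NaN or Inf (real types only).
template <typename ValueType>
bool is_finite(const ValueType* values, std::size_t nnz)
{
    for (std::size_t i = 0; i < nnz; ++i) {
        if (!std::isfinite(values[i])) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: true if every index lies in [0, bound).
template <typename IndexType>
bool is_within_bounds(const IndexType* idxs, std::size_t n, IndexType bound)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (idxs[i] < 0 || idxs[i] >= bound) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: true if the num_rows + 1 row pointers are
// non-descending and go from 0 to nnz.
template <typename IndexType>
bool is_valid_row_ptrs(const IndexType* row_ptrs, std::size_t num_rows,
                       IndexType nnz)
{
    if (row_ptrs[0] != 0 || row_ptrs[num_rows] != nnz) {
        return false;
    }
    for (std::size_t i = 0; i < num_rows; ++i) {
        if (row_ptrs[i] > row_ptrs[i + 1]) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: true if each index in [0, n) occurs exactly once.
template <typename IndexType>
bool is_permutation(const IndexType* perm, std::size_t n)
{
    std::vector<bool> seen(n, false);
    for (std::size_t i = 0; i < n; ++i) {
        const auto p = perm[i];
        if (p < 0 || static_cast<std::size_t>(p) >= n || seen[p]) {
            return false;
        }
        seen[p] = true;
    }
    return true;
}
```

These only cover the host-side logic; checks on device-resident data would need either a copy to the host or dedicated kernels.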
Solvers
- All solvers: The system matrix needs to be valid
- All iterative solvers: The preconditioner (generated or provided factory) needs to be valid
- Cg: The system matrix needs to be symmetric
- Upper/LowerTrs: The system matrix needs to be upper/lower-triangular with non-zero diagonal entries
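As an illustration of the triangular-solver requirement, here is a rough sketch of a check for the lower-triangular case on Csr-style arrays. The function name is hypothetical, and the sketch makes one design choice that is an assumption rather than something stated above: an explicitly stored entry above the diagonal counts as a violation even if its value is zero.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: true if no entry is stored above the diagonal and
// every row stores a non-zero diagonal entry.
template <typename ValueType, typename IndexType>
bool is_lower_triangular_with_diag(const IndexType* row_ptrs,
                                   const IndexType* col_idxs,
                                   const ValueType* values,
                                   IndexType num_rows)
{
    for (IndexType row = 0; row < num_rows; ++row) {
        bool has_nonzero_diag = false;
        for (IndexType nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            if (col_idxs[nz] > row) {
                return false;  // stored entry above the diagonal
            }
            if (col_idxs[nz] == row && values[nz] != ValueType{}) {
                has_nonzero_diag = true;
            }
        }
        if (!has_nonzero_diag) {
            return false;  // missing or zero diagonal entry
        }
    }
    return true;
}
```

The upper-triangular variant would flip the comparison to `col_idxs[nz] < row`.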
Factorizations
- The output must be valid Csr matrices with non-zero diagonals
Preconditioners
- Jacobi: The block pointers need to be ascending with gaps smaller than max_block_size
- Jacobi: The blocks must be invertible/inverted blocks must not contain NaN or Inf
- Isai: The output must be a valid Csr matrix (not much else we can validate here)
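The Jacobi block-pointer condition could be checked along these lines. This is only a sketch with a hypothetical function name. It reads "gaps smaller than max_block_size" as "at most max_block_size", and it additionally assumes that the pointers start at 0 and end at the number of rows; both are assumptions, not something the list above states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if the num_blocks + 1 block pointers are strictly
// ascending from 0 to num_rows (assumed) and no block is larger than
// max_block_size (assumed to be an inclusive bound).
template <typename IndexType>
bool is_valid_block_ptrs(const IndexType* block_ptrs, std::size_t num_blocks,
                         IndexType num_rows, IndexType max_block_size)
{
    if (block_ptrs[0] != 0 || block_ptrs[num_blocks] != num_rows) {
        return false;
    }
    for (std::size_t i = 0; i < num_blocks; ++i) {
        const auto block_size = block_ptrs[i + 1] - block_ptrs[i];
        if (block_size <= 0 || block_size > max_block_size) {
            return false;
        }
    }
    return true;
}
```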
After discussing this with @upsj, here is how I would like to approach this issue:
- A new set of files validation_helpers.{cpp,hpp} inside core/components will be created.
- These will contain several validation functions, e.g. is_symmetric, is_row_ordered etc., which take matrix data and/or indices via array pointers.
- Matrices get a new member function validate_data(), which uses the relevant validation helpers to assert correctness and throws an error otherwise.
- The latter can be switched on/off for debug builds.
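To make the proposal a bit more concrete, here is a minimal sketch of how two of the named helpers, is_row_ordered and is_symmetric, might look when they operate on raw array pointers. The signatures are assumptions made for illustration. The symmetry check assumes a square Csr-style matrix with in-range column indices and no duplicate entries, and it compares values exactly, which may be too strict for floating-point data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: true if the Coo row indices are sorted in non-descending order.
template <typename IndexType>
bool is_row_ordered(const IndexType* row_idxs, std::size_t nnz)
{
    return std::is_sorted(row_idxs, row_idxs + nnz);
}

// Sketch: true if for every stored entry (i, j) there is a stored entry
// (j, i) with the same value. Runs a linear search per entry, so it works on
// unsorted rows but is not tuned for performance.
template <typename ValueType, typename IndexType>
bool is_symmetric(const IndexType* row_ptrs, const IndexType* col_idxs,
                  const ValueType* values, IndexType num_rows)
{
    // Looks up the stored value at (row, col); returns false if none exists.
    auto find = [&](IndexType row, IndexType col, ValueType& out) {
        for (IndexType nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            if (col_idxs[nz] == col) {
                out = values[nz];
                return true;
            }
        }
        return false;
    };
    for (IndexType row = 0; row < num_rows; ++row) {
        for (IndexType nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            ValueType transposed{};
            if (!find(col_idxs[nz], row, transposed) ||
                transposed != values[nz]) {
                return false;
            }
        }
    }
    return true;
}
```

A validate_data() member could then combine such helpers and report the first one that fails.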
I believe we didn't talk about the actual error reporting so far, I could imagine multiple approaches:
1. validate_data() doesn't return anything and throws an exception with information on which part of the data is invalid
2. validate_data() doesn't return anything and causes an assertion to fail, which is akin to calling std::abort()
3. validate_data() returns something like a tagged union of either a success tag or a failure tag with additional information.
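The difference between the exception-based and the tagged-union approach can be sketched on a single check. Everything below is hypothetical: the type names validation_error, success, failure and validation_result as well as both function names are made up for illustration, and std::variant (C++17) is just one possible way to model the tagged union.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Exception-based reporting: the exception carries the failure description.
struct validation_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

void validate_row_ptrs_or_throw(const int* row_ptrs, std::size_t num_rows)
{
    for (std::size_t i = 0; i < num_rows; ++i) {
        if (row_ptrs[i] > row_ptrs[i + 1]) {
            throw validation_error("row pointers descend at row " +
                                   std::to_string(i));
        }
    }
}

// Tagged-union reporting: the caller inspects the returned alternative.
struct success {};
struct failure {
    std::string message;
};
using validation_result = std::variant<success, failure>;

validation_result validate_row_ptrs(const int* row_ptrs, std::size_t num_rows)
{
    for (std::size_t i = 0; i < num_rows; ++i) {
        if (row_ptrs[i] > row_ptrs[i + 1]) {
            return failure{"row pointers descend at row " + std::to_string(i)};
        }
    }
    return success{};
}
```

With the exception variant, callers that don't care can ignore the check result entirely and failures propagate on their own; with the variant type, every caller has to inspect the result, but no exception handling is involved.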
I would favor 1 or 3, since 2 is very limited in what kind of information can be provided. 1 is probably the easiest to implement, since it can rely on nested exceptions for nesting validation failures later on. The choice between 1 and 3 is kind of like the distinction between exception-type (C++) and optional-type (Rust) error handling.
I think we don't necessarily need 4. (the debug-build switch), since this would otherwise require users to rebuild all of Ginkgo in Debug just to be able to figure out which part of their code is failing. Unless of course we call validate_data() in all apply implementations, then I would agree to disable this in Debug, but still allow users to call it manually in Release.
Do you require all column indexes within each matrix row to be sorted/ascending or can they be in arbitrary order? I don't see this in the list above. IIRC, Hypre's AMG behaves differently for each ordering (and they require the diagonal entry to be stored first in the row).
For our algorithms that require the column indices to be sorted, there is usually a factory parameter called skip_sorting that can be used to signal that the input matrix is already sorted. Otherwise, we sort the matrix correctly. Of course, simple things like SpMV still work with unsorted indices, but our AMGx also requires sorting. Also, when you fill a matrix from matrix_data with the read function, you have to make sure that the input data is also sorted correctly.
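For reference, checking and establishing the "sorted column indices within each row" property on Csr-style arrays could look roughly like this. It is a standalone sketch with illustrative function names operating on raw pointers, not a description of how Ginkgo implements its sorting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch: true if the column indices of every row are non-descending.
template <typename IndexType>
bool is_sorted_by_column_index(const IndexType* row_ptrs,
                               const IndexType* col_idxs,
                               std::size_t num_rows)
{
    for (std::size_t row = 0; row < num_rows; ++row) {
        if (!std::is_sorted(col_idxs + row_ptrs[row],
                            col_idxs + row_ptrs[row + 1])) {
            return false;
        }
    }
    return true;
}

// Sketch: sorts the entries of each row by column index, keeping every value
// attached to its column index.
template <typename ValueType, typename IndexType>
void sort_by_column_index(const IndexType* row_ptrs, IndexType* col_idxs,
                          ValueType* values, std::size_t num_rows)
{
    std::vector<std::pair<IndexType, ValueType>> buffer;
    for (std::size_t row = 0; row < num_rows; ++row) {
        const auto begin = row_ptrs[row];
        const auto end = row_ptrs[row + 1];
        buffer.clear();
        for (auto nz = begin; nz < end; ++nz) {
            buffer.emplace_back(col_idxs[nz], values[nz]);
        }
        std::sort(buffer.begin(), buffer.end(),
                  [](const auto& a, const auto& b) { return a.first < b.first; });
        for (auto nz = begin; nz < end; ++nz) {
            col_idxs[nz] = buffer[nz - begin].first;
            values[nz] = buffer[nz - begin].second;
        }
    }
}
```

A caller that already knows its data is sorted would skip both steps, which is the situation a parameter like skip_sorting is meant to signal.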