ginkgo Data validation and invariants

We had some discussions about specifying and validating data structure invariants, so I'll list them here:

Matrix formats

All formats: values contain no NaN or Inf values
All sparse formats: column indices are within [0, num_cols) or [0, num_col_blocks) (FBCSR)
Debatable: All sparse formats: there are no duplicate column indices within a row (duplicates might cause issues with more advanced sparsity-manipulating kernels), except for Ell, where duplicates are allowed as long as all but one of the duplicated entries have a zero value
Coo: The entries are sorted by row index (and the row indices are within [0, num_rows))
Csr: The row pointers are non-descending, going from 0 to nnz
Sellp: Slice offsets are non-descending, slice lengths are consistent with slice offsets.
Permutation: Each index occurs exactly once in the permutation.
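
To make the Csr and Permutation invariants above more concrete, here is a minimal standalone sketch. It does not use Ginkgo's classes: CsrView and all function names are hypothetical, and the arrays are plain std::vector. Indices are unsigned here, so only the upper bound needs checking.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical plain-array view of a CSR matrix (not gko::matrix::Csr).
struct CsrView {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<std::size_t> row_ptrs;  // num_rows + 1 entries
    std::vector<std::size_t> col_idxs;  // nnz entries
    std::vector<double> values;         // nnz entries
};

// Row pointers start at 0, never decrease, and end at nnz.
bool has_valid_row_ptrs(const CsrView& m)
{
    if (m.row_ptrs.size() != m.num_rows + 1) return false;
    if (m.row_ptrs.front() != 0) return false;
    for (std::size_t row = 0; row < m.num_rows; ++row) {
        if (m.row_ptrs[row] > m.row_ptrs[row + 1]) return false;
    }
    return m.row_ptrs.back() == m.col_idxs.size();
}

// Every column index lies in [0, num_cols).
bool has_col_idxs_in_range(const CsrView& m)
{
    for (auto col : m.col_idxs) {
        if (col >= m.num_cols) return false;
    }
    return true;
}

// No value is NaN or Inf.
bool has_finite_values(const CsrView& m)
{
    for (auto value : m.values) {
        if (!std::isfinite(value)) return false;
    }
    return true;
}

bool is_valid_csr(const CsrView& m)
{
    return m.col_idxs.size() == m.values.size() && has_valid_row_ptrs(m) &&
           has_col_idxs_in_range(m) && has_finite_values(m);
}

// Each index in [0, size) occurs exactly once.
bool is_permutation(const std::vector<std::size_t>& perm)
{
    std::vector<bool> seen(perm.size(), false);
    for (auto idx : perm) {
        if (idx >= perm.size() || seen[idx]) return false;
        seen[idx] = true;
    }
    return true;
}
```

Each check is a single pass over one array, so the whole validation is linear in the number of stored entries.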

Solvers

All solvers: The system matrix needs to be valid
All iterative solvers: The preconditioner (generated or provided factory) needs to be valid
Cg: The system matrix needs to be symmetric
Upper/LowerTrs: The system matrix needs to be upper/lower-triangular with non-zero diagonal entries
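
The Cg and LowerTrs requirements could be checked along the following lines. This is only a sketch on raw CSR arrays, not Ginkgo code: both function names are hypothetical, the matrix is assumed to be square and free of duplicate entries, and symmetry is tested with exact floating-point comparison (a real check would probably want a tolerance).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// True if no entry is stored above the diagonal and every row stores a
// non-zero diagonal value. Explicitly stored zeros above the diagonal are
// treated as violations here.
bool is_lower_triangular_with_nonzero_diag(std::size_t num_rows,
                                           const std::size_t* row_ptrs,
                                           const std::size_t* col_idxs,
                                           const double* values)
{
    for (std::size_t row = 0; row < num_rows; ++row) {
        bool has_nonzero_diag = false;
        for (auto nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            const auto col = col_idxs[nz];
            if (col > row) return false;
            if (col == row && values[nz] != 0.0) has_nonzero_diag = true;
        }
        if (!has_nonzero_diag) return false;
    }
    return true;
}

// True if a(i, j) == a(j, i) for all stored entries; an entry that is not
// stored counts as zero.
bool is_symmetric(std::size_t num_rows, const std::size_t* row_ptrs,
                  const std::size_t* col_idxs, const double* values)
{
    std::map<std::pair<std::size_t, std::size_t>, double> entries;
    for (std::size_t row = 0; row < num_rows; ++row) {
        for (auto nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            entries[{row, col_idxs[nz]}] = values[nz];
        }
    }
    for (const auto& entry : entries) {
        const auto mirrored =
            entries.find({entry.first.second, entry.first.first});
        const double mirrored_value =
            mirrored == entries.end() ? 0.0 : mirrored->second;
        if (mirrored_value != entry.second) return false;
    }
    return true;
}
```

The upper-triangular variant only needs the comparison `col > row` flipped to `col < row`.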

Factorizations

The output must be valid Csr matrices with non-zero diagonals

Preconditioners

Jacobi: The block pointers need to be ascending with gaps smaller than max_block_size
Jacobi: The blocks must be invertible/inverted blocks must not contain NaN or Inf
Isai: The output must be a valid Csr matrix (not much else we can validate here)
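
For the Jacobi block pointers, a check could look roughly like this. The function name is hypothetical, and two assumptions go beyond the list above: the pointers are taken to run from 0 to num_rows, and "smaller than max_block_size" is read as "at most max_block_size".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Block pointers must be strictly ascending and no block may exceed
// max_block_size rows. Assumption: they cover exactly [0, num_rows].
bool has_valid_block_ptrs(const std::vector<std::size_t>& block_ptrs,
                          std::size_t num_rows, std::size_t max_block_size)
{
    if (block_ptrs.empty() || block_ptrs.front() != 0 ||
        block_ptrs.back() != num_rows) {
        return false;
    }
    for (std::size_t i = 0; i + 1 < block_ptrs.size(); ++i) {
        if (block_ptrs[i + 1] <= block_ptrs[i]) return false;
        if (block_ptrs[i + 1] - block_ptrs[i] > max_block_size) return false;
    }
    return true;
}
```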

Apr 20 '21 07:04 upsj

After discussing this with @upsj, here is how I would like to approach this issue:

1. A new set of files validation_helpers.{cpp,hpp} will be created inside core/components.
2. These will contain several validation functions, e.g. is_symmetric, is_row_ordered, etc., which take matrix data and/or indices via array pointers.
3. Matrices get a new member function validate_data(), which uses the relevant validation helpers to check correctness and throws an error if a check fails.
4. The latter can be switched on/off for debug builds.
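
A rough sketch of how this could fit together, using Coo as the example. Everything here is a stand-in: CooSketch is not gko::matrix::Coo, the validation namespace and the is_within_bounds helper are made up for illustration, and only is_row_ordered is a name taken from the plan above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

namespace validation {

// Helpers take raw array pointers, as proposed above.
bool is_row_ordered(const int* row_idxs, std::size_t nnz)
{
    return std::is_sorted(row_idxs, row_idxs + nnz);
}

// Hypothetical helper: all indices lie in [0, upper).
bool is_within_bounds(const int* idxs, std::size_t nnz, int upper)
{
    return std::all_of(idxs, idxs + nnz,
                       [upper](int idx) { return 0 <= idx && idx < upper; });
}

}  // namespace validation

// Stand-in for a Coo matrix; only the parts needed for validation.
class CooSketch {
public:
    CooSketch(int num_rows, int num_cols, std::vector<int> row_idxs,
              std::vector<int> col_idxs, std::vector<double> values)
        : num_rows_(num_rows),
          num_cols_(num_cols),
          row_idxs_(std::move(row_idxs)),
          col_idxs_(std::move(col_idxs)),
          values_(std::move(values))
    {}

    // Throws std::invalid_argument naming the first violated invariant.
    void validate_data() const
    {
        const auto nnz = values_.size();
        if (row_idxs_.size() != nnz || col_idxs_.size() != nnz) {
            throw std::invalid_argument("Coo: array sizes differ");
        }
        if (!validation::is_within_bounds(row_idxs_.data(), nnz, num_rows_)) {
            throw std::invalid_argument("Coo: row index out of range");
        }
        if (!validation::is_within_bounds(col_idxs_.data(), nnz, num_cols_)) {
            throw std::invalid_argument("Coo: column index out of range");
        }
        if (!validation::is_row_ordered(row_idxs_.data(), nnz)) {
            throw std::invalid_argument("Coo: entries not sorted by row");
        }
    }

private:
    int num_rows_;
    int num_cols_;
    std::vector<int> row_idxs_;
    std::vector<int> col_idxs_;
    std::vector<double> values_;
};
```

A debug-only call site could wrap `matrix.validate_data();` in `#ifndef NDEBUG ... #endif`, while users remain free to call it explicitly in any build.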

Apr 23 '21 11:04 greole

I believe we didn't talk about the actual error reporting so far, I could imagine multiple approaches:

1. validate_data() doesn't return anything and throws an exception with information on which part of the data is invalid
2. validate_data() doesn't return anything and causes an assertion to fail, which is akin to calling std::abort()
3. validate_data() returns something like a tagged union of either a success tag or a failure tag with additional information.

I would favor 1 or 3, since 2 is very limited in the kind of information it can provide. 1 is probably the easiest to implement, since we can rely on nested exceptions to report nested validation failures later on. The choice between 1 and 3 is kind of like the distinction between exception-type (C++) and optional-type (Rust) error handling.
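
To illustrate the difference between approaches 1 and 3, here is a small standalone sketch. All names are hypothetical; ValidationResult is a hand-rolled stand-in for the tagged union, and the nesting uses the standard std::throw_with_nested / std::rethrow_if_nested.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Approach 3: return a success/failure value with additional information.
struct ValidationResult {
    bool ok;
    std::string message;  // empty on success
};

ValidationResult check_row_ptrs(const std::vector<int>& row_ptrs)
{
    for (std::size_t i = 0; i + 1 < row_ptrs.size(); ++i) {
        if (row_ptrs[i] > row_ptrs[i + 1]) {
            return {false,
                    "row_ptrs decrease at position " + std::to_string(i)};
        }
    }
    return {true, ""};
}

// Approach 1: throw an exception describing the invalid data.
void validate_row_ptrs(const std::vector<int>& row_ptrs)
{
    const auto result = check_row_ptrs(row_ptrs);
    if (!result.ok) throw std::invalid_argument(result.message);
}

// Nested validation: a solver validating its system matrix wraps the
// inner failure in an outer exception.
void validate_solver(const std::vector<int>& system_row_ptrs)
{
    try {
        validate_row_ptrs(system_row_ptrs);
    } catch (...) {
        std::throw_with_nested(std::runtime_error("invalid system matrix"));
    }
}

// Flattens a chain of nested exceptions into "outer: inner: ...".
std::string describe(const std::exception& e)
{
    std::string text = e.what();
    try {
        std::rethrow_if_nested(e);
    } catch (const std::exception& inner) {
        text += ": " + describe(inner);
    }
    return text;
}
```

With approach 3 the caller has to inspect the result, while with approach 1 the failure propagates on its own and keeps the full chain of causes.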

I don't think we necessarily need 4. (the debug-build switch), since it would require users to rebuild all of Ginkgo in Debug just to figure out which part of their code is failing. Unless, of course, we call validate_data() in all apply implementations; then I would agree to disable this in Debug, but still allow users to call it manually in Release.

Apr 23 '21 11:04 upsj

Do you require all column indexes within each matrix row to be sorted/ascending or can they be in arbitrary order? I don't see this in the list above. IIRC, Hypre's AMG behaves differently for each ordering (and they require the diagonal entry to be stored first in the row).

Aug 19 '22 11:08 lahwaacz

For our algorithms that require the column indices to be sorted, there is usually a factory parameter called skip_sorting that can be used to signal that the input matrix is already sorted. Otherwise, we sort the matrix ourselves. Of course, simple things like SpMV still work with unsorted indices, but our AMGx also requires sorting. Also, when you fill a matrix from matrix_data with the read function, you have to make sure that the input data is sorted correctly.
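
As an illustration of what "sorted" means here, the following standalone sketch checks and restores ascending column order within each CSR row. It works on plain std::vector arrays and does not use Ginkgo's API; both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// True if the column indices of every row are in ascending order.
bool is_sorted_by_column(const std::vector<std::size_t>& row_ptrs,
                         const std::vector<std::size_t>& col_idxs)
{
    for (std::size_t row = 0; row + 1 < row_ptrs.size(); ++row) {
        if (!std::is_sorted(col_idxs.data() + row_ptrs[row],
                            col_idxs.data() + row_ptrs[row + 1])) {
            return false;
        }
    }
    return true;
}

// Sorts each row by column index, moving the values along with the indices.
void sort_rows_by_column(const std::vector<std::size_t>& row_ptrs,
                         std::vector<std::size_t>& col_idxs,
                         std::vector<double>& values)
{
    using entry = std::pair<std::size_t, double>;
    for (std::size_t row = 0; row + 1 < row_ptrs.size(); ++row) {
        const auto begin = row_ptrs[row];
        const auto end = row_ptrs[row + 1];
        std::vector<entry> entries;
        for (auto nz = begin; nz < end; ++nz) {
            entries.emplace_back(col_idxs[nz], values[nz]);
        }
        // Stable sort keeps the relative order of duplicate columns.
        std::stable_sort(entries.begin(), entries.end(),
                         [](const entry& a, const entry& b) {
                             return a.first < b.first;
                         });
        for (auto nz = begin; nz < end; ++nz) {
            col_idxs[nz] = entries[nz - begin].first;
            values[nz] = entries[nz - begin].second;
        }
    }
}
```

A flag like skip_sorting then amounts to promising that is_sorted_by_column would already return true, so the sorting pass can be skipped.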

Aug 19 '22 20:08 MarcelKoch
