stdBLAS
stdBLAS copied to clipboard
P1673: `symmetric_matrix_vector_product`: bug in implementation
In here the lines starting at 442 :
if constexpr (std::is_same_v<Triangle, lower_triangle_t>) {
for (size_type j = 0; j < A.extent(1); ++j) {
for (size_type i = j; i < A.extent(0); ++i) {
const auto A_ij = A(i,j);
y(i) += A_ij * x(j);
y(j) += A_ij * x(i);
}
}
}
else {
for (size_type j = 0; j < A.extent(1); ++j) {
for (size_type i = 0; i <= j; ++i) {
const auto A_ij = A(i,j);
y(i) += A_ij * x(j);
y(j) += A_ij * x(i);
}
}
}
should be:
if constexpr (std::is_same_v<Triangle, lower_triangle_t>) {
for (size_type j = 0; j < A.extent(1); ++j) {
for (size_type i = j; i < A.extent(0); ++i) {
const auto A_ij = A(i,j);
y(i) += A_ij * x(j);
if(i != j){
y(j) += A_ij * x(i);
}
}
}
}
else {
for (size_type j = 0; j < A.extent(1); ++j) {
for (size_type i = 0; i <= j; ++i) {
const auto A_ij = A(i,j);
y(i) += A_ij * x(j);
if(i != j){
y(j) += A_ij * x(i);
}
}
}
}
Otherwise we count twice for that element.
Note the spec says: For i in the domain of y, the mathematical expression for the algorithm is y[i] = the sum of A[i,j] * x[j] for all j in the triangle of A specified by t, plus the sum of A[j,i] * x[j] for all j not equal to i such that j,i is in the domain of A but not in the triangle of A specified by t.
Most likely also the updating overload has same issue, have not checked that yet.
@mhoemmen That one should be fixed by my PR