P1673: `vector_sum_of_squares`: Rescaling algorithm may not be correct
@mhoemmen I was looking at this, and it seems to me that this is wrong:
```cpp
Scalar scale = init.scaling_factor;
Scalar ssq = init.scaled_sum_of_squares;
for (extents<>::size_type i = 0; i < x.extent(0); ++i) {
  if (abs(x(i)) != 0.0) {
    const auto absxi = abs(x(i));
    const auto quotient = scale / absxi;
    if (scale < absxi) {
      ssq = Scalar(1.0) + ssq * quotient * quotient;
      scale = absxi;
    }
    else {
      ssq = ssq + quotient * quotient;
    }
  }
}
```
- Why does `scale` change along the way?
- Also, shouldn't `scale / absxi` instead be `absxi / scale`? Looking at the spec, section 16.9.2.7: *Constraints:* For all `i` in the domain of `v`, and for `f` and `ssq` of type `T`, the expression `ssq = ssq + (abs(x[i]) / f)*(abs(x[i]) / f)` is well formed. It says `abs(x[i]) / f` should be well formed, but not `f / abs(x[i])`.
- Also, if things are correct, in theory one should have that `scaling_factor * scaling_factor * scaled_sum_of_squares` equals the sum of squares of `abs(x[i])` plus `init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares` (see `dlassq`). But that does not hold for the implementation above; a `dlassq`-style loop is sketched below for comparison.
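For reference, here is a minimal sketch of the classic LAPACK `dlassq`-style update, assuming the same `Scalar`, `abs`, `x`, and `init` as in the snippet quoted above (this is not the actual stdBLAS code). `scale` tracks the running maximum of `abs(x(i))`, and the quotient differs between the two branches:

```cpp
Scalar scale = init.scaling_factor;
Scalar ssq = init.scaled_sum_of_squares;
for (extents<>::size_type i = 0; i < x.extent(0); ++i) {
  const auto absxi = abs(x(i));
  if (absxi != 0.0) {
    if (scale < absxi) {
      // New maximum: rescale the accumulated sum by (old scale / new scale)^2.
      const auto quotient = scale / absxi;
      ssq = Scalar(1.0) + ssq * quotient * quotient;
      scale = absxi;
    }
    else {
      // Note: absxi / scale here, not scale / absxi.
      const auto quotient = absxi / scale;
      ssq = ssq + quotient * quotient;
    }
  }
}
// Invariant maintained at every step:
//   scale * scale * ssq == (sum of abs(x(i))^2 so far)
//     + init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares
```

This also suggests why `scale` changes along the way: it is the running maximum, and whenever it grows, the previously accumulated `ssq` must be rescaled by the old-to-new scale ratio squared.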
So I tried something on my end:
```cpp
{
  using std::abs;
  using std::max;

  // First pass: find the scaling factor (the maximum absolute value,
  // also taking the initial scaling factor into account).
  T scale = init.scaling_factor;
  for (std::size_t i = 0; i < x.extent(0); ++i) {
    scale = max(scale, abs(x(i)));
  }

  // Fold the initial scaled sum of squares into the new scale.
  T ssq = (init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares) / (scale * scale);
  T s = {};  // unscaled sum of squares, for checking the invariant

  // Second pass: accumulate the scaled sum of squares.
  for (std::size_t i = 0; i < x.extent(0); ++i) {
    const auto absxi = abs(x(i));
    const auto quotient = absxi / scale;
    ssq = ssq + quotient * quotient;
    s += absxi * absxi;
  }

  std::experimental::linalg::sum_of_squares_result<T> result;
  result.scaled_sum_of_squares = ssq;
  result.scaling_factor = scale;

  // Verify that things are consistent according to the definition.
  const auto lhs = scale * scale * ssq;
  const auto rhs = s + init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares;
  std::cout << lhs << " " << rhs << '\n';
  EXPECT_NEAR(lhs, rhs, 1e-9);
}
```
and this works, in the sense that it satisfies: `scaling_factor * scaling_factor * scaled_sum_of_squares` equals the sum of squares of `abs(x[i])` plus `init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares`.
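Side note: a hypothetical follow-up use of `result` (names taken from the test above), recovering the Euclidean norm without forming the possibly overflowing sum of squares directly:

```cpp
// Since scale*scale*ssq equals the sum of squares,
// the two-norm is scale * sqrt(ssq).
using std::sqrt;
const T norm2 = result.scaling_factor * sqrt(result.scaled_sum_of_squares);
```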
Thanks for pointing this out! This relates to the discussion here. There are two separate questions:

- Is the rescaling algorithm correct?
- Should we only do rescaling for floating-point (and complex-of-floating-point) types?
I'll rename this issue to ask the first question, and file a separate issue for the second question. Thanks!
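For the second question, here is a minimal sketch of one possible dispatch, using a hypothetical helper name and ignoring complex-of-floating-point for brevity (this is not the actual stdBLAS implementation):

```cpp
#include <cmath>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: only floating-point value types need the
// overflow-avoiding rescaling; exact (integer) types can accumulate directly.
template <class Scalar, class Vector>
void accumulate_sum_of_squares(Vector x, Scalar& scale, Scalar& ssq) {
  if constexpr (std::is_floating_point_v<Scalar>) {
    // dlassq-style rescaling loop, as sketched earlier in this thread.
    for (std::size_t i = 0; i < x.extent(0); ++i) {
      const Scalar absxi = std::abs(x(i));
      if (absxi != Scalar{}) {
        if (scale < absxi) {
          const Scalar q = scale / absxi;
          ssq = Scalar(1) + ssq * q * q;
          scale = absxi;
        } else {
          const Scalar q = absxi / scale;
          ssq += q * q;
        }
      }
    }
  } else {
    for (std::size_t i = 0; i < x.extent(0); ++i) {
      ssq += x(i) * x(i);  // exact arithmetic: no rescaling needed
    }
    scale = Scalar(1);
  }
}
```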