LSR1Operator(1) fails the secant equation
Hello all,
I tested the LSR1 operators myself, and there seems to be a problem with the LSR1Operator of dimension 1. In my work, I automatically generate several LSR1Operators, and some of them are of dimension 1.
Here is an example to replicate the issue:
using LinearOperators, LinearAlgebra

n = 1
B = LSR1Operator(n)
r = 0.0
for i = 1:100
    global r
    s = rand(n)
    y = rand(n)
    push!(B, s, y)
    r += norm(B * s - y)  # secant equation residual
end
r  # 166.39...
If you replace n = 1 with n = 2, then r ≈ 1e-14.
The problem does not occur with LBFGSOperator.
Is every update accepted? https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/main/src/lsr1.jl#L150
The conditions for the update are fulfilled in the code above.
The next piece of code uses a loop that guarantees the LSR1 conditions are satisfied before the push!.
The strange thing is that in some cases the update happens, and in other cases it does not happen.
For example, the pair (s, y)
B = LSR1Operator(1)
s = [0.5809871055619441]
y = [0.8263178027435335]
push!(B, s, y)
Matrix(B)  # 1×1 Matrix{Float64}: 1.4222653047425207
produces an updated 1×1 matrix.
To extend the analysis, I tried the push! in a loop similar to the previous one.
using LinearOperators, LinearAlgebra

n = 1
r = 0.0
cpt = 0
cpt_approx_null = 0
for i = 1:100
    B = LSR1Operator(n)
    global r
    global cpt
    global cpt_approx_null
    s = rand(n)
    y = rand(n)
    ϵ = eps(eltype(y))
    ymBs = y - B * s
    well_defined(ymBs, s) = abs(dot(ymBs, s)) ≥ ϵ + ϵ * norm(ymBs) * norm(s, 2)
    sufficient_curvature(s, y) = abs(dot(y, s)) >= ϵ * norm(y, 2) * norm(s, 2)
    scaling_factor(s, y) = dot(y, s) / dot(y, y)
    scaling_condition(s, y) = norm((y - s) / scaling_factor(s, y), 2) >= ϵ * norm(y, 2) * norm(s, 2)
    if well_defined(ymBs, s) && scaling_condition(s, y) && sufficient_curvature(s, y)
        push!(B, s, y)
        cpt += 1
        ymBs = y - B * s
        norm_ymBs = norm(ymBs, 2)
        # @show norm_ymBs
        r += norm_ymBs
        if isapprox(norm_ymBs, 0.0, atol = 1e-10)
            cpt_approx_null += 1
        end
    end
    # println(Matrix(B))
end
r, cpt, cpt_approx_null  # (22.107767681026328, 100, 29)
Normally, every pair (s, y) should update a freshly-initialized LSR1 operator, but only 29 of the 100 updates satisfy the secant equation.
I have new examples of failures for LSR1Operator and an explanation of why they fail.
lsr1 = LSR1Operator(2)
y = [-5., -5.]
s = [-1., -1.]
push!(lsr1, s, y)
Matrix(lsr1)  # identity matrix
The vectors y and s are collinear. I tried other collinear vectors to verify:
y = [-5., -4.]
s = [-1.25, -1.]
push!(lsr1, s, y)
Matrix(lsr1)  # still the identity
and I got the same result.
The test https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/e663e4774d798e967003106c57acb611cd32ae78/src/lsr1.jl#L150 fails, because the scaling condition https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/e663e4774d798e967003106c57acb611cd32ae78/src/lsr1.jl#L146 is false.
If we suppose s = $\alpha$ * y (s and y collinear), the scaling factor ys / yy equals $\alpha$.
Then the left-hand side of the scaling condition, norm(y - s / scaling_factor), is always 0, since s / scaling_factor has the value $\tfrac{\alpha y}{\alpha} = y$.
This explains why LSR1Operator(1) fails: s and y are vectors of size one, hence inevitably collinear.
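The vanishing can be checked numerically; below is a minimal sketch reusing the same formulas as the conditions above (not the library's internals):

```julia
using LinearAlgebra

# For collinear s = α * y, the left-hand side of the scaling condition vanishes.
y = [-5.0, -4.0]
α = 0.25
s = α * y                                # [-1.25, -1.0], collinear with y

scaling_factor = dot(y, s) / dot(y, y)   # equals α = 0.25
lhs = norm(y - s / scaling_factor)       # y - α*y/α = 0
rhs = eps(Float64) * norm(y) * norm(s)

lhs >= rhs  # false: the scaling condition rejects every collinear pair
```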
Seeing this, I re-ran the loop test I made previously and spotted an error in my numerical test:
scaling_condition(s, y) = norm((y - s) / scaling_factor(s, y), 2) >= ϵ * norm(y, 2) * norm(s, 2)
which should be
scaling_condition(s, y) = norm(y - s / scaling_factor(s, y), 2) >= ϵ * norm(y, 2) * norm(s, 2)
I re-ran the same loop with the corrected scaling_condition:
using LinearOperators, LinearAlgebra

n = 1
r = 0.0
cpt = 0
cpt_approx_null = 0
for i = 1:100
    B = LSR1Operator(n)
    global r
    global cpt
    global cpt_approx_null
    s = rand(n)
    y = rand(n)
    ϵ = eps(eltype(y))
    ymBs = y - B * s
    well_defined(ymBs, s) = abs(dot(ymBs, s)) ≥ ϵ + ϵ * norm(ymBs) * norm(s, 2)
    sufficient_curvature(s, y) = abs(dot(y, s)) >= ϵ * norm(y, 2) * norm(s, 2)
    scaling_factor(s, y) = dot(y, s) / dot(y, y)
    scaling_condition(s, y) = norm(y - s / scaling_factor(s, y), 2) >= ϵ * norm(y, 2) * norm(s, 2)
    if well_defined(ymBs, s) && scaling_condition(s, y) && sufficient_curvature(s, y)
        @show s, y  # that way you can try the pairs that you get
        push!(B, s, y)
        cpt += 1
        ymBs = y - B * s
        norm_ymBs = norm(ymBs, 2)
        r += norm_ymBs
        if isapprox(norm_ymBs, 0.0, atol = 1e-10)
            cpt_approx_null += 1
        end
    end
end
r, cpt, cpt_approx_null  # (0.0, 25, 25)
Out of 100 random pairs of vectors of size 1, only 25 passed the numerical tests (the failures being mainly due to scaling_condition).
One of those was
(s, y) = ([0.49377733218773445], [0.11429787904158029])
This pair passes the scaling_condition with
norm(y - s / scaling_factor(s, y), 2)  # 5.551115123125783e-17, a numerical zero
which exceeds
ϵ * norm(y, 2) * norm(s, 2)  # 1.2531687196363857e-17
All the other pairs (s, y) that I tried were similar.
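For reference, the comparison for that pair can be reproduced directly (same formulas as the conditions above):

```julia
using LinearAlgebra

# Reproduce the scaling condition check for the pair reported above.
s = [0.49377733218773445]
y = [0.11429787904158029]

scaling_factor = dot(y, s) / dot(y, y)
lhs = norm(y - s / scaling_factor)       # tiny rounding residual, not exactly 0
rhs = eps(eltype(y)) * norm(y) * norm(s)

lhs >= rhs  # true: the pair passes, even though lhs is a numerical zero
```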
I don't know if this is expected behaviour, because collinear pairs do satisfy the well_defined condition
https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/e663e4774d798e967003106c57acb611cd32ae78/src/lsr1.jl#L137
I made several tests on my SR1 implementation using collinear pairs s,y, and the resulting matrices satisfied the secant equation and were well-defined.
I'm not sure of the best way to test for collinear vectors, but here is an attempt:
s = [1., 2.]
y = [2., 4.]
s_y = s ./ y
collinear = (length(s) == 1) || all(r -> r == s_y[1], s_y)  # true
collinear checks whether s is a vector of size one, or whether the componentwise division s_y = s ./ y yields the same value in every component (i.e. every component equals s_y[1]).
collinear could be used to force scaling_condition to true.
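As a sketch of that idea (hypothetical helper names, not the library's API; this variant detects collinearity via the Cauchy–Schwarz equality case rather than componentwise ratios, which avoids dividing by zero components):

```julia
using LinearAlgebra

# Hypothetical sketch: short-circuit the scaling condition for collinear pairs.
# Cauchy–Schwarz holds with equality, |sᵀy| = ‖s‖‖y‖, exactly when s and y
# are collinear, so we test that up to a relative tolerance.
function is_collinear(s, y; rtol = sqrt(eps(Float64)))
    length(s) == 1 && return true
    abs(dot(s, y)) >= (1 - rtol) * norm(s) * norm(y)
end

scaling_factor(s, y) = dot(y, s) / dot(y, y)

function scaling_condition(s, y)
    ϵ = eps(eltype(y))
    is_collinear(s, y) ||
        norm(y - s / scaling_factor(s, y)) >= ϵ * norm(y) * norm(s)
end

is_collinear([1.0, 2.0], [2.0, 4.0])       # true
scaling_condition([1.0, 2.0], [2.0, 4.0])  # true: accepted despite collinearity
```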
Thank you for this analysis.
> I made several tests on my SR1 implementation using collinear pairs s,y, and the resulting matrices satisfied the secant equation and were well-defined.
How does your implementation differ?
My SR1 implementation is very simple; it doesn't have as many numerical conditions as the push! of LSR1Operator does.
The only numerical test of my SR1 implementation is the same as
https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/e663e4774d798e967003106c57acb611cd32ae78/src/lsr1.jl#L150
It does not implement the sufficient_curvature and scaling_condition tests.
https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/e663e4774d798e967003106c57acb611cd32ae78/src/lsr1.jl#L143
https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/e663e4774d798e967003106c57acb611cd32ae78/src/lsr1.jl#L146
PR #235 sets data.scaling of the current iterate to false, which implies that sufficient_curvature and scaling_condition stay true. In addition, it doesn't update scaling_factor.
In fact, I don't remember where scaling_condition came from. Maybe it was an experiment and should be removed. Clearly, it prevents updates when s and y are collinear.