
Add NLPModels testing

Open gdalle opened this issue 1 year ago • 17 comments

Fix #69

There are still disagreements between SCT and JuMP, but I'm not sure we're wrong, so I've labeled them as broken tests.

I also skip a few tests for which the NLP pipeline errors out due to undefined variables.

gdalle avatar May 22 '24 07:05 gdalle

Here are some of the errors I'm seeing:

On the SCT side

@adrhill can you take a look at those?

  MethodError: no method matching hessian_pattern_to_mat(::Vector{SparseConnectivityTracer.Dual{Float64, SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}}, ::Float64)
  
  Closest candidates are:
    hessian_pattern_to_mat(::AbstractArray{D}, ::D) where {P, T<:SparseConnectivityTracer.HessianTracer, D<:SparseConnectivityTracer.Dual{P, T}}
     @ SparseConnectivityTracer ~/Work/GitHub/Julia/SparseConnectivityTracer.jl/src/pattern.jl:387
  
  Stacktrace:
    [1] local_hessian_pattern(f::Function, x::Vector{Float64}, ::Type{BitSet}, ::Type{Set{Tuple{Int64, Int64}}})
      @ SparseConnectivityTracer ~/Work/GitHub/Julia/SparseConnectivityTracer.jl/src/pattern.jl:369
    [2] hessian_sparsity
      @ ~/Work/GitHub/Julia/SparseConnectivityTracer.jl/src/adtypes.jl:112 [inlined]
  MethodError: ^(::SparseConnectivityTracer.Dual{Float64, SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}, ::Int64) is ambiguous.
  
  Candidates:
    ^(dx::D, y::Number) where {P, T<:SparseConnectivityTracer.HessianTracer, D<:SparseConnectivityTracer.Dual{P, T}}
      @ SparseConnectivityTracer ~/Work/GitHub/Julia/SparseConnectivityTracer.jl/src/overload_hessian.jl:120
    ^(x::Number, p::Integer)
      @ Base intfuncs.jl:311
  
  Possible fix, define
    ^(::D, ::Integer) where {P, T<:SparseConnectivityTracer.HessianTracer, D<:SparseConnectivityTracer.Dual{P, T}}
  
  Stacktrace:
    [1] literal_pow
      @ ./intfuncs.jl:351 [inlined]
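The "Possible fix" hint above is the standard Julia recipe for method ambiguities: define a method at the intersection of the two competing signatures. A minimal self-contained sketch with a toy wrapper type (not the actual SCT `Dual`):

```julia
# Toy reproduction of the ambiguity pattern, not the actual SCT types.
struct MyDual{P} <: Number
    primal::P
end

# This method is more specific in the first argument, generic in the second.
Base.:^(d::MyDual, y::Number) = MyDual(d.primal ^ y)

# Base also defines ^(x::Number, p::Integer), which is more specific in the
# second argument, so MyDual ^ Int is ambiguous until we add a method at the
# intersection of both signatures:
Base.:^(d::MyDual, p::Integer) = MyDual(d.primal ^ p)

p = 3
MyDual(2.0) ^ p  # without the Integer method, this raises a MethodError (ambiguous)
```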

On the other side

@amontoison those would be for you to investigate

  UndefVarError: `xe_turtle` not defined
  Stacktrace:
    [1] triangle_turtle(; kwargs::@Kwargs{})
      @ OptimizationProblems.ADNLPProblems ~/.julia/packages/OptimizationProblems/nfPUU/src/ADNLPProblems/triangle.jl:72

As well as plenty of errors of the form "expected Float64, got Dual" for the in-place constraints.
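Those in-place errors appear whenever a constraint function writes tracer-valued results into a hard-coded Float64 buffer. A minimal sketch of the failure mode and the generic fix, using a hypothetical constraint function (not from OptimizationProblems):

```julia
# Hypothetical in-place constraints; the values written into cx have the
# element type of x, so a Vector{Float64} buffer fails as soon as x
# carries Dual/tracer values.
function cons!(cx, x)
    cx[1] = x[1]^2 + x[2]
    cx[2] = x[1] * x[3]
    return cx
end

x = [1.0, 2.0, 3.0]
cx_bad  = Vector{Float64}(undef, 2)  # breaks under tracing: "expected Float64, got Dual"
cx_good = similar(x, eltype(x), 2)   # buffer element type follows the input
cons!(cx_good, x)
```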

gdalle avatar May 22 '24 07:05 gdalle

@adrhill can you take a look at those?

  MethodError: no method matching hessian_pattern_to_mat(::Vector{SparseConnectivityTracer.Dual{Float64, SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}}, ::Float64)

Looks like the output isn't a tracer and therefore doesn't contain information about the Hessian.

 MethodError: ^(::SparseConnectivityTracer.Dual{Float64, SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}, ::Int64) is ambiguous.

Will fix. (EDIT: see #84)

adrhill avatar May 22 '24 07:05 adrhill

@amontoison for some of the remaining failing examples I am actually more confident in the sparsity detected by SCT than in the one detected by JuMP. Can we have a Zoom call sometime so that I show them to you? I haven't gone through them all by hand but for some of them I am almost sure that SCT is right and JuMP is wrong

gdalle avatar May 22 '24 14:05 gdalle

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 81.10%. Comparing base (d0590b7) to head (277b528).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #83      +/-   ##
==========================================
+ Coverage   80.58%   81.10%   +0.52%     
==========================================
  Files          18       18              
  Lines         757      757              
==========================================
+ Hits          610      614       +4     
+ Misses        147      143       -4     


codecov-commenter avatar May 22 '24 15:05 codecov-commenter

Looks good! Would love to have a follow-up PR that turns some of these into benchmarks.

adrhill avatar May 22 '24 15:05 adrhill

Also beware that this brings CI duration to 12 min. It might go down a bit with caching, but probably not much.

gdalle avatar May 22 '24 15:05 gdalle

Also beware that this brings CI duration to 12 min.

Is there a way to just test on a subset of problems?

adrhill avatar May 22 '24 16:05 adrhill

You could also add a separate CI build that just runs these tests, gated on a specific environment variable like CI_SCT.
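For reference, a common way to wire this up (the `CI_SCT` variable and test names are illustrative) is a guard in `test/runtests.jl`, with a dedicated CI job that sets the variable:

```julia
# test/runtests.jl sketch: run the slow NLPModels tests only when the
# CI_SCT environment variable is set, e.g. by a dedicated CI job.
using Test

const RUN_NLP_TESTS = get(ENV, "CI_SCT", "false") == "true"

@testset "SparseConnectivityTracer" begin
    @testset "fast tests" begin
        @test true                 # placeholder for the regular suite
    end
    if RUN_NLP_TESTS
        @testset "NLPModels tests" begin
            @test true             # placeholder for the slow, opt-in NLP suite
        end
    end
end
```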

amontoison avatar May 22 '24 16:05 amontoison

Any pointers to places where it's done successfully (and cleanly)?

gdalle avatar May 22 '24 17:05 gdalle

List of discrepancies

Jacobian

  • hs61: quadratic not supported
  • lincon: SCT has two more nonzeros, at (2, 6) and (3, 7) (they are non-structural zeros!)

Hessian

"diagonal" means "JuMP has more nonzeros on the diagonal"

  • aircrfta: diagonal
  • argauss: diagonal
  • biggs5, biggs6: diagonal
  • britgas: nontrivial difference, SCT has 1087 nonzeros and JuMP 1415
  • camshape: diagonal
  • chain: diagonal
  • channel: nontrivial difference, SCT has 696 nonzeros and JuMP 768
  • controlinvestment: diagonal
  • hs106: diagonal
  • hs114: nontrivial difference, SCT has 19 nonzeros and JuMP 24
  • hs116: diagonal
  • hs250: diagonal
  • hs251: diagonal
  • hs36: diagonal
  • hs37: diagonal
  • hs40: diagonal
  • hs41: diagonal
  • hs44: diagonal
  • hs45: diagonal
  • hs56: diagonal
  • hs68: diagonal
  • hs69: diagonal
  • hs83: diagonal
  • hs84: diagonal
  • hs87 (local): diagonal
  • hs93: diagonal
  • hs95: diagonal
  • hs96: diagonal
  • hs97: diagonal
  • hs98: diagonal
  • marine: nontrivial difference, SCT has 136 nonzeros and JuMP 239
  • polygon1: diagonal
  • polygon2: diagonal
  • robotarm: diagonal
  • tetra_*: skipped because too long (for JuMP or us?)

Other questions

  • Why is local tracing necessary for some Hessians but not Jacobians?

gdalle avatar May 22 '24 19:05 gdalle

Why is local tracing necessary for some Hessians but not Jacobians?

This should not be the case. Where does this happen? Maybe we're missing an overload.

adrhill avatar May 22 '24 20:05 adrhill

This should not be the case. Where does this happen? Maybe we're missing an overload.

I'll investigate tomorrow

gdalle avatar May 22 '24 20:05 gdalle

Why is local tracing necessary for some Hessians but not Jacobians?

This should not be the case. Where does this happen? Maybe we're missing an overload.

On second thought this is probably due to the fact that we're taking the Jacobians and Hessians of different functions:

jacobian(x -> constraints(x))
hessian(x -> objective(x) + sum(constraints(x)))
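A toy version of that difference, sketched with SCT's detector interface (entry-point names taken from the current SCT docs, so they may not match the exact version in this PR):

```julia
using SparseConnectivityTracer

objective(x) = x[1] * x[2]
constraints(x) = [x[3]^2]

x = rand(3)
detector = TracerSparsityDetector()

# The Jacobian only ever traces the constraints...
J = jacobian_sparsity(constraints, x, detector)
# ...while the Hessian traces the objective and the constraints together,
# so it exercises a different (larger) set of operator overloads.
H = hessian_sparsity(x -> objective(x) + sum(constraints(x)), x, detector)
```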

gdalle avatar May 23 '24 13:05 gdalle

Goal: all tests here should pass with global sparsity only

Requires #94

gdalle avatar May 23 '24 15:05 gdalle

TODO: make these tests optional

gdalle avatar May 23 '24 15:05 gdalle

@amontoison I did some more investigation on the sparsity pattern discrepancies between us and JuMP:

  • After double-checking with ForwardDiff: every discrepancy is a non-structural zero of the matrix in question, so there is no correctness issue on either side.
  • In all cases but one (the Jacobian of lincon), SCT is more precise than JuMP, that is, the SCT nonzero pattern is included in the JuMP nonzero pattern.
  • The diagonal differences don't matter for coloring Hessians, but there are 4 cases in which non-diagonal differences occur: britgas, channel, hs114 and marine. I haven't checked what it means for coloring, but it could possibly lead to fewer colors.

In addition, we made it possible (in v0.5.0) to use global sparsity detection with control flow like x[1] > 1 or ifelse. Thus, the patterns detected by SCT should not depend on the input values (unless you do some weird shit like closures or global variables).
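As an illustration of that v0.5.0 behavior (hedged sketch, API names from the current SCT docs): with ifelse, global tracing merges the patterns of both branches, so the detected pattern covers the whole control flow regardless of the input values.

```julia
using SparseConnectivityTracer

f(x) = ifelse(x[1] > 1, x[1] * x[2], x[3]^2)

# Global tracing unions both branches: the pattern covers the x[1]*x[2]
# cross term as well as the x[3]^2 diagonal term, whatever x[1] is.
H = hessian_sparsity(f, rand(3), TracerSparsityDetector())
```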

To sum up, I think SCT is ready for use in NLPModels, with the caveat that some Hessian sparsity detections are very slow: the largest of the tetra_* instances takes around 2 minutes. That is something I hope to fix with #103 and some other improvements.

Jacobian

Inconsistency for Jacobian of lincon: SCT (19 nz) ⊃ JuMP (17 nz) 

Hessian

Inconsistency for Hessian of aircrfta: SCT (16 nz) ⊂ JuMP (21 nz) [diagonal difference only]
Inconsistency for Hessian of argauss: SCT (8 nz) ⊂ JuMP (9 nz) [diagonal difference only]
Inconsistency for Hessian of biggs5: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
Inconsistency for Hessian of biggs6: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
Inconsistency for Hessian of britgas: SCT (1087 nz) ⊂ JuMP (1415 nz) 
Inconsistency for Hessian of camshape: SCT (395 nz) ⊂ JuMP (494 nz) [diagonal difference only]
Inconsistency for Hessian of chain: SCT (75 nz) ⊂ JuMP (100 nz) [diagonal difference only]
Inconsistency for Hessian of channel: SCT (696 nz) ⊂ JuMP (768 nz) 
Inconsistency for Hessian of controlinvestment: SCT (100 nz) ⊂ JuMP (200 nz) [diagonal difference only]
Inconsistency for Hessian of hs106: SCT (10 nz) ⊂ JuMP (18 nz) [diagonal difference only]
Inconsistency for Hessian of hs114: SCT (19 nz) ⊂ JuMP (24 nz) 
Inconsistency for Hessian of hs116: SCT (27 nz) ⊂ JuMP (34 nz) [diagonal difference only]
Inconsistency for Hessian of hs250: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
Inconsistency for Hessian of hs251: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
Inconsistency for Hessian of hs36: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
Inconsistency for Hessian of hs37: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
Inconsistency for Hessian of hs40: SCT (15 nz) ⊂ JuMP (16 nz) [diagonal difference only]
Inconsistency for Hessian of hs41: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
Inconsistency for Hessian of hs44: SCT (8 nz) ⊂ JuMP (12 nz) [diagonal difference only]
Inconsistency for Hessian of hs45: SCT (20 nz) ⊂ JuMP (25 nz) [diagonal difference only]
Inconsistency for Hessian of hs56: SCT (10 nz) ⊂ JuMP (13 nz) [diagonal difference only]
Inconsistency for Hessian of hs68: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
Inconsistency for Hessian of hs69: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
Inconsistency for Hessian of hs83: SCT (15 nz) ⊂ JuMP (19 nz) [diagonal difference only]
Inconsistency for Hessian of hs84: SCT (8 nz) ⊂ JuMP (13 nz) [diagonal difference only]
Inconsistency for Hessian of hs87: SCT (9 nz) ⊂ JuMP (11 nz) [diagonal difference only]
Inconsistency for Hessian of hs93: SCT (34 nz) ⊂ JuMP (36 nz) [diagonal difference only]
Inconsistency for Hessian of hs95: SCT (12 nz) ⊂ JuMP (17 nz) [diagonal difference only]
Inconsistency for Hessian of hs96: SCT (12 nz) ⊂ JuMP (17 nz) [diagonal difference only]
Inconsistency for Hessian of hs97: SCT (12 nz) ⊂ JuMP (17 nz) [diagonal difference only]
Inconsistency for Hessian of hs98: SCT (12 nz) ⊂ JuMP (17 nz) [diagonal difference only]
Inconsistency for Hessian of marine: SCT (186 nz) ⊂ JuMP (239 nz) 
Inconsistency for Hessian of polygon1: SCT (550 nz) ⊂ JuMP (600 nz) [diagonal difference only]
Inconsistency for Hessian of polygon2: SCT (350 nz) ⊂ JuMP (400 nz) [diagonal difference only]
Inconsistency for Hessian of robotarm: SCT (252 nz) ⊂ JuMP (321 nz) [diagonal difference only]

gdalle avatar May 27 '24 09:05 gdalle

with the caveat that some Hessian sparsity detections are very slow

This caveat should come with the meta-caveat that we haven't done any performance optimizations on Hessians yet. We have plenty of ideas that will speed things up significantly (e.g. #105, #80, #34).

adrhill avatar May 27 '24 09:05 adrhill

@gdalle can this be merged?

adrhill avatar May 28 '24 13:05 adrhill

Yep

gdalle avatar May 28 '24 14:05 gdalle