MixedModels.jl icon indicating copy to clipboard operation
MixedModels.jl copied to clipboard

Error in `profile(m)` mutates `m` and fails to clean up

Open yjunechoe opened this issue 1 year ago • 4 comments

I've found that sometimes profile(m) throws ERROR: AssertionError: i ∈ iallowed || iszero(A[i, j]), which mutates the model m.

MWE with a semi-synthesized portion of the data from my usecase:

Setup

using CSV
using MixedModels
x = CSV.read(download("https://gist.githubusercontent.com/yjunechoe/af4a8dd3e2b0343fe3818e93dba8c81d/raw/x.csv"), Table);
m = fit(MixedModel, @formula(y ~ x1 * x2 + (1 | g)), x);
m
Linear mixed model fit by maximum likelihood
 y ~ 1 + x1 + x2 + x1 & x2 + (1 | g)
   logLik   -2 logLik     AIC       AICc        BIC
  -860.9412  1721.8823  1733.8823  1733.9669  1763.3289

Variance components:
            Column    Variance  Std.Dev.
g        (Intercept)  0.0022759 0.0477069
Residual              0.3264997 0.5714015
 Number of obs: 1000; levels of grouping factors: 3

  Fixed-effects parameters:
─────────────────────────────────────────────────────────
                      Coef.  Std. Error       z  Pr(>|z|)
─────────────────────────────────────────────────────────
(Intercept)      -1.3144      0.0726041  -18.10    <1e-72
x1: a2            1.1196      0.0963029   11.63    <1e-30
x2: b2           -0.0839275   0.0876926   -0.96    0.3385
x1: a2 & x2: b2   0.189803    0.105979     1.79    0.0733
─────────────────────────────────────────────────────────

Surprising behavior

profile(m, threshold=3);
# ERROR: AssertionError: i ∈ iallowed || iszero(A[i, j])
Full stacktrace
ERROR: AssertionError: i ∈ iallowed || iszero(A[i, j])
Stacktrace:
  [1] _check_locality(A::StaticArraysCore.SMatrix{3, 2, Float64, 6}, corner::Symbol, allowed_nonzeros_per_column::Int64)
    @ BSplineKit.Recombinations C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\matrices.jl:133
  [2] _RecombineMatrix
    @ C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\matrices.jl:102 [inlined]
  [3] _RecombineMatrix
    @ C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\matrices.jl:81 [inlined]
  [4] RecombineMatrix
    @ C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\matrices.jl:183 [inlined]
  [5] RecombineMatrix
    @ C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\matrices.jl:172 [inlined]
  [6] RecombinedBSplineBasis
    @ C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\bases.jl:222 [inlined]
  [7] RecombinedBSplineBasis
    @ C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\Recombinations\bases.jl:229 [inlined]
  [8] interpolate(x::Vector{…}, y::Vector{…}, k::BSplineKit.BSplines.BSplineOrder{…}, bc::BSplineKit.BoundaryConditions.Natural)
    @ BSplineKit.SplineInterpolations C:\Users\jchoe\.julia\packages\BSplineKit\OQJIx\src\SplineInterpolations\SplineInterpolations.jl:274
  [9] profileσs!(val::@NamedTuple{…}, tc::MixedModels.TableColumns{…}; nzlb::Float64)
    @ MixedModels C:\Users\jchoe\.julia\packages\MixedModels\hs2Ke\src\profile\vcpr.jl:81
 [10] profileσs!(val::@NamedTuple{…}, tc::MixedModels.TableColumns{…})
    @ MixedModels C:\Users\jchoe\.julia\packages\MixedModels\hs2Ke\src\profile\vcpr.jl:36
 [11] profile(m::LinearMixedModel{Float64}; threshold::Int64)
    @ MixedModels C:\Users\jchoe\.julia\packages\MixedModels\hs2Ke\src\profile\profile.jl:49
 [12] top-level scope
    @ REPL[6]:1
m
Linear mixed model fit by maximum likelihood
 y ~ 1 + x1 + x2 + x1 & x2 + (1 | g)
   logLik   -2 logLik     AIC       AICc        BIC
        NaN        NaN        NaN        NaN        NaN

Variance components:
            Column   VarianceStd.Dev.
g        (Intercept)  0.0 0.0
Residual              0.0 0.0
 Number of obs: 1000; levels of grouping factors: 3

  Fixed-effects parameters:
───────────────────────────────────────────────────────
                      Coef.  Std. Error     z  Pr(>|z|)
───────────────────────────────────────────────────────
(Intercept)      -1.3144            0.0  -Inf    <1e-99
x1: a2            1.1196            0.0   Inf    <1e-99
x2: b2           -0.0839275         0.0  -Inf    <1e-99
x1: a2 & x2: b2   0.189803          0.0   Inf    <1e-99
───────────────────────────────────────────────────────

yjunechoe avatar Nov 27 '24 17:11 yjunechoe

I don't know why "threshold=3" gives the error, but "threshold=4" appears to work.

kliegl avatar Nov 27 '24 18:11 kliegl

There are two issues here:

  1. recovery from failed profiling
  2. failure of profiling.

The first is easier to fix and I'll something addressing it in the next few days.

palday avatar Dec 09 '24 20:12 palday

I don't know why "threshold=3" gives the error, but "threshold=4" appears to work.

I think this is an artifact of execution order. If you go to threshold=4 in a clean session (without doing threshold=3 before), it should also fail.

palday avatar Dec 09 '24 20:12 palday

Okay, very weird, I get the error on threshold=3 but not threshold=4. But threshold=3 should be part of threshold=4. 🤔

palday avatar Dec 11 '24 19:12 palday

I think I understand the underlying problem with threshold=3: the cubic spline that we fit to the profiled points for interpolation just doesn't work without the extra data that threshold=4 gives you for this dataset. I don't know if it's an issue in BSplineKit's approach or a deeper numerical issue, but e.g. threshold=3.75 and above works, but anything below that doesn't. I'm going to go ahead and mark this issue as closed when #857 lands.

palday avatar Aug 25 '25 09:08 palday