"SOLVE kinetic STEADYSTATE sparse" is not solving for steady state.
It appears that the NMODL implementation contains terms involving `nt->dt`, whereas nocmodl makes use of `_ss_sparse_thread`, which iteratively solves a nonlinear system using `auto const ss_dt = 1e9;`.
(Ideally the solution should use the conservation constraints in place of an equal number of ODEs, to avoid trying to solve a singular matrix, but that is not directly relevant to this issue.) This was observed when comparing NMODL and nocmodl results for nrn/share/examples/nrniv/nmodl/capmp.mod when launching nrngui capmp.hoc.
The left column is for NMODL, the right is for nocmodl. (note: I'm using NMODL branch 1uc/fix-cadifpmp-shift).
The top row shows calcium concentration as a function of time for three cell diameters. In every case the intention was to initialize the concentration to 0.01 mM (single compartment). The NMODL 100 diam (black line) case calculated an initial steady state of approximately 0.0076 instead of the correct 0.01.
Ignore the bottom row. I'm showing it because the missing beginning and ending of the steady-state calcium current vs. constant calcium concentration curve for the NMODL simulation is a puzzle (but likely not relevant to the initialization issue for the large diameter case).
I believe the steady-state initialization issue is due to using the value of `nt->dt` in the steady-state equations of the NMODL-generated capmp.cpp file. Evidence for this is:

```
oc>diam = 100
oc>init()
oc>cai
	0.0076215222
oc>dt
	0.025
oc>dt=10
oc>init()
oc>cai
	0.0099921471
```
but my naive expectations are dashed with

```
oc>dt=1e9
oc>init()
special: /home/hines/neuron/temp/share/examples/nrniv/nmodl/x86_64/capmp.cpp:824: void neuron::nrn_init_capr(const {anonymous}::_nrn_model_sorted_token&, NrnThread*, Memb_list*, int): Assertion `false && "Newton solver did not converge!"' failed.
Aborted (core dumped)
```
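The dt dependence above is exactly what one would expect if the "steady state" is really a single implicit-Euler step of size dt from the initial values: the result only approaches the true steady state as dt grows. Here is a toy scalar sketch, not the capmp equations; the rate constant `k`, target `x_ss`, and starting value are hypothetical:

```python
# Sketch: if "steady state" is computed as one implicit-Euler step from x0
# with the ambient dt, the answer depends on dt and only approaches the
# true steady state x_ss as dt -> infinity.
def implicit_euler_step(x0, dt, k=1.0, x_ss=0.01):
    # Solve (x - x0)/dt = -k*(x - x_ss) for x.
    return (x0 + dt * k * x_ss) / (1.0 + dt * k)

x0 = 0.02                                 # hypothetical initial value
print(implicit_euler_step(x0, dt=0.025))  # far from x_ss
print(implicit_euler_step(x0, dt=10.0))   # much closer
print(implicit_euler_step(x0, dt=1e9))    # essentially x_ss = 0.01
```

This mirrors the transcript: a small dt leaves the result well off the true steady state, dt=10 gets close, and only a huge dt recovers it.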
Thank you, I can reproduce the issue and I'll start debugging now. STEADYSTATE is a very good next keyword to work on.
I'm using the following simplified MOD file:
```
NEURON {
    SUFFIX minipump
}

PARAMETER {
    volA = 1e9
    volB = 1e9
    volC = 13.0

    kf = 3.0
    kb = 4.0
}

STATE {
    X
    Y
    Z
}

INITIAL {
    X = 40.0
    Y = 2.0
    Z = 0.0

    SOLVE state STEADYSTATE sparse
}

: Required b/c of a bug in NMODL.
BREAKPOINT {
    SOLVE state METHOD sparse
}

KINETIC state {
    COMPARTMENT volA {X}
    COMPARTMENT volB {Y}
    COMPARTMENT volC {Z}

    ~ X + Y <-> Z (kf, kb)

    CONSERVE Y + Z = 2.0
}
```
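For reference, the steady state of this system can be computed directly by keeping one rate equation and replacing the redundant ones with conservation relations (the approach suggested above for avoiding a singular matrix). This is a hypothetical standalone check, not generated code; the third row is the conservation for X implied by the compartment volumes:

```python
import numpy as np

# Parameters and initial values from the MOD file above.
volA, volB, volC = 1e9, 1e9, 13.0
kf, kb = 3.0, 4.0
X0, Y0, Z0 = 40.0, 2.0, 0.0

def F(v):
    X, Y, Z = v
    return np.array([
        kf * X * Y - kb * Z,                            # reaction in equilibrium
        volB * Y + volC * Z - (volB * Y0 + volC * Z0),  # CONSERVE row for Y, Z
        volA * X + volC * Z - (volA * X0 + volC * Z0),  # implied conservation for X
    ])

def J(v):
    X, Y, Z = v
    return np.array([
        [kf * Y, kf * X, -kb],
        [0.0, volB, volC],
        [volA, 0.0, volC],
    ])

v = np.array([X0, Y0, Z0])
for _ in range(50):
    v = v - np.linalg.solve(J(v), F(v))
print(v)  # roughly X ~ 40, Y ~ 2, Z ~ 60, since kf*40*2 == kb*60
```

Because the ODE rows are replaced by (linear) conservation rows, the Jacobian is nonsingular and Newton converges without any dt regularization.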
The equation for X in NOCMODL is:
```
volA * (X - X_old)/dt = -X*Y * kf + Z * kb
```
In NMODL we divide the above by volA. For small values of vol? the difference doesn't matter, but for volA = 1e9 we suddenly start seeing failures to converge (to the correct solution).
I'll create a PR that adopts the battle-tested scaling factors used in NOCMODL.
However, it's not enough to make capmp.mod work as expected.
I think it comes down to these three things:
- scaling of the equations,
- the tolerances,
- one case of cancellation magic.
Scaling ODE terms:
```
volA * (X - X_old)/dt = -X*Y * kf + Z * kb    # NOCMODL
(X - X_old)/dt = (-X*Y * kf + Z * kb)/volA    # NMODL
```
Scaling CONSERVE terms:
```
volB*Y + volC*Z = const          # NOCMODL
Z = (volB*Y - const)/volC        # NMODL
```
The tolerances involved are 1e-5 with NOCMODL vs. 1e-12 for NMODL. (But since the equations are scaled differently it's hard to compare the two numbers.) Note that NOCMODL uses an l1 error while NMODL uses an l2 error.
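To make the "hard to compare" point concrete: evaluating the X equation at the same slightly-off point under the two scalings gives residuals that differ by a factor of volA. The numbers below are toy values reusing the minipump parameters:

```python
# Residual of the X equation at X = X_old with Z slightly off its
# equilibrium value of 60, under both scalings.
volA, kf, kb = 1e9, 3.0, 4.0
X = X_old = 40.0
Y, Z, dt = 2.0, 59.9, 1e9

rate = -X * Y * kf + Z * kb                # ~ -0.4

r_nocmodl = volA * (X - X_old) / dt - rate # ~ 0.4
r_nmodl = (X - X_old) / dt - rate / volA   # ~ 4e-10

print(r_nocmodl, r_nmodl)
```

So a 1e-5 tolerance on the NOCMODL-scaled residual corresponds to roughly a 1e-14 tolerance on the NMODL-scaled one; the two tolerance constants are not measuring the same thing.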
Finally, there's the issue that the CONSERVE term has very favourable cancellation properties in NOCMODL:

```
volB*Y + volC*Z - const
```

Assume volB >> volC, Y ~= Z, and const = volB*Y(t=0) + volC*Z(t=0). Then, in floating-point arithmetic, we get that:

```
volB*Y + volC*Z == volB*Y
```

Hence,

```
(volB*Y + volC*Z) - const == 0.0     # NOCMODL
(volB*Y - const) + volC*Z == volC*Z  # NMODL
```

Then, depending on the size of volC*Z, this leftover can be large compared to the tolerance of 1e-5 (simply because volC*Z is big, even though volB*Y is much bigger).
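The absorption step can be demonstrated in a few lines. With the actual minipump values in double precision, volC*Z (= 26) still sits above the ulp of volB*Y (~128 only in single precision at 2e9), so this sketch exaggerates the effect with float32, where volC/volB is below the machine epsilon and the absorption happens exactly as described:

```python
import numpy as np

f32 = np.float32
volB, volC = f32(1e9), f32(13.0)
Y, Z = f32(2.0), f32(2.0)          # assume Y ~= Z as in the text
const = volB * Y + volC * Z        # 26 < ulp(2e9)/2 = 64, rounds to volB*Y

r_nocmodl = (volB * Y + volC * Z) - const  # the small term cancels: 0.0
r_nmodl = (volB * Y - const) + volC * Z    # the small term survives: 26.0

print(r_nocmodl, r_nmodl)
```

The same mechanism operates in double precision whenever volC*Z falls below half an ulp of volB*Y; only the threshold ratio changes.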
> Note that NOCMODL uses an l1 error while NMODL uses an l2 error.
Not familiar with the l1 and l2 error terminology. What do those mean? Very sorry that you needed to get so deeply into these numerical error analysis weeds.
Using numpy notation:
```
l1_err = np.sum(np.abs(x))
l2_err = np.sum(x**2)**0.5
```
(it's unlikely a really important part of the discussion.)
Might as well make it a complete answer: mathematicians like to unify; here the exponent can be replaced with a symbol (p is the favourite):
```
lp_error = np.sum(np.abs(x)**p)**(1.0/p)
```
With a little abuse of notation one can allow p = inf:
```
linf_error = np.max(np.abs(x))
```
Since we anyway have access to the Jacobian, we can (somewhat) inexpensively do Newton error propagation: https://en.wikipedia.org/wiki/Propagation_of_uncertainty#Simplification
The idea is that it should give us an estimate of how big we'd expect round-off errors to be. Once we know that we can pick a tolerance, e.g. 100.0 "newton errors". The good thing is that the estimate is per element of F. We simply require each element of F to have converged.
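One way to read this (a sketch, not the exact implementation that landed): propagate a relative roundoff of about machine epsilon in each unknown through the Jacobian to get a per-element noise floor for F, and declare convergence once every |F_i| is within a safety factor, the "newton errors", of that floor. All names below are hypothetical:

```python
import numpy as np

EPS = np.finfo(float).eps

def roundoff_in_F(J, x):
    # First-order error propagation (cf. the linked Wikipedia section):
    # if each x_j carries an absolute roundoff of ~EPS*|x_j|, the induced
    # error in F_i is roughly sum_j |dF_i/dx_j| * EPS * |x_j|.
    return EPS * np.abs(J) @ np.abs(x)

def newton(F, J, x, safety=100.0, max_iter=50):
    for _ in range(max_iter):
        f, jac = F(x), J(x)
        # Per-element test: every component of F must be within
        # `safety` "newton errors" of zero.
        if np.all(np.abs(f) <= safety * roundoff_in_F(jac, x)):
            return x
        x = x - np.linalg.solve(jac, f)
    raise RuntimeError("Newton solver did not converge!")

# Toy example: F(x) = x^2 - 2, root sqrt(2).
root = newton(lambda x: np.array([x[0] ** 2 - 2.0]),
              lambda x: np.array([[2.0 * x[0]]]),
              np.array([1.0]))
print(root)
```

The appeal is that the stopping criterion adapts to the scaling of each row of F, so badly scaled systems (volA = 1e9) don't need a hand-tuned absolute tolerance.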
It requires two fixes:

- Set `dt = 1e9` before computing the steady state.
- Use Newton error propagation to estimate how large roundoff errors in `F` are.
If we do both we get the plot below. (I have only tested this and the minipump.mod posted above.)
Looks good. (I wonder where my above Assertion `false && "Newton solver did not converge!"' came from in response to oc>dt=1e9? But that's just idle curiosity.)
I don't see a segfault, only a very long error message leading to an assert followed by a core dump:
```
special: /home/hines/neuron/temp/share/examples/nrniv/nmodl/x86_64/capmp.cpp:824: void neuron::nrn_init_capr(const {anonymous}::_nrn_model_sorted_token&, NrnThread*, Memb_list*, int): Assertion `false && "Newton solver did not converge!"' failed.
```
I'm happy to follow up on the SEGFAULT, if it doesn't go away after this has been made clean and merged.
Sorry. Changed the comment. There was no segfault.
That's plausibly explained by something I didn't mention in the summary. One more difference is how F scales with dt:

```
(X - X_old)/dt = dX    # NOCMODL
X = X_old + dt*dX      # NMODL
```
meaning the residual would grow linearly with dt. Hence, it would make sense that it starts failing if one pushes dt to very large values.
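The linear growth is easy to check with a toy linear rate dX = -(X - 0.5): evaluate both residual forms at a point perturbed by 1e-10 from the exact implicit-Euler solution (all numbers below are hypothetical, chosen only to show the scaling):

```python
dt = 1e9
X_old, d = 1.0, 1e-10  # initial value and perturbation

def dX(X):
    # toy rate function with steady state at 0.5
    return -(X - 0.5)

# Exact implicit-Euler solution of X = X_old + dt*dX(X), then perturb it.
X_star = (X_old + 0.5 * dt) / (1.0 + dt)
X = X_star + d

r_nocmodl = (X - X_old) / dt - dX(X)  # ~ d*(1 + dt)/dt ~ 1e-10
r_nmodl = X - X_old - dt * dX(X)      # ~ d*(1 + dt)   ~ 0.1

print(r_nocmodl, r_nmodl)
```

For the same perturbation, the NMODL-style residual is about dt times larger, so a fixed tolerance that works at dt = 0.025 can plausibly fail at dt = 1e9.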
The changes have been implemented here: #1403.