statsmodels
statsmodels copied to clipboard
Zero Inflated Negative Binomial model does not converge
hi!
I would like to refer to @jfhawkin comment and ask if you could look at the convergence of ZINB model because it seems to not be able to converge for zero-inflated count data.
Here is a small example I made:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from scipy.stats import nbinom
### Generate random data
size = 100 # size of the data
n = 10 # number of successes for negative binomial
p = 0.5 # probability of success for negative binomial and Poisson
zero_prob = 0.5 # probability of zero for zero-inflated distributions
zinb_data = nbinom.rvs(n, p, size=size)
zinb_data[np.random.random(size) < zero_prob] = 0
df = pd.DataFrame({
'nb': np.nan,
'zinb': zinb_data,
})
x = df['zinb']
intercept = sm.add_constant(np.ones(len(x)))
mle_zinb = sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(method='bfgs', maxiter=5000, gtol=1e-12)
In my tests, I've noticed that depending on the seed sometimes model is able to converge. I use this setting of the model because I'm trying to replicate the result from zeroinfl() method in R.
I appreciate your time, please, let me know if the example is not clear, I would be happy to explain it better. Thank you!
OUTPUT:
gtol=1e-12)= sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(method='bfgs', maxiter=5000 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in log a1 * np.log(a1) + y * np.log(mu) - /python3.10/site-packages/statsmodels/discrete/count_model.py:167: RuntimeWarning: invalid value encountered in log llf[zero_idx] = (np.log(w[zero_idx] + python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: invalid value encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:4325: RuntimeWarning: overflow encountered in exp return np.exp(linpred) python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in multiply a1 * np.log(a1) + y * np.log(mu) - python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: divide by zero encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3975: RuntimeWarning: invalid value encountered in multiply dparams = (a4 * dgterm - python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in log a1 * np.log(a1) + y * np.log(mu) - python3.10/site-packages/statsmodels/discrete/count_model.py:167: RuntimeWarning: invalid value encountered in log llf[zero_idx] = (np.log(w[zero_idx] + python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: invalid value encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: invalid value encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in log a1 * np.log(a1) + y * np.log(mu) - python3.10/site-packages/statsmodels/discrete/count_model.py:167: RuntimeWarning: invalid value encountered in log llf[zero_idx] = (np.log(w[zero_idx] + Warning: Desired error not necessarily achieved due to precision loss. Current function value: nan Iterations: 16 Function evaluations: 127 Gradient evaluations: 127 python3.10/site-packages/statsmodels/base/model.py:595: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available warnings.warn('Inverting hessian failed, no bse or cov_params ' python3.10/site-packages/statsmodels/base/model.py:607: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals warnings.warn("Maximum Likelihood optimization failed to "
It looks like starting values in this case are not good enough for bfgs. I have not looked at the details yet and whether we can improve starting parameters for this case. I never checked a case with 50% zero prob
using "nm" as initial estimator works, but still produces a warning (most likely during optimization before being close enough to solution) with that as starting parameter, "newton" also converges without warnings
mle_zinb0 = sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(method='nm', maxiter=5000)
Optimization terminated successfully.
Current function value: 1.959794
Iterations: 55
Function evaluations: 107
...\statsmodels\statsmodels\discrete\discrete_model.py:3937: RuntimeWarning: invalid value encountered in log
a1 * np.log(a1) + y * np.log(mu) -
...\statsmodels\statsmodels\discrete\discrete_model.py:3938: RuntimeWarning: invalid value encountered in log
(y + a1) * np.log(a2))
mle_zinb = sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(start_params=mle_zinb0.params, method='newton', maxiter=5000)
Optimization terminated successfully.
Current function value: 1.954969
Iterations 4
print(mle_zinb.summary())
ZeroInflatedNegativeBinomialP Regression Results
=========================================================================================
Dep. Variable: zinb No. Observations: 100
Model: ZeroInflatedNegativeBinomialP Df Residuals: 99
Method: MLE Df Model: 0
Date: Sat, 17 Feb 2024 Pseudo R-squ.: -0.002468
Time: 10:17:26 Log-Likelihood: -195.98
converged: True LL-Null: -195.50
Covariance Type: nonrobust LLR p-value: nan
=================================================================================
coef std err z P>|z| [0.025 0.975]
---------------------------------------------------------------------------------
inflate_const 0.0011 0.200 0.006 0.995 -0.391 0.394
const 2.2415 0.068 33.085 0.000 2.109 2.374
alpha 0.0989 0.045 2.214 0.027 0.011 0.186
=================================================================================
starting value has inflate_const =0 which looks wrong and very strange. We should have at least bounds on it to be in interior of (0, 1). I need to check how this is parameterized. update 0 looks fine, the inflation model is Logit or Probit, 0 means prob-inflation = 0.5
AFAIR, we don't use the zero truncated model as starting parameters for the base distribution.
mle_zinb.mle_settings
{'optimizer': 'bfgs',
'start_params': array([0. , 1.74739121, 2.74769066]),
'maxiter': 5000,
'full_output': 1,
'disp': 1,
'fargs': (),
'callback': <function statsmodels.discrete.count_model.GenericZeroInflated.fit.<locals>.<lambda>(*x)>,
'retall': False,
'extra_fit_funcs': {},
'gtol': 1e-12,
'norm': inf,
'epsilon': 1.4901161193847656e-08}