statsmodels Zero Inflated Negative Binomial model does not converge

hi!

I would like to refer to @jfhawkin comment and ask if you could look at the convergence of ZINB model because it seems to not be able to converge for zero-inflated count data.

Here is a small example I made:

import pandas as pd
import numpy as np
import statsmodels.api as sm
from scipy.stats import nbinom

### Generate random data
size = 100  # size of the data
n = 10  # number of successes for negative binomial
p = 0.5  # probability of success for negative binomial and Poisson
zero_prob = 0.5  # probability of zero for zero-inflated distributions

zinb_data = nbinom.rvs(n, p, size=size)
zinb_data[np.random.random(size) < zero_prob] = 0

df = pd.DataFrame({
            'nb': np.nan,
            'zinb': zinb_data,
        })
        
x = df['zinb']
intercept = sm.add_constant(np.ones(len(x)))

mle_zinb = sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(method='bfgs', maxiter=5000, gtol=1e-12)

In my tests, I've noticed that depending on the seed sometimes model is able to converge. I use this setting of the model because I'm trying to replicate the result from zeroinfl() method in R.

I appreciate your time, please, let me know if the example is not clear, I would be happy to explain it better. Thank you!

OUTPUT:

gtol=1e-12)= sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(method='bfgs', maxiter=5000 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in log a1 * np.log(a1) + y * np.log(mu) - /python3.10/site-packages/statsmodels/discrete/count_model.py:167: RuntimeWarning: invalid value encountered in log llf[zero_idx] = (np.log(w[zero_idx] + python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: invalid value encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:4325: RuntimeWarning: overflow encountered in exp return np.exp(linpred) python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in multiply a1 * np.log(a1) + y * np.log(mu) - python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: divide by zero encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3975: RuntimeWarning: invalid value encountered in multiply dparams = (a4 * dgterm - python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in log a1 * np.log(a1) + y * np.log(mu) - python3.10/site-packages/statsmodels/discrete/count_model.py:167: RuntimeWarning: invalid value encountered in log llf[zero_idx] = (np.log(w[zero_idx] + python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: invalid value encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3972: RuntimeWarning: invalid value encountered in log dgterm = dgpart + np.log(a1 / a2) + 1 - a3 / a2 python3.10/site-packages/statsmodels/discrete/discrete_model.py:3935: RuntimeWarning: invalid value encountered in log a1 * np.log(a1) + y * np.log(mu) - python3.10/site-packages/statsmodels/discrete/count_model.py:167: RuntimeWarning: invalid value encountered in log llf[zero_idx] = (np.log(w[zero_idx] + Warning: Desired error not necessarily achieved due to precision loss. Current function value: nan Iterations: 16 Function evaluations: 127 Gradient evaluations: 127 python3.10/site-packages/statsmodels/base/model.py:595: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available warnings.warn('Inverting hessian failed, no bse or cov_params ' python3.10/site-packages/statsmodels/base/model.py:607: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals warnings.warn("Maximum Likelihood optimization failed to "

Feb 17 '24 14:02 Vlasovets

It looks like starting values in this case are not good enough for bfgs. I have not looked at the details yet and whether we can improve starting parameters for this case. I never checked a case with 50% zero prob

using "nm" as initial estimator works, but still produces a warning (most likely during optimization before being close enough to solution) with that as starting parameter, "newton" also converges without warnings

mle_zinb0 = sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(method='nm', maxiter=5000)

Optimization terminated successfully.
         Current function value: 1.959794
         Iterations: 55
         Function evaluations: 107
...\statsmodels\statsmodels\discrete\discrete_model.py:3937: RuntimeWarning: invalid value encountered in log
  a1 * np.log(a1) + y * np.log(mu) -
...\statsmodels\statsmodels\discrete\discrete_model.py:3938: RuntimeWarning: invalid value encountered in log
  (y + a1) * np.log(a2))


mle_zinb = sm.ZeroInflatedNegativeBinomialP(x, intercept).fit(start_params=mle_zinb0.params, method='newton', maxiter=5000)

Optimization terminated successfully.
         Current function value: 1.954969
         Iterations 4
print(mle_zinb.summary())
                     ZeroInflatedNegativeBinomialP Regression Results                    
=========================================================================================
Dep. Variable:                              zinb   No. Observations:                  100
Model:             ZeroInflatedNegativeBinomialP   Df Residuals:                       99
Method:                                      MLE   Df Model:                            0
Date:                           Sat, 17 Feb 2024   Pseudo R-squ.:               -0.002468
Time:                                   10:17:26   Log-Likelihood:                -195.98
converged:                                  True   LL-Null:                       -195.50
Covariance Type:                       nonrobust   LLR p-value:                       nan
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
inflate_const     0.0011      0.200      0.006      0.995      -0.391       0.394
const             2.2415      0.068     33.085      0.000       2.109       2.374
alpha             0.0989      0.045      2.214      0.027       0.011       0.186
=================================================================================

Feb 17 '24 15:02 josef-pkt

starting value has inflate_const =0 which looks wrong and very strange. We should have at least bounds on it to be in interior of (0, 1). I need to check how this is parameterized. update 0 looks fine, the inflation model is Logit or Probit, 0 means prob-inflation = 0.5

AFAIR, we don't use the zero truncated model as starting parameters for the base distribution.

mle_zinb.mle_settings
{'optimizer': 'bfgs',
 'start_params': array([0.        , 1.74739121, 2.74769066]),
 'maxiter': 5000,
 'full_output': 1,
 'disp': 1,
 'fargs': (),
 'callback': <function statsmodels.discrete.count_model.GenericZeroInflated.fit.<locals>.<lambda>(*x)>,
 'retall': False,
 'extra_fit_funcs': {},
 'gtol': 1e-12,
 'norm': inf,
 'epsilon': 1.4901161193847656e-08}

Feb 17 '24 15:02 josef-pkt

statsmodels statsmodels copied to clipboard

Zero Inflated Negative Binomial model does not converge

statsmodels
statsmodels copied to clipboard