Infinite loop or never returns for logistic regression in nearly degenerate case using scikit learn

Open MarvinT opened this issue 9 years ago • 4 comments

Description

With scikit-learn, fitting LogisticRegression on nearly degenerate data never returns. The scikit-learn developers attributed the problem to liblinear.

Steps/Code to Reproduce

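import numpy as np
import sklearn.linear_model

# scikit-learn 0.17.1 fits LogisticRegression with the liblinear solver by default.
model = sklearn.linear_model.LogisticRegression()
num_pts = 15
# All-zero features, except for one row holding an extremely small value.
x = np.zeros((num_pts * 2, 2))
x[3] = 3.7491010398553741e-208
# First half labelled 0, second half labelled 1.
y = np.append(np.zeros(num_pts), np.ones(num_pts))
model.fit(x, y)  # never returns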

Expected Results

Return or throw error.

Actual Results

Never returns.

Versions

Linux-2.6.32-573.18.1.el6.x86_64-x86_64-with-redhat-6.7-Carbon
('Python', '2.7.12 |Anaconda 2.0.1 (64-bit)| (default, Jul 2 2016, 17:42:40) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]')
('NumPy', '1.11.0')
('SciPy', '0.17.0')
('Scikit-Learn', '0.17.1')

MarvinT, Sep 26 '16 19:09

Can you try to reproduce it with liblinear's command-line interface? Otherwise it might be a numerical issue caused by us (sklearn). Also, how about scaling your data ;)
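For anyone wanting to try the command-line route, the reproduction data first has to be written in the sparse LIBSVM text format that liblinear's train program reads. The following is a minimal sketch in plain Python. The helper name to_libsvm_lines and the file name degenerate.txt are made up for illustration, and zero-valued features are simply omitted, as the format allows.

```python
import os
import tempfile


def to_libsvm_lines(rows, labels):
    """Render dense rows as LIBSVM lines: '<label> <index>:<value> ...'.

    Feature indices start at 1 and zero-valued features are left out.
    """
    lines = []
    for row, label in zip(rows, labels):
        feats = " ".join(
            f"{j}:{v!r}" for j, v in enumerate(row, start=1) if v != 0.0
        )
        lines.append(f"{int(label)} {feats}".rstrip())
    return lines


# Same data as in the report: 30 all-zero rows, except row 3.
num_pts = 15
tiny = 3.7491010398553741e-208
rows = [[0.0, 0.0] for _ in range(2 * num_pts)]
rows[3] = [tiny, tiny]
labels = [0] * num_pts + [1] * num_pts

lines = to_libsvm_lines(rows, labels)

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.gettempdir(), "degenerate.txt")
with open(path, "w") as fh:
    fh.write("\n".join(lines) + "\n")
print(path)
```

With a built liblinear checkout, a call along the lines of train -s 0 -c 1 -B 1 -e 0.0001 followed by the printed path should roughly mirror scikit-learn's defaults (L2-regularized logistic regression, C=1, intercept enabled, tol=1e-4). That mapping is an assumption, not something confirmed in this thread.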

amueller, Sep 30 '16 01:09

Thanks for reporting this issue. We looked into it and found that the problem comes from a gradient norm that is too small at the start, which leads to an infinite loop in the conjugate gradient subroutine. It can be fixed by setting a maximum number of CG iterations. We are going to fix it in the next release. Thanks.
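To illustrate the idea of capping CG iterations, here is a rough sketch in plain Python. It is not liblinear's code: liblinear's solver is a trust-region Newton method, and the trust-region parts are left out. The names cg_solve and ieee_div are hypothetical. The failure mechanism shown is also an assumption: with a gradient around 1e-200, the squared norms underflow to zero, the step size becomes 0/0 = NaN, and a NaN residual never passes the stopping test. The ieee_div helper only exists to mimic how C doubles divide by zero.

```python
import math


def ieee_div(a, b):
    """Divide like C doubles: x / 0 yields inf or nan instead of raising."""
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0.0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a) * math.copysign(1.0, b)


def cg_solve(hess_vec, g, tol_ratio=0.1, max_iter=50):
    """Approximately solve H s = -g; returns (s, iterations, converged)."""
    s = [0.0] * len(g)            # current step
    r = [-gi for gi in g]         # residual of H s = -g
    d = list(r)                   # search direction
    cg_tol = tol_ratio * math.hypot(*g)   # hypot avoids underflow in the norm
    r_dot_r = sum(ri * ri for ri in r)    # underflows to 0.0 for a tiny g
    for it in range(max_iter):            # the cap: at most max_iter passes
        if math.hypot(*r) <= cg_tol:      # always False once r holds NaN
            return s, it, True
        hd = hess_vec(d)
        alpha = ieee_div(r_dot_r, sum(di * hi for di, hi in zip(d, hd)))
        s = [si + alpha * di for si, di in zip(s, d)]
        r = [ri - alpha * hi for ri, hi in zip(r, hd)]
        r_new = sum(ri * ri for ri in r)
        beta = ieee_div(r_new, r_dot_r)
        d = [ri + beta * di for ri, di in zip(r, d)]
        r_dot_r = r_new
    return s, max_iter, False             # gave up instead of looping forever


identity = lambda v: list(v)              # stand-in Hessian-vector product
print(cg_solve(identity, [1.0, 2.0]))
print(cg_solve(identity, [1e-200, 1e-200], max_iter=25))
```

The first call converges after a single iteration. The second one never satisfies the tolerance and returns only because of the cap, with converged set to False so the caller can react.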

infwinston, Sep 30 '16 05:09

Thanks, that's awesome.

Sorry for not providing a more precise source of the error.

MarvinT, Sep 30 '16 07:09

This issue was moved to angleto/liblinear#10

simsong, Jan 22 '18 23:01