liblinear: logistic regression loops forever (never returns) on nearly degenerate data when used through scikit-learn

Description

With scikit-learn, LogisticRegression.fit never returns when fitting nearly degenerate data. The scikit-learn maintainers attributed the problem to liblinear.

Steps/Code to Reproduce

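import numpy as np
import sklearn.linear_model

model = sklearn.linear_model.LogisticRegression()
num_pts = 15
# 30 samples with 2 features, all zero ...
x = np.zeros((num_pts * 2, 2))
# ... except row 3, where both features are set to a tiny positive value
x[3] = 3.7491010398553741e-208
# first 15 labels are 0, last 15 are 1
y = np.append(np.zeros(num_pts), np.ones(num_pts))
model.fit(x, y)  # never returns on scikit-learn 0.17.1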

Expected Results

fit either returns or raises an error.

Actual Results

Never returns.

Versions

Linux-2.6.32-573.18.1.el6.x86_64-x86_64-with-redhat-6.7-Carbon
('Python', '2.7.12 |Anaconda 2.0.1 (64-bit)| (default, Jul 2 2016, 17:42:40) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]')
('NumPy', '1.11.0')
('SciPy', '0.17.0')
('Scikit-Learn', '0.17.1')

Sep 26 '16 19:09 MarvinT

Can you try to reproduce it with the liblinear command-line interface? Otherwise it might be a numerical issue caused by us (sklearn). Also, how about scaling your data ;)

Sep 30 '16 01:09 amueller

Thanks for reporting this issue. We looked into it and found that it comes from a gradient norm that is too small at the start, which leads to an infinite loop in the conjugate gradient subroutine. It can be fixed by setting a maximum number of CG iterations. We are going to fix it in the next release. Thanks.
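For illustration only, here is a minimal sketch of a conjugate gradient loop with an iteration cap, the kind of guard described in this comment. It is not liblinear's actual code: the function name truncated_cg, the max_cg_iter parameter and its default value, and the extra curvature check are all assumptions made for this example.

```python
import math


def truncated_cg(hess_vec, g, tol=0.1, max_cg_iter=50):
    """Approximately solve H s = -g with conjugate gradient.

    hess_vec(d) must return the product H d as a list of floats.
    max_cg_iter is a hypothetical cap: the loop can never run longer.
    Returns (s, iterations_used).
    """
    n = len(g)
    s = [0.0] * n
    r = [-gi for gi in g]          # residual of H s = -g at s = 0
    d = list(r)                    # first search direction
    rtr = sum(ri * ri for ri in r)
    cg_tol = tol * math.sqrt(rtr)  # stopping tolerance relative to |g|

    iterations = 0
    while iterations < max_cg_iter:  # the cap guarantees termination
        if math.sqrt(rtr) <= cg_tol:
            break                    # residual is small enough
        hd = hess_vec(d)
        dhd = sum(di * hi for di, hi in zip(d, hd))
        if dhd <= 0.0:
            break                    # no usable curvature along d
        iterations += 1
        alpha = rtr / dhd
        s = [si + alpha * di for si, di in zip(s, d)]
        r = [ri - alpha * hi for ri, hi in zip(r, hd)]
        rtr_new = sum(ri * ri for ri in r)
        beta = rtr_new / rtr
        d = [ri + beta * di for ri, di in zip(r, d)]
        rtr = rtr_new
    return s, iterations


def hess_vec(d):
    # Product with the small symmetric positive definite matrix [[4, 1], [1, 3]]
    return [4.0 * d[0] + d[1], d[0] + 3.0 * d[1]]


step, used = truncated_cg(hess_vec, [-1.0, -2.0])
print(step, used)
```

Without the cap, a loop like this relies entirely on the residual test to stop, so if floating-point trouble keeps the residual from shrinking it can spin forever. With the cap, the worst case is a less accurate step, which the outer solver can still use or reject.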

Sep 30 '16 05:09 infwinston

Thanks, that's awesome.

Sorry for not providing a more precise source of the error.

Sep 30 '16 07:09 MarvinT

This issue was moved to angleto/liblinear#10

Jan 22 '18 23:01 simsong