smartcore

LogisticRegression infinite loop

Open levkk opened this issue 3 years ago • 4 comments

When training on a dataset that has too many classes, e.g. scikit-learn's diabetes dataset, smartcore::linear::logistic_regression::LogisticRegression runs into an infinite loop.

Reproducible example attached.

main.rs.txt
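For context on the "too many classes" condition: the coefficient shape reported further down this thread (214 rows) is consistent with every distinct target value being treated as its own class. Below is a minimal sketch, not smartcore code, of counting distinct labels in a float target vector. The helper name `count_classes` and the sample values are hypothetical and made up for illustration.

```rust
use std::collections::BTreeSet;

/// Counts distinct label values in a float target vector.
/// Hypothetical helper for illustration; not part of smartcore.
fn count_classes(y: &[f64]) -> usize {
    // f64 implements neither Ord nor Hash, so compare by bit pattern.
    // Caveat: 0.0 and -0.0 have different bit patterns and count as two labels.
    y.iter().map(|v| v.to_bits()).collect::<BTreeSet<u64>>().len()
}

fn main() {
    // Made-up targets; a continuous regression target tends to have
    // almost as many distinct values as it has samples.
    let y = [1.0, 2.5, 2.5, 3.0, 1.0, 4.0];
    println!("{} samples, {} distinct classes", y.len(), count_classes(&y));
}
```

Running a check like this on the target column before fitting a classifier shows quickly whether the data is really a regression target.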

levkk avatar Sep 08 '22 02:09 levkk

Thanks for your report.

Looks like there is a problem with this call:

let result = LogisticRegression::minimize(x0, objective);

In particular, the optimization step:

optimizer.optimize(&f, &df, &x0, &ls)

Mec-iS avatar Sep 12 '22 16:09 Mec-iS

Looks like this is not a bug; the computation just takes a long time (~8 minutes on my old laptop) to get through this loop.

How long does it take with any popular Python library?

Mec-iS avatar Sep 12 '22 17:09 Mec-iS

Less than a second:

time python main.py 
[200.]

real	0m0.390s
user	0m0.592s
sys	0m0.878s

Code attached (sorry, it's a long file because of formatting).

main.py.txt

levkk avatar Sep 12 '22 17:09 levkk

The loop stops only after the limit of 1000 iterations is reached, so I suppose the algorithm is not doing its job correctly. Could you confirm that these results are correct, please:

lr.coefficients().shape() -> (214, 10)
lr.intercept().shape() -> (214, 1)
lr.coefficients().get(0, 0) -> 10.926448436791203
lr.intercept().get(0, 0) -> 20.061156556933966
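To illustrate the difference between stopping on convergence and stopping on the iteration cap, here is a minimal sketch. It uses plain fixed-step gradient descent on a toy 1-D function, which is an assumption made for illustration and not smartcore's optimizer. The names `RunResult` and `gradient_descent`, the step sizes, and the tolerance are all hypothetical.

```rust
/// Outcome of an iterative minimization run (hypothetical type).
struct RunResult {
    x: f64,
    iterations: usize,
    converged: bool,
}

/// Fixed-step gradient descent on a 1-D function, given its derivative `df`.
/// Stops when |gradient| < tol, or when max_iter is reached, whichever comes first.
fn gradient_descent(
    df: impl Fn(f64) -> f64,
    x0: f64,
    step: f64,
    tol: f64,
    max_iter: usize,
) -> RunResult {
    let mut x = x0;
    for i in 0..max_iter {
        let g = df(x);
        if g.abs() < tol {
            // Tolerance met: report how many updates were needed.
            return RunResult { x, iterations: i, converged: true };
        }
        x -= step * g;
    }
    // Tolerance never met: only the cap ended the loop.
    RunResult { x, iterations: max_iter, converged: false }
}

fn main() {
    // f(x) = (x - 3)^2, so f'(x) = 2 (x - 3); the minimum is at x = 3.
    let df = |x: f64| 2.0 * (x - 3.0);

    // A reasonable step size converges well before the cap.
    let good = gradient_descent(df, 0.0, 0.1, 1e-8, 1000);
    // A tiny step size makes so little progress that the cap is hit.
    let slow = gradient_descent(df, 0.0, 1e-6, 1e-8, 1000);

    println!("good: x={} iterations={} converged={}", good.x, good.iterations, good.converged);
    println!("slow: x={} iterations={} converged={}", slow.x, slow.iterations, slow.converged);
}
```

Returning a `converged` flag alongside the result lets a caller tell a genuine solution apart from values that are simply whatever the loop held when the cap was reached.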

Mec-iS avatar Sep 13 '22 09:09 Mec-iS

This works with the new branch v0.5-wip. It is still a little slow, but much better.

We should probably move this into something like "Optimize Linear Regression performance".

Mec-iS avatar Oct 19 '22 14:10 Mec-iS

@levkk you can see your example working at https://github.com/smartcorelib/smartcore-jupyter/blob/main/notebooks/99-Test.ipynb

Mec-iS avatar Oct 19 '22 14:10 Mec-iS