
Inconsistent coefficient values

Open matwroblewski opened this issue 10 months ago • 2 comments

I am using glum 2.5.1 and the latest version of tabmat (3.1.14). Following the example from your Git repository, I fit the model twice. Surprisingly, I obtained slightly different coefficient values each time, with differences starting at the 14th decimal place.

To rule out randomness, I repeated the test with your example and set the 'random_state' parameter. I expected stable results, but the discrepancies persist.

from sklearn.datasets import fetch_openml
from glum import GeneralizedLinearRegressor

house_data = fetch_openml(name="house_sales", version=3, as_frame=True)

X = house_data.data[
    [
        "bedrooms",
        "bathrooms",
        "sqft_living",
        "floors",
        "waterfront",
        "view",
        "condition",
        "grade",
        "yr_built",
        "yr_renovated",
    ]
].copy()

price = house_data.target
y = (price < price.median()).values.astype(int)
model = GeneralizedLinearRegressor(
    family='binomial',
    l1_ratio=1.0,
    alpha=0.001,
    random_state=1,
)

model.fit(X=X, y=y)

model.coef_[1]
# 1st fit: -0.49335439989864244
# 2nd fit: -0.49335439989865093

The differences occur with both the irls-cd and irls-ls solvers.

I would greatly appreciate it if you could investigate this issue further.

matwroblewski avatar Apr 18 '24 06:04 matwroblewski

This is due to OpenMP:

(glum) ➜  ~/glum git:(main) ✗ python issue_785.py 
-0.493354399898778
-0.4933543998987746
(glum) ➜  ~/glum git:(main) ✗ OMP_NUM_THREADS=1 python issue_785.py
-0.49335439989877533
-0.49335439989877533

When you set OMP_NUM_THREADS=1, you'll get consistent results.
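
Why the thread count can matter: floating-point addition is not associative, so a sum that is split into partial sums and then combined can differ in the last digits from the same sum computed sequentially. The sketch below is only an illustration of that effect in plain Python. The helper names `sequential_sum` and `split_sum` are hypothetical, and the assumption that glum's or tabmat's OpenMP code paths behave this way is not confirmed in this thread beyond "this is due to OpenMP".

```python
def sequential_sum(values):
    """Add the values one by one, left to right (like a single thread)."""
    total = 0.0
    for v in values:
        total += v
    return total


def split_sum(values, split):
    """Sum two chunks separately and combine them.

    This mimics a reduction over two workers; `split` is a hypothetical
    parameter marking where the second chunk starts.
    """
    return sequential_sum(values[:split]) + sequential_sum(values[split:])


values = [0.1, 0.2, 0.3]

one_chunk = sequential_sum(values)   # (0.1 + 0.2) + 0.3
two_chunks = split_sum(values, 1)    # 0.1 + (0.2 + 0.3)

print(one_chunk)                # 0.6000000000000001
print(two_chunks)               # 0.6
print(one_chunk == two_chunks)  # False
```

With a single thread the grouping is fixed, which is consistent with the identical values seen under OMP_NUM_THREADS=1.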

https://github.com/Quantco/tabmat/pull/348 addressed this for products involving a CategoricalMatrix, but apparently that wasn't the only place in our code base where we run into this issue.

jtilly avatar Apr 18 '24 07:04 jtilly

Thanks for your answer. Adding an environment variable solved the problem. However, I noticed that I get different results on Windows and Ubuntu. Is this normal behavior?

model.coef_[1]
# -0.4933543998986485 Ubuntu
# -0.4933543998986456 Windows

matwroblewski avatar Apr 25 '24 19:04 matwroblewski

Floating-point values that differ between systems from about the 16th digit onward are to be expected; see, e.g., here.
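
If the goal is to check that two fits agree, a tolerance-based comparison is usually more appropriate than exact equality. Here is a minimal sketch using only the coefficient values quoted in this thread. The tolerance of 1e-12 is an assumption chosen for illustration, not a value recommended by glum.

```python
import math

# Values of model.coef_[1] reported in this thread.
coef_ubuntu = -0.4933543998986485
coef_windows = -0.4933543998986456
coef_first_fit = -0.49335439989864244
coef_second_fit = -0.49335439989865093

# Illustrative relative tolerance (an assumption; pick one that suits your use case).
REL_TOL = 1e-12

# Exact equality fails, as reported above.
print(coef_ubuntu == coef_windows)        # False
print(coef_first_fit == coef_second_fit)  # False

# A relative-tolerance comparison treats the values as equal.
print(math.isclose(coef_ubuntu, coef_windows, rel_tol=REL_TOL))        # True
print(math.isclose(coef_first_fit, coef_second_fit, rel_tol=REL_TOL))  # True
```

For whole coefficient arrays, `numpy.allclose` or `numpy.testing.assert_allclose` play the same role.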