glum
Inconsistent coefficient values
I currently have version 2.5.1 of Glum and the latest version of Tabmat (3.1.14). Following your example from the Git repository, I retrained the model twice. Surprisingly, each time I obtained slightly different coefficient values, with changes appearing from the 14th decimal place.
In an attempt to ensure consistency, I conducted a similar test using your example while including the 'random_state' parameter. Despite my expectations for stable results, discrepancies persist.
from sklearn.datasets import fetch_openml
from glum import GeneralizedLinearRegressor
house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
X = house_data.data[
    [
        "bedrooms",
        "bathrooms",
        "sqft_living",
        "floors",
        "waterfront",
        "view",
        "condition",
        "grade",
        "yr_built",
        "yr_renovated",
    ]
].copy()
price = house_data.target
y = (price < price.median()).values.astype(int)
model = GeneralizedLinearRegressor(
    family='binomial',
    l1_ratio=1.0,
    alpha=0.001,
    random_state=1,
)
model.fit(X=X, y=y)
model.coef_[1]
# 1st fit: -0.49335439989864244
# 2nd fit: -0.49335439989865093
Differences occur for both the irls-cd and irls-ls solvers.
I would greatly appreciate it if you could investigate this issue further.
This is due to OpenMP:
(glum) ➜ ~/glum git:(main) ✗ python issue_785.py
-0.493354399898778
-0.4933543998987746
(glum) ➜ ~/glum git:(main) ✗ OMP_NUM_THREADS=1 python issue_785.py
-0.49335439989877533
-0.49335439989877533
When you set OMP_NUM_THREADS=1, you'll get consistent results.
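If exporting the variable in the shell is inconvenient, it can also be set from Python. This is a minimal sketch, assuming the OpenMP runtime reads OMP_NUM_THREADS when it is first loaded, so the variable has to be set before glum or tabmat is imported. The helper name pin_openmp_threads is hypothetical and not part of glum.

```python
import os


def pin_openmp_threads(n_threads: int = 1) -> None:
    """Hypothetical helper: restrict OpenMP to n_threads for this process."""
    # Assumption: the OpenMP runtime picks this up when it is first loaded,
    # so call this before importing glum or tabmat.
    os.environ["OMP_NUM_THREADS"] = str(n_threads)


pin_openmp_threads(1)

# Import glum only after the variable is set, e.g.:
# from glum import GeneralizedLinearRegressor
```

Running single-threaded trades speed for reproducibility, so it may only be worth doing where bit-identical coefficients matter.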
https://github.com/Quantco/tabmat/pull/348 addressed this for products involving a CategoricalMatrix, but apparently that wasn't the only place where we're running into this issue in our code base.
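For background on why threading can change the last digits: floating-point addition is not associative, so a sum whose partial results are combined in a different order can round differently. The snippet below is only an illustration in plain Python, not glum or tabmat code. The two groupings stand in for a serial reduction and one where a second thread has already summed part of the data.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# Serial, left-to-right accumulation.
serial = (a + b) + c

# Same numbers, but (b + c) was reduced separately first,
# as could happen when the work is split across threads.
regrouped = a + (b + c)

print(serial)     # 0.6000000000000001
print(regrouped)  # 0.6
print(serial == regrouped)              # False
print(math.isclose(serial, regrouped))  # True
```

With a multi-threaded reduction the grouping can vary from run to run, which would be consistent with the fits above differing only from around the 14th decimal place.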
Thanks for your answer. Adding an environment variable solved the problem. However, I noticed that I get different results on Windows and Ubuntu. Is this normal behavior?
model.coef_[1]
# -0.4933543998986485 Ubuntu
# -0.4933543998986456 Windows
Different floating-point values on different systems from about the 16th significant digit on should be expected, see, e.g., here.
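In practice this means coefficients from different machines are better compared with a tolerance than with ==. A small sketch using the two values reported above; the tolerance of 1e-12 is an arbitrary choice for illustration, not a glum recommendation.

```python
import math

# Values of model.coef_[1] reported in this thread.
coef_ubuntu = -0.4933543998986485
coef_windows = -0.4933543998986456

# Bitwise equality fails because the last digits differ.
exactly_equal = coef_ubuntu == coef_windows

# A relative tolerance far above machine epsilon treats them as the same.
close_enough = math.isclose(coef_ubuntu, coef_windows, rel_tol=1e-12)

print(exactly_equal)  # False
print(close_enough)   # True
```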