Predictions after deserializing are inconsistent on different machines
How are you using LightGBM?
- Python package
Environment info
Operating System: Deep Learning AMI (Ubuntu 16.04) Version 25.3 (AWS) + Ubuntu 18.04 (PC)
Python version: Python 3.6.5 :: Anaconda, Inc. (AWS) + Python 3.6.6 :: Anaconda, Inc. (PC)
LightGBM version or commit hash: 2.3.1 (both environments)
Error message and / or logs
I'm getting completely different results when using the same model in two environments (an AWS instance and my home PC). I'm running the same code snippet in both environments (see below). The problem is very similar to what is described in https://github.com/microsoft/LightGBM/issues/2449
Reproducible example(s)
import json
import pathlib
import pandas as pd
import lightgbm as lgb

fp = pathlib.Path('models/Y_AMOUNT_ORDERS_SIGNUP365DAY/08_21__08_44')

# Feature column names saved alongside the model
with open(fp / 'cols.json') as f:
    cols = json.load(f)

# Restore the booster from its plain-text dump
booster = lgb.Booster(model_file=str(fp / 'model.txt'))

df = pd.read_csv('models/sample1.csv')
scores = booster.predict(df[cols])
print(scores[:20])
Steps to reproduce
1. Run the above code on both machines
2. Results on the AWS instance:
[ 0.00000000e+00 2.85494050e+02 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 -6.76685568e-01 0.00000000e+00
1.15639843e-01 0.00000000e+00 0.00000000e+00 3.49978851e+00
0.00000000e+00 0.00000000e+00 2.52037043e+01 0.00000000e+00]
These results are correct. This instance is where I trained the model.
3. Results on the home PC:
[ 0. 130. 0. 0. 0. 0. 0. 0. 0. 0. 8. 0.
-80. 0. 0. -154. 0. 0. 3. 0.]
These results are wrong. They are also 'rounded': although the dtype is float64, every value is an integer, with no non-zero digits after the decimal point.
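For completeness, here is a diagnostic sketch that can be run on both machines to compare the environments. The locale check is based on an assumption (suggested by the linked issue) that parsing of the text model file may be locale-sensitive; the hash check simply confirms both machines read identical bytes:

import hashlib
import locale
import lightgbm as lgb

# Library version and numeric locale, both of which could plausibly
# affect how the text model file is parsed
print('LightGBM version:', lgb.__version__)
print('LC_NUMERIC locale:', locale.getlocale(locale.LC_NUMERIC))

# Hash the model file to confirm both machines load identical bytes
with open('models/Y_AMOUNT_ORDERS_SIGNUP365DAY/08_21__08_44/model.txt', 'rb') as f:
    print('model.txt md5:', hashlib.md5(f.read()).hexdigest())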
Why model.txt and not model.pkl? Have you tried the trusted joblib.dump() / joblib.load() pair for persisting model objects (instead of plain text)?
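For reference, a minimal sketch of that approach; the data and training call below are placeholders standing in for the original pipeline, which is not included in this report:

import joblib
import numpy as np
import lightgbm as lgb

# Placeholder data and training call, not the reporter's actual model
X = np.random.rand(100, 4)
y = np.random.rand(100)
booster = lgb.train({'objective': 'regression'}, lgb.Dataset(X, label=y))

# Persist the Booster object itself instead of a plain-text dump
joblib.dump(booster, 'model.pkl')

# ...later, possibly on a different machine
restored = joblib.load('model.pkl')
scores = restored.predict(X)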
Closing this due to lack of response.