
Predictions after deserializing are inconsistent on different machines

Open ubbikk opened this issue 5 years ago • 1 comments

How are you using LightGBM?

  • Python package

Environment info

Operating System: Deep Learning AMI (Ubuntu 16.04) Version 25.3(AWS) + Ubuntu 18.04(PC)

CPU/GPU model:

C++ compiler version:

CMake version:

Java version:

Python version: Python 3.6.5 :: Anaconda, Inc. (AWS) + Python 3.6.6 :: Anaconda, Inc. (PC)

R version:

Other:

LightGBM version or commit hash: 2.3.1 (both environments)

Error message and / or logs

I'm getting completely different results using the same model in different environments (an AWS instance and my home PC). I'm using the same code snippet in both environments (see below). The problem is very similar to what is described in https://github.com/microsoft/LightGBM/issues/2449.

Reproducible example(s)

import json
import pathlib

import pandas as pd
import lightgbm as lgb

fp = pathlib.Path('models/Y_AMOUNT_ORDERS_SIGNUP365DAY/08_21__08_44')
cols = json.load(open(fp / 'cols.json'))  # feature column names saved at training time
booster = lgb.Booster(model_file=str(fp / 'model.txt'))
df = pd.read_csv('models/sample1.csv')
scores = booster.predict(df[cols])

print(scores[:20])

Steps to reproduce

1. Run the above code on two machines.

2. Results on the AWS instance:
[ 0.00000000e+00  2.85494050e+02  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00 -6.76685568e-01  0.00000000e+00
  1.15639843e-01  0.00000000e+00  0.00000000e+00  3.49978851e+00
  0.00000000e+00  0.00000000e+00  2.52037043e+01  0.00000000e+00]

The results are OK; this instance is where I trained the model.

3. Results on the home PC:

[   0.  130.    0.    0.    0.    0.    0.    0.    0.    0.    8.    0.
  -80.    0.    0. -154.    0.    0.    3.    0.]

The results are strange. They are also 'rounded': although the dtype is float64, every value is an integer, with no non-zero digits after the decimal point.
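A minimal diagnostic sketch (not part of the original report) that could help narrow this down: it fingerprints the model file and compares library versions and parsed input dtypes on each machine, so a corrupted transfer or a data-parsing difference can be ruled out before blaming deserialization. The paths are the ones from the snippet above.

import hashlib
import json
import pathlib

import pandas as pd
import lightgbm as lgb

fp = pathlib.Path('models/Y_AMOUNT_ORDERS_SIGNUP365DAY/08_21__08_44')

# Fingerprint the model file: if the hashes differ between machines,
# the file changed in transfer rather than in LightGBM itself.
print(hashlib.md5((fp / 'model.txt').read_bytes()).hexdigest())

# Library versions and parsed input dtypes should match on both machines.
print(lgb.__version__, pd.__version__)

cols = json.load(open(fp / 'cols.json'))
df = pd.read_csv('models/sample1.csv')
print(df[cols].dtypes)

# Sanity-check that the loaded model has the expected number of trees.
booster = lgb.Booster(model_file=str(fp / 'model.txt'))
print(booster.num_trees())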

ubbikk (Aug 21 '20 10:08)

Why model.txt and not model.pkl? Have you tried the trusted joblib.dump() / joblib.load() pair for persisting model objects (instead of plain text)?
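A minimal sketch of the suggested joblib round trip, assuming the same directory layout as the snippet above; the filename model.pkl is illustrative.

import pathlib
import joblib
import lightgbm as lgb

fp = pathlib.Path('models/Y_AMOUNT_ORDERS_SIGNUP365DAY/08_21__08_44')

# On the training machine: load the existing text dump once, then persist
# the full Python Booster object with joblib.
booster = lgb.Booster(model_file=str(fp / 'model.txt'))
joblib.dump(booster, fp / 'model.pkl')  # 'model.pkl' is an illustrative filename

# On the other machine: restore the object and predict as before.
booster = joblib.load(fp / 'model.pkl')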

mirekphd (Sep 04 '20 09:09)

Closing this due to lack of response.

jameslamb (Jun 22 '24 01:06)