xgboost-predictor-java

Difference in prediction results between xgboost-predictor and Python

Open fenxouxiaoquan opened this issue 6 years ago • 4 comments

Hi, I trained an XGBoost model with Python and saved it as 001.model. As a test, I used this 001.model to predict one sample: Python gives 0.223, but xgboost-predictor gives 0.0604. Same model, same sample, but a different predicted result. I tested another sample and still got a different result; the only consistent thing is that Python always gives the larger score. I have no idea how to deal with this problem. Note that my input is a HashMap, like this: {33 : 1.0, 34 : 1.0, 125 : 0.04261, 185 : 0.01504}

fenxouxiaoquan • Apr 20 '18 10:04

I tried once more and the problem seems to be gone. I just changed the values in the input HashMap from double to float.

fenxouxiaoquan • Apr 20 '18 11:04
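For readers who want to try the same workaround, here is a minimal sketch of narrowing a feature map's values from double to float before handing it to the predictor. The class name `FloatFeatureMap` and the helper `toFloatValues` are hypothetical names made up for illustration; they are not part of xgboost-predictor, and how the resulting map is passed to the library depends on the version in use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, not part of xgboost-predictor.
class FloatFeatureMap {

    // Copies the sparse feature map, narrowing each value to 32-bit float precision.
    static Map<Integer, Float> toFloatValues(Map<Integer, Double> features) {
        Map<Integer, Float> narrowed = new HashMap<>();
        for (Map.Entry<Integer, Double> entry : features.entrySet()) {
            narrowed.put(entry.getKey(), entry.getValue().floatValue());
        }
        return narrowed;
    }

    public static void main(String[] args) {
        // The sample input quoted in the issue.
        Map<Integer, Double> input = new HashMap<>();
        input.put(33, 1.0);
        input.put(34, 1.0);
        input.put(125, 0.04261);
        input.put(185, 0.01504);

        System.out.println(toFloatValues(input));
    }
}
```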

@fenxouxiaoquan Hello, I've actually found the same with standard xgboost4j.

However, when I tried this library, I got practically the same result as in Python (I passed in a float[] array).

In xgboost4j I was passing DMatrix(float[] mydata) as well.

How can this be possible?

hlbkin • Apr 30 '18 23:04

@hlbkin Hello, I didn't quite follow. Do you mean that you get the same prediction in xgboost-predictor and in Python? I got different predictions again today (9.189394587749626E-5 from xgboost-predictor and 0.605 from Python), just like what I experienced about three months ago. This time my input is still a HashMap with nonzero values. Do you have any idea how to deal with this?

fenxouxiaoquan • Jul 18 '18 02:07

I would recommend always passing features as floats. XGBoost is explicit that it treats values as 32-bit floats for performance reasons (one example: dmlc/xgboost#1410). A model trained with XGBoost stores its split values as floats, so giving it doubles may cause inaccurate predictions when a feature value lands very close to a split value.

cpfarrell • Nov 01 '19 19:11
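The float-versus-double effect described in the last comment can be reproduced in plain Java without any XGBoost code. The sketch below assumes a simplified split rule (go left when the feature value is less than the threshold) and uses made-up names (`FloatSplitDemo`, `goesLeftAsDouble`, `goesLeftAsFloat`); it illustrates the rounding issue only and is not the library's implementation. The threshold 0.1f stands in for a split value stored as a float.

```java
// Illustrative only: a simplified split test, not taken from any XGBoost code base.
class FloatSplitDemo {

    // Compares in double precision: the float threshold is widened to double,
    // so the extra digits of the double feature value take part in the comparison.
    static boolean goesLeftAsDouble(double value, float threshold) {
        return value < threshold;
    }

    // Compares in float precision: the feature value is first rounded to float,
    // matching the precision the threshold was stored with.
    static boolean goesLeftAsFloat(double value, float threshold) {
        return (float) value < threshold;
    }

    public static void main(String[] args) {
        float threshold = 0.1f; // split value stored as a 32-bit float
        double feature = 0.1;   // the same decimal value held as a 64-bit double

        // 0.1f is slightly larger than the double 0.1, so the two comparisons disagree.
        System.out.println("double comparison goes left: " + goesLeftAsDouble(feature, threshold)); // true
        System.out.println("float comparison goes left:  " + goesLeftAsFloat(feature, threshold));  // false
    }
}
```

A sample that takes the wrong branch at even one split ends up in a different leaf, which is one way the same model and the same sample can produce noticeably different scores.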