
[BUG] The prediction of objective "multi:softprob" seems incorrect, and the order of the probabilities is inconsistent with MLlib.

Open sonetto19999 opened this issue 1 year ago • 7 comments

multi:softmax output: [screenshot]

multi:softprob output: [screenshot]

These two figures clearly show that the prediction of "multi:softprob" is incorrect: the first three rows should be 0 rather than 1 (just like the prediction of "multi:softmax").

Meanwhile, from what I recall, MLlib orders the output probabilities to match the label values in increasing order, which does not seem to be the case in XGBoost (see the sketch after this comment for the MLlib convention).

sonetto19999 avatar Jul 30 '24 05:07 sonetto19999
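
For reference, the MLlib convention mentioned above: Spark ML expects labels 0, 1, ..., numClasses - 1, and element i of the probability vector is the probability of label i. A minimal sketch, not part of the original report (toy data and column names are illustrative only):

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("mllib-prob-order").getOrCreate()
import spark.implicits._

// Spark ML expects labels 0..numClasses-1; probability(i) is P(label == i).
val train = Seq(
  (0.0, Vectors.dense(0.0, 1.0)),
  (1.0, Vectors.dense(1.0, 0.0)),
  (2.0, Vectors.dense(1.0, 1.0))
).toDF("label", "features")

val model = new LogisticRegression().setMaxIter(20).fit(train)

// "prediction" should equal the argmax of the "probability" vector, and the
// vector positions line up with the label values in increasing order.
model.transform(train).select("label", "probability", "prediction").show(false)
```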

cc @wbo4958

trivialfis avatar Jul 30 '24 08:07 trivialfis

Can anyone help?

sonetto19999 avatar Aug 10 '24 09:08 sonetto19999

@sonetto19999 You might expect more help with a minimal reproducible example.

mayer79 avatar Aug 20 '24 06:08 mayer79

Sorry for the delay, I just saw this issue. That's bad: it seems the prediction values of softprob are the reverse of softmax.

Hi @sonetto19999, were you using xgboost4j-spark or XGBoost PySpark?

wbo4958 avatar Aug 22 '24 23:08 wbo4958

@wbo4958 currently using xgboost4j-spark_2.12

sonetto19999 avatar Aug 23 '24 03:08 sonetto19999

I can repro this issue when setting objective = multi:softmax or multi:softprob and num_class = 2 for binary classification (a sketch of this setup follows below).

Hi @trivialfis, is this scenario allowed in xgboost?

wbo4958 avatar Aug 23 '24 08:08 wbo4958
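
A minimal sketch of that setup with xgboost4j-spark, comparing the two objectives on the same toy data; the data and the parameter values other than objective and num_class are placeholders, not taken from this issue:

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("softprob-vs-softmax").getOrCreate()
import spark.implicits._

val train = Seq(
  (0.0, Vectors.dense(0.0, 0.1)),
  (0.0, Vectors.dense(0.1, 0.0)),
  (1.0, Vectors.dense(0.9, 1.0)),
  (1.0, Vectors.dense(1.0, 0.9))
).toDF("label", "features")

def fitAndShow(objective: String): Unit = {
  val clf = new XGBoostClassifier(Map(
    "objective"   -> objective,
    "num_class"   -> 2,
    "num_round"   -> 10,
    "num_workers" -> 1
  )).setFeaturesCol("features").setLabelCol("label")

  // If softprob is consistent with softmax, "prediction" should match across
  // the two runs and equal the argmax of the "probability" vector.
  clf.fit(train).transform(train)
    .select("label", "probability", "prediction")
    .show(false)
}

fitAndShow("multi:softmax")
fitAndShow("multi:softprob")
```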

If it's binary classification, it should use binary:logistic instead of softmax.

trivialfis avatar Aug 26 '24 17:08 trivialfis
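
For completeness, a sketch of the recommended setup for a two-class problem, which drops num_class and uses binary:logistic; it reuses the imports and the toy `train` DataFrame from the sketch above (illustrative only):

```scala
// Reusing the imports and the toy `train` DataFrame from the sketch above.
val binClf = new XGBoostClassifier(Map(
  "objective"   -> "binary:logistic",   // no num_class needed for two classes
  "num_round"   -> 10,
  "num_workers" -> 1
)).setFeaturesCol("features").setLabelCol("label")

// probability(1) is P(label == 1); "prediction" is the thresholded class.
binClf.fit(train).transform(train)
  .select("label", "probability", "prediction")
  .show(false)
```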