How does lightgbm calculate the initial score for multiclass models?
For each tree in the multiclass model, I assumed that lightgbm would start boosting from logit(% of total classes). It appears to work this way for a binary model. However it doesn't look like this is the case for multiclass models:
import lightgbm as lgb
import numpy as np
features = np.random.uniform(0,1,(1000,10))
target = np.random.binomial(1,0.1,1000)
dtrain = lgb.Dataset(features, target)
pars = {"objective": "binary", "verbose": 2}
model = lgb.train(params=pars, train_set=dtrain, num_boost_round=1)
def logit(x):
    return np.log(x / (1 - x))
# Initial score is the logit of the average target:
print(logit(target.mean()))
Prints:
...
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.102000 -> initscore=-2.175197
...
-2.1751972550179284
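As a sanity check, the printed initscore can be reproduced directly from the log line (a quick sketch; `pavg` is taken from the printout above):

```python
import numpy as np

# pavg from the LightGBM log line: pavg=0.102000 -> initscore=-2.175197
pavg = 0.102
initscore = np.log(pavg / (1 - pavg))  # logit of the positive-class rate
print(initscore)  # ≈ -2.175197
```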
As you can see, the initial score is the logit of the average target. However, in the multiclass case:
target = np.random.randint(0,5,1000)
dtrain = lgb.Dataset(features, target)
pars = {"objective": "multiclass", "verbose": 2, "num_classes": 5}
model = lgb.train(params=pars, train_set=dtrain, num_boost_round=1)
class_percent = (np.arange(5).reshape(-1,1) == target).astype("int8").mean(1)
# Not equal to the initial scores printed above:
print(logit(class_percent))
This prints:
...
[LightGBM] [Info] Start training from score -1.682009
[LightGBM] [Info] Start training from score -1.514128
[LightGBM] [Info] Start training from score -1.650260
[LightGBM] [Info] Start training from score -1.551169
[LightGBM] [Info] Start training from score -1.660731
...
[-1.47621369 -1.26566637 -1.43706669 -1.31291182 -1.45001018]
In more imbalanced cases, the multiclass model's initial scores can differ substantially from logit(% of total classes). How are they calculated?
I initially thought, since multiclass uses the softmax, that the initial scores might be the result of applying softmax to logit(% of total classes) for each class, but that doesn't seem to be the case either. The initial scores in the LightGBM printout are always lower than my theoretical calculation, at least in the cases I've seen.
Hi @AnotherSamWilson, thank you for your interest in LightGBM. For the multiclass objective, the init score is the log of the class proportions (https://github.com/microsoft/LightGBM/blob/c7102e56b246cc5cd73d9787b2c837c0bc384d1e/src/objective/multiclass_objective.hpp#L155-L157), so in your example these should match the output of np.log(class_percent).
Please let us know if you have further doubts.
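This is easy to verify numerically. The sketch below uses synthetic labels (not the exact arrays from the question) and checks that exponentiating the log-proportion init scores recovers the class proportions:

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic labels for illustration
target = rng.integers(0, 5, 1000)

# Per-class proportions, then init scores as log of those proportions
class_percent = np.bincount(target, minlength=5) / len(target)
init_scores = np.log(class_percent)

print(init_scores)          # all negative, near log(0.2) ≈ -1.609
print(np.exp(init_scores))  # recovers class_percent
```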
Interesting. From my understanding, the multiclass procedure builds one tree per class and then takes the softmax of their outputs, so this behavior seems strange. For classes with a large share of the data, this makes the initial score very different from the desired final average output:
import numpy as np
def logit(probability):
    odds_ratio = probability / (1 - probability)
    log_odds = np.log(odds_ratio)
    return log_odds
class_percent = np.array([0.01, 0.1, 0.89])
print(np.log(class_percent))  # ≈ [-4.61, -2.30, -0.117]
print(logit(class_percent))   # ≈ [-4.60, -2.20,  2.09]
In this case, a class with 89% of the data would have an initial score of -0.1165, when that class should end with an average score around 2.0907 after some boosting rounds. Is this a bug, or desired behavior because of some property I am not familiar with?
The init score is defined as the raw score (before any transformation like softmax) so the actual predictions start on the class proportions, i.e.
softmax(init_score)
= softmax(log(class_proportions))
= exp(log(class_proportions)) / sum(exp(log(class_proportions)))
= class_proportions / sum(class_proportions)
= class_proportions
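The identity above can be checked numerically, here with the 3-class proportions from the earlier comment (a minimal sketch, assuming the standard softmax definition):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift by max for numerical stability
    return e / e.sum()

class_proportions = np.array([0.01, 0.1, 0.89])
init_score = np.log(class_proportions)

# softmax(log(p)) == p when p sums to 1
print(softmax(init_score))  # ≈ [0.01, 0.1, 0.89]
```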
This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.