feval function for LGBMTuner

Open Scarlet3101 opened this issue 1 year ago • 2 comments

Can you add a feval function like in lightgbm.train?

It is the customized evaluation function. Each evaluation function should accept two parameters: preds, eval_data, and return (eval_name, eval_result, is_higher_better) or list of such tuples
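
A minimal sketch of that contract, for reference. The metric names and the `FakeDataset` stand-in are hypothetical and exist only so the example runs without LightGBM installed. A real `lgb.Dataset` exposes `get_label()` in the same way.

```python
import math


class FakeDataset:
    """Hypothetical stand-in for lgb.Dataset; only get_label() is mimicked."""

    def __init__(self, labels):
        self._labels = list(labels)

    def get_label(self):
        return self._labels


def mae_and_rmse(preds, eval_data):
    """Custom evaluation function following the feval contract.

    Returns a list of (eval_name, eval_result, is_higher_better) tuples.
    """
    labels = eval_data.get_label()
    errors = [p - y for p, y in zip(preds, labels)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Both metrics are errors, so lower is better for each of them.
    return [("custom_mae", mae, False), ("custom_rmse", rmse, False)]


results = mae_and_rmse([1.0, 2.0, 4.0], FakeDataset([1.0, 3.0, 2.0]))
```

Returning a single tuple instead of a list follows the same shape; the list form is only needed when several metrics are reported at once.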

Scarlet3101 avatar Sep 01 '23 10:09 Scarlet3101

Hi. This can be done and I'll gladly implement feval function into LGBMTuner, but your proposal has some unknowns:

If you want to pass your own evaluation function and expect LGBMTuner to know is_higher_better for it, then the function you pass has to carry this information itself.

E.g. you define your eval func as follows:

def error(true, pred):
    return np.mean(true - pred)

And then you pass it to LGBMTuner as an init argument. Where would is_higher_better come from for this function?

I can read the .__name__ attribute to get the name of the function you passed. But there is no way for me to know whether higher or lower is better for this custom function.

Would the feval functionality without the is_higher_better attribute be sufficient?
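
One possible way around this unknown, sketched under assumptions: the caller states the direction explicitly when handing over the metric, and a small adapter turns a plain `(true, pred)` function into the `(preds, eval_data)` form that lightgbm.train expects. `make_feval` and `FakeDataset` are hypothetical names used for illustration; neither is part of verstack or LightGBM.

```python
class FakeDataset:
    """Hypothetical stand-in for lgb.Dataset; only get_label() is mimicked."""

    def __init__(self, labels):
        self._labels = list(labels)

    def get_label(self):
        return self._labels


def make_feval(metric, is_higher_better, name=None):
    """Wrap metric(true, pred) -> float into feval(preds, eval_data) -> tuple."""
    # Fall back to the function's own name when no display name is given.
    eval_name = name or metric.__name__

    def feval(preds, eval_data):
        true = eval_data.get_label()
        return eval_name, metric(true, preds), is_higher_better

    return feval


def mean_abs_error(true, pred):
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


# The direction is supplied by the caller, not inferred from the function.
feval = make_feval(mean_abs_error, is_higher_better=False)
result = feval([1.0, 2.5], FakeDataset([1.0, 2.0]))
```

With this shape the name still comes from `.__name__`, while the direction is an explicit argument rather than something the tuner has to guess.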

DanilZherebtsov avatar Dec 04 '23 14:12 DanilZherebtsov

@DanilZherebtsov Hi!

For example, I have a training method for the lightgbm model:

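import numpy as np
import lightgbm as lgb
from verstack.stratified_continuous_split import scsplit

# `data` (a DataFrame) and `categorical_cols` (a list of column names) are defined elsewhere
y = data['price_log']
X = data.drop(columns=['price', 'price_log'])
X_train, X_test, y_train, y_test = scsplit(X, y, stratify=y, test_size=0.3, random_state=5)

# Define the datasets
train_dataset = lgb.Dataset(X_train, y_train, categorical_feature=categorical_cols)
test_dataset = lgb.Dataset(X_test, y_test, reference=train_dataset)
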
def custom_mape(preds, tests):
    actuals = np.exp(tests.get_label()) if isinstance(tests, lgb.Dataset) else tests
    error = ((abs(actuals - np.exp(preds)))/actuals).sum()/actuals.shape[0]
    is_higher_better = False
    return "custom_mape", error, is_higher_better

params ={'learning_rate': 1,
 'num_leaves': 1,
 'feature_fraction': 1,
 'bagging_fraction': 1,
 'verbosity': -1,
 'random_state': 42,
 'device_type': 'cpu',
 'objective': 'regression',
 'metric': 'mape',
 'num_threads': 1,
 'num_iterations': 1,
 'early_stopping_rounds': 1}


booster = lgb.train( 
    params=params, # Get hyper parameters
    train_set=train_dataset, 
    valid_sets=[train_dataset, test_dataset], 
    valid_names=['train', 'valid'],
    verbose_eval=False,
    early_stopping_rounds=params['early_stopping_rounds'], 
    feval=custom_mape, # Call the custom MAPE
)

As you can see, my target is the logarithm of the price. To evaluate training properly, I needed to write my own evaluation function that converts predictions and labels back to the price scale:

def custom_mape(preds, tests):
    actuals = np.exp(tests.get_label()) if isinstance(tests, lgb.Dataset) else tests
    error = ((abs(actuals - np.exp(preds)))/actuals).sum()/actuals.shape[0]
    is_higher_better = False
    return "custom_mape", error, is_higher_better
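
To illustrate why this matters, here is a small worked sketch with made-up numbers (the prices and predictions are hypothetical, not taken from the dataset above). It compares MAPE computed directly on the log target with MAPE computed after applying exp to both sides, which is what custom_mape does.

```python
import math


def mape(actuals, preds):
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    return sum(abs(a - p) / a for a, p in zip(actuals, preds)) / len(actuals)


# Hypothetical true prices and model outputs on the log scale.
prices = [100.0, 200.0]
log_prices = [math.log(p) for p in prices]
log_preds = [math.log(110.0), math.log(180.0)]

# MAPE measured on the log-transformed target.
mape_log_scale = mape(log_prices, log_preds)

# MAPE measured on the original price scale after back-transforming.
mape_price_scale = mape(prices, [math.exp(p) for p in log_preds])
```

In this example each prediction is off by 10% in price terms, but the error measured on the log scale comes out much smaller, so a metric computed on the log target would understate the error in price terms.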

This function is then passed to LightGBM itself through the feval parameter:

booster = lgb.train( 
    params=params, # Get hyper parameters
    train_set=train_dataset, 
    valid_sets=[train_dataset, test_dataset], 
    valid_names=['train', 'valid'],
    verbose_eval=False,
    early_stopping_rounds=params['early_stopping_rounds'], 
    feval=custom_mape, # Call the custom MAPE
)

The question is: is it possible to do something similar in verstack?

Scarlet3101 avatar Jan 16 '24 06:01 Scarlet3101