xgboost
Add custom logger
Hi. Is there any way to attach a logging.Logger object to a model, so that all the printed output goes to the logger instead of just the screen?
Not without some hacking at the moment; see https://github.com/dmlc/xgboost/blob/912e341d575f107be1cc2631271fd0737b75dfba/python-package/xgboost/core.py#L231.
You can use a callback to log the evaluation process. Here is my hacky approach:
# define your logger object first
import logging
import sys
import xgboost as xgb
# clear previous logging configuration
for handler in logging.root.handlers[:]:
    logging.root.removeHandler(handler)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s | %(levelname)s | %(message)s')
# send INFO and above to stdout
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setLevel(logging.INFO)
stdout_handler.setFormatter(formatter)
# and to a log file as well
file_handler = logging.FileHandler('xgb_optimize.log')  # the log file name
file_handler.setLevel(logging.INFO)
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
logger.addHandler(stdout_handler)
# define a logging callback on top of the xgb callback API
# (the logger object is defined above)
class XGBLogging(xgb.callback.TrainingCallback):
    """Log evaluation results to the logger every few boosting rounds."""
    def __init__(self, epoch_log_interval=100):
        self.epoch_log_interval = epoch_log_interval
    def after_iteration(self, model, epoch: int, evals_log: xgb.callback.TrainingCallback.EvalsLog) -> bool:
        if epoch % self.epoch_log_interval == 0:
            for data, metric in evals_log.items():
                for metric_name, log in metric.items():
                    # the latest entry may be a tuple; if so, log its first element
                    score = log[-1][0] if isinstance(log[-1], tuple) else log[-1]
                    logger.info(f"XGBLogging epoch {epoch} dataset {data} {metric_name} {score}")
        # False to indicate training should not stop.
        return False
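# apply the callback when calling xgb.train
output = xgb.train(
    params=param,
    dtrain=dtrain,
    num_boost_round=500,
    # evals must be non-empty: early stopping and evals_log both depend on it
    # (ideally include a held-out validation DMatrix here as well)
    evals=[(dtrain, 'train')],
    custom_metric=your_own_eval_metric_func,
    early_stopping_rounds=50,
    callbacks=[XGBLogging(epoch_log_interval=5)],
    verbose_eval=True,
)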
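To make the inner loop of the callback easier to follow, here is a small standalone sketch of the structure it walks. It assumes evals_log is a nested dict of the form {dataset name: {metric name: [score, ...]}}, where each score is either a float or, as I understand it for xgb.cv, a (mean, std) tuple. The helper name latest_scores and the numbers in example_log are made up for illustration; nothing here calls xgboost.

```python
from typing import Dict, List, Tuple, Union

# Assumed shape of the history passed to after_iteration (illustrative aliases).
Score = Union[float, Tuple[float, float]]
EvalsLog = Dict[str, Dict[str, List[Score]]]


def latest_scores(evals_log: EvalsLog) -> List[Tuple[str, str, float]]:
    """Return (dataset, metric, latest score) for every tracked metric."""
    rows = []
    for data_name, metrics in evals_log.items():
        for metric_name, history in metrics.items():
            last = history[-1]
            # A tuple entry is treated as (mean, std); keep only the mean.
            score = last[0] if isinstance(last, tuple) else last
            rows.append((data_name, metric_name, score))
    return rows


# Hand-written history in the assumed shape (values are made up).
example_log = {
    "train": {"rmse": [0.52, 0.41]},
    "valid": {"rmse": [(0.60, 0.02), (0.55, 0.01)]},
}
for row in latest_scores(example_log):
    print(row)
```

Each returned row corresponds to one logger.info call in the callback above.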
The problem with this hack is that if I use distributed training with dask.DaskDMatrix instead of the native xgboost DMatrix, the log is NOT saved. @trivialfis, I'm not sure why.
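One possible explanation, which is an assumption and not something confirmed in this thread: with Dask the training runs in worker processes, and the handlers attached to the logger in the client process do not exist there, so INFO records may simply be dropped. A workaround worth trying is to configure the logger lazily, inside the process that actually emits the records. The sketch below uses only the standard library; get_file_logger and the logger name "xgb_worker" are hypothetical names, and the idea would be to call it from inside after_iteration instead of relying on a module-level logger.

```python
import logging
import os
import tempfile


def get_file_logger(name: str, path: str) -> logging.Logger:
    """Return a logger that writes to `path`, attaching the handler only once."""
    log = logging.getLogger(name)
    log.setLevel(logging.INFO)
    target = os.path.abspath(path)
    # FileHandler stores the absolute path in baseFilename, so compare against that.
    already_attached = any(
        isinstance(h, logging.FileHandler) and h.baseFilename == target
        for h in log.handlers
    )
    if not already_attached:
        handler = logging.FileHandler(target)
        handler.setFormatter(
            logging.Formatter("%(asctime)s | %(levelname)s | %(message)s")
        )
        log.addHandler(handler)
    return log


# Calling the helper repeatedly (e.g. once per logged epoch) adds no duplicate handlers.
log_path = os.path.join(tempfile.mkdtemp(), "xgb_optimize.log")
get_file_logger("xgb_worker", log_path).info("epoch 0 dataset train rmse 0.52")
get_file_logger("xgb_worker", log_path).info("epoch 5 dataset train rmse 0.41")
```

Note that if this is indeed the cause, the log file would be written on whichever machine runs the worker, not necessarily where the client script was started.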