ray_lightning
Cannot checkpoint and log
The documentation says that when using Ray Client, you must disable checkpointing and logging for your Trainer by setting checkpoint_callback and logger to False. So how can we log and save the model during training?
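For reference, this is my understanding of what the documentation asks for (a minimal sketch with hypothetical values; on older PyTorch Lightning versions the kwarg is checkpoint_callback=False instead of enable_checkpointing=False):

import pytorch_lightning as pl
from ray_lightning import RayStrategy

# Built-in checkpointing and logging disabled, as the Ray Client docs require
trainer = pl.Trainer(
    max_epochs=5,  # hypothetical value
    enable_checkpointing=False,
    logger=False,
    strategy=RayStrategy(num_workers=1, num_cpus_per_worker=1, use_gpu=True),
)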
I have been doing this:
- Import TuneReportCheckpointCallback from ray_lightning.tune:
from ray_lightning.tune import TuneReportCheckpointCallback
- Disable checkpointing with "enable_checkpointing": False in the pl Trainer's configuration
- Initialize logger:
tb_logger = pl_loggers.TensorBoardLogger(save_dir="/tmp/some-dir")
- Initialize the Ray strategy:
from ray_lightning import RayStrategy
strategy = RayStrategy(num_workers=1, num_cpus_per_worker=1, use_gpu=True)
- Initialize trainer:
trainer = pl.Trainer(
**trainer_config,
callbacks=[TuneReportCheckpointCallback({"accuracy": "accuracy"}, on="epoch_end")],
strategy=strategy,
logger=tb_logger
)
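For completeness, here is a minimal sketch of how I would expect these pieces to fit together under Ray Tune, so that TuneReportCheckpointCallback writes checkpoints into each trial's directory while the Trainer's own checkpointing stays disabled. The training function train_mnist, the LightningMNISTClassifier model, and the config values are hypothetical placeholders, and I am assuming get_tune_resources from ray_lightning.tune and tune.get_trial_dir() behave as described in the Ray Tune docs:

import pytorch_lightning as pl
from pytorch_lightning import loggers as pl_loggers
from ray import tune
from ray_lightning import RayStrategy
from ray_lightning.tune import TuneReportCheckpointCallback, get_tune_resources

def train_mnist(config):
    # Hypothetical LightningModule; replace with the real model
    model = LightningMNISTClassifier(config)
    trainer = pl.Trainer(
        max_epochs=config["max_epochs"],
        enable_checkpointing=False,  # Tune's callback saves checkpoints instead
        logger=pl_loggers.TensorBoardLogger(save_dir=tune.get_trial_dir(), name="", version="."),
        callbacks=[TuneReportCheckpointCallback({"accuracy": "accuracy"}, on="epoch_end")],
        strategy=RayStrategy(num_workers=1, num_cpus_per_worker=1, use_gpu=True),
    )
    trainer.fit(model)

analysis = tune.run(
    train_mnist,
    config={"max_epochs": 5},  # hypothetical search space
    resources_per_trial=get_tune_resources(num_workers=1, use_gpu=True),
    local_dir="/tmp/ray_results",  # checkpoints and TensorBoard logs land under here
    metric="accuracy",
    mode="max",
    num_samples=1,
)

What I am unsure about is whether the TensorBoardLogger part of this is actually allowed when running through Ray Client, which is exactly what I would like to clarify.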