sagemaker-debugger
PyTorch tensors are not saved with include_collections=["all"], only with save_all=True
The events produced are empty when the hook uses include_collections=["all"]. The same hook works with save_all=True.
Hook code:
import smdebug.pytorch as smd

# Save at every step and store only the max/min reductions of each tensor
save_config = smd.SaveConfig(save_interval=1)
reduction_config = smd.ReductionConfig(["max", "min"])
hook = smd.Hook(out_dir='...',
                reduction_config=reduction_config,
                save_all=True,
                # include_collections=["all"],  # produces empty events
                export_tensorboard=True,
                save_config=save_config,
                tensorboard_dir='...',
                include_workers="all")
This occurs on all frameworks. @Cpruce you could modify your script to use either of the options below.
Option 1: add include_regex=[".*"] to the Hook above, or
Option 2: explicitly add the pattern with collection_manager.get("all").include(".*")
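Both options come down to giving the "all" collection a catch-all pattern. The toy model below is plain Python, not smdebug code: ToyCollection and the tensor names are made up for illustration, and it assumes a collection only matches tensors whose names match one of its include patterns. Under that assumption it sketches why a collection with no patterns would save nothing, and how adding ".*" changes that.

```python
import re


class ToyCollection:
    """Hypothetical stand-in for a named collection; not the smdebug class."""

    def __init__(self, name):
        self.name = name
        self.include_regex = []  # starts with no patterns

    def include(self, pattern):
        # Register one more regex that tensor names may match
        self.include_regex.append(pattern)

    def matches(self, tensor_name):
        # A tensor belongs to the collection if any pattern matches its name
        return any(re.match(p, tensor_name) for p in self.include_regex)


# Made-up tensor names, for illustration only
tensor_names = ["conv1.weight", "fc.bias", "CrossEntropyLoss_output_0"]

all_collection = ToyCollection("all")
saved_before = [t for t in tensor_names if all_collection.matches(t)]

all_collection.include(".*")  # the workaround from Option 2
saved_after = [t for t in tensor_names if all_collection.matches(t)]

print(saved_before)
print(saved_after)
```

In this model the first list is empty and the second contains every tensor name, which mirrors the empty events seen with include_collections=["all"] alone.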
In the meantime, I'm working on adding a check in core/hook.py that includes .* if include_collections contains "all".
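As a rough idea of what such a check could look like, here is a minimal sketch. It is not the actual change to core/hook.py: resolve_include_regex is a hypothetical helper name, and the sketch assumes the hook ends up with a flat list of regex patterns.

```python
def resolve_include_regex(include_collections, include_regex=None):
    """Hypothetical helper: add ".*" when the "all" collection is requested."""
    patterns = list(include_regex or [])  # copy so the caller's list is untouched
    if "all" in (include_collections or []) and ".*" not in patterns:
        patterns.append(".*")
    return patterns


# "all" requested without a regex: the catch-all pattern is added
print(resolve_include_regex(["all"]))
# Other collections are left alone
print(resolve_include_regex(["losses"], ["relu"]))
```

With a check like this in place, include_collections=["all"] would behave like Option 1 without the user having to pass include_regex explicitly.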