
Run recovery for hyperdrive runs does not work

Open ant0nsc opened this issue 3 years ago • 0 comments

building_models.md says that it is possible to recover a failed Hyperdrive cross-validation run, but this does not work. Run recovery fails when the checkpoints are downloaded, with the following error:

  File "innereye-deeplearning/InnerEye/ML/run_ml.py", line 224, in setup
    self.checkpoint_handler.download_recovery_checkpoints_or_weights(only_return_path=not is_global_rank_zero())
  File "innereye-deeplearning/InnerEye/ML/utils/checkpoint_handling.py", line 72, in download_recovery_checkpoints_or_weights
    only_return_path=only_return_path)
  File "innereye-deeplearning/InnerEye/ML/utils/run_recovery.py", line 80, in download_all_checkpoints_from_run
    raise ValueError(f"AzureML run {run.id} has child runs, this method does not support those.")
ValueError: AzureML run HD_db39c3c8-279d-48ac-b353-ca99fc5308e3 has child runs, this method does not support those.
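
One possible direction for a fix, as a rough sketch only: instead of raising when the run has child runs, walk the children and download checkpoints per child. The function name `download_checkpoints_including_children` and the `download_one` callback are hypothetical and not part of InnerEye. The only assumption about the run object is that it behaves like `azureml.core.Run`, with an `id` attribute and a `get_children()` method.

```python
from typing import Any, Callable, Dict, List


def download_checkpoints_including_children(
    run: Any,
    download_one: Callable[[Any], str],
) -> Dict[str, str]:
    """Return a mapping from run ID to the local checkpoint folder of that run.

    `run` is assumed to behave like azureml.core.Run (has `id` and
    `get_children()`). `download_one` is a hypothetical callback that
    downloads the checkpoints of a single run without children and
    returns the folder they were written to.
    """
    children: List[Any] = list(run.get_children())
    if not children:
        # Plain run without children: download directly.
        return {run.id: download_one(run)}
    # Parent run (for example a Hyperdrive run): recurse into each child
    # rather than raising a ValueError.
    result: Dict[str, str] = {}
    for child in children:
        result.update(download_checkpoints_including_children(child, download_one))
    return result
```

With a Hyperdrive cross-validation run, this would give one checkpoint folder per child (one per split). How each child then picks the folder that belongs to its own split is left open here.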

AB#4248

ant0nsc, Jul 13 '21 22:07