InnerEye-DeepLearning
Run recovery for hyperdrive runs does not work
building_models.md says that it is possible to recover a failed Hyperdrive cross-validation run, but this does not work. Recovery fails with the following error:
  File "innereye-deeplearning/InnerEye/ML/run_ml.py", line 224, in setup
    self.checkpoint_handler.download_recovery_checkpoints_or_weights(only_return_path=not is_global_rank_zero())
  File "innereye-deeplearning/InnerEye/ML/utils/checkpoint_handling.py", line 72, in download_recovery_checkpoints_or_weights
    only_return_path=only_return_path)
  File "innereye-deeplearning/InnerEye/ML/utils/run_recovery.py", line 80, in download_all_checkpoints_from_run
    raise ValueError(f"AzureML run {run.id} has child runs, this method does not support those.")
ValueError: AzureML run HD_db39c3c8-279d-48ac-b353-ca99fc5308e3 has child runs, this method does not support those.
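For illustration only, here is a minimal sketch of how recovery might handle a parent run instead of raising. It assumes the run object exposes `id` and `get_children()`, as `azureml.core.Run` does. The helper `checkpoint_folders_for_run` and the `FakeRun` stand-in are hypothetical names invented for this sketch. They are not part of InnerEye, and this is not how `run_recovery.py` is actually implemented.

```python
from pathlib import Path
from typing import Any, List


def checkpoint_folders_for_run(run: Any, root: Path) -> List[Path]:
    """Hypothetical helper: plan one checkpoint folder per recoverable run.

    If the run has child runs (as a Hyperdrive parent does), return one
    folder per child instead of raising. Otherwise return a single folder
    for the run itself.
    """
    children = list(run.get_children())
    runs = children if children else [run]
    return [root / r.id for r in runs]


class FakeRun:
    """Stand-in for an AzureML run, exposing only .id and .get_children()."""

    def __init__(self, run_id: str, children: tuple = ()) -> None:
        self.id = run_id
        self._children = list(children)

    def get_children(self):
        return iter(self._children)


# A parent run with two cross-validation children (IDs are made up).
parent = FakeRun("HD_parent", (FakeRun("HD_parent_0"), FakeRun("HD_parent_1")))
folders = checkpoint_folders_for_run(parent, Path("recovered"))
```

With a real AzureML run, each planned folder would then be the download target for the corresponding child run's checkpoints. That step is omitted here because it depends on how InnerEye stores checkpoints.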