
Medical Imaging Deep Learning library to train and deploy 3D segmentation models on Azure Machine Learning

Results: 110 InnerEye-DeepLearning issues

Files get uploaded in the run_ml.py `register_model` function via `upload_folder`. Then the job gets pre-empted, starts again, and tries to upload the files again. At that point, it complains that the files already exist...
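
A minimal sketch of one way to make the upload idempotent across restarts, assuming the AzureML SDK v1 `Run.get_file_names` / `Run.upload_file` APIs; the helper name and target prefix below are hypothetical, not the existing `register_model` code:

```python
# Hypothetical helper: skip files that a previous, pre-empted attempt already
# uploaded to the run, so a restart does not fail on "file already exists".
from pathlib import Path
from azureml.core import Run

def upload_model_folder_once(run: Run, folder: Path, prefix: str = "final_model") -> None:
    already_uploaded = set(run.get_file_names())
    for file in folder.rglob("*"):
        if not file.is_file():
            continue
        target = f"{prefix}/{file.relative_to(folder).as_posix()}"
        if target in already_uploaded:
            continue  # uploaded before the job was pre-empted
        run.upload_file(name=target, path_or_stream=str(file))
```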

At the moment, we set batch size = 1 in the dataloaders when running inference for a classification model. https://github.com/microsoft/InnerEye-DeepLearning/blob/daefdba6083775de7ca258d18ae315e57bcb54bd/InnerEye/ML/model_testing.py#L428 [AB#3998](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/3998)
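
A minimal sketch of what a configurable inference loader could look like; the function name and parameters are illustrative, not the existing InnerEye API:

```python
# Illustrative only: the loader takes its batch size from configuration instead
# of hard-coding batch_size=1.
from torch.utils.data import DataLoader, Dataset

def make_inference_loader(dataset: Dataset, batch_size: int = 1, num_workers: int = 0) -> DataLoader:
    return DataLoader(dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)
```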

How can we synchronize files that are written during multi-node training?
* At the end of training, each node reads the file in question, turns it into a byte tensor
* ...
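
One possible approach is sketched below, assuming `torch.distributed` is already initialised (e.g. with the Gloo backend for CPU tensors): pad each node's byte tensor to a common length and `all_gather` it.

```python
# Sketch only: each rank reads its local file, pads the bytes to a common
# length, and all ranks gather every file's contents.
from typing import List

import torch
import torch.distributed as dist

def gather_file_bytes(path: str) -> List[bytes]:
    with open(path, "rb") as f:
        payload = torch.tensor(list(f.read()), dtype=torch.uint8)
    world_size = dist.get_world_size()
    # all_gather needs equal-sized tensors, so exchange lengths first and pad.
    length = torch.tensor([payload.numel()])
    lengths = [torch.zeros_like(length) for _ in range(world_size)]
    dist.all_gather(lengths, length)
    max_len = int(max(l.item() for l in lengths))
    padded = torch.zeros(max_len, dtype=torch.uint8)
    padded[: payload.numel()] = payload
    gathered = [torch.zeros(max_len, dtype=torch.uint8) for _ in range(world_size)]
    dist.all_gather(gathered, padded)
    return [bytes(t[: int(n.item())].tolist()) for t, n in zip(gathered, lengths)]
```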

Allow fine-tuning an existing model on a new dataset/task. This includes support for changing the architecture (e.g. swapping out the last layer) or freezing a set of weights. [AB#3921](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/3921)
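
An illustrative PyTorch sketch of the two pieces, freezing existing weights and swapping the last layer; it assumes the model exposes its final layer as `model.fc`, which is an assumption and not an InnerEye convention:

```python
# Illustrative sketch, not an InnerEye API: freeze the existing weights and
# replace the final layer for a task with a different number of classes.
import torch.nn as nn

def prepare_for_finetuning(model: nn.Module, num_new_classes: int) -> nn.Module:
    for param in model.parameters():
        param.requires_grad = False              # freeze all pretrained weights
    in_features = model.fc.in_features           # assumes the last layer is `model.fc`
    model.fc = nn.Linear(in_features, num_new_classes)  # new head, trained from scratch
    return model
```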

`building_models.md` says that it is possible to recover a failed Hyperdrive crossval run, but this does not work.
```
File "innereye-deeplearning/InnerEye/ML/run_ml.py", line 224, in setup
    self.checkpoint_handler.download_recovery_checkpoints_or_weights(only_return_path=not is_global_rank_zero())
File "innereye-deeplearning/InnerEye/ML/utils/checkpoint_handling.py", line...
```

At present, running all the unit tests in WSL on my laptop takes 42 minutes. That is too long for me to run all the tests locally before pushing any...

In our test framework, especially for regression testing, it would be useful if we had a way of comparing Jupyter Notebooks. We could also usefully look at the data and text...
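
A rough sketch of such a comparison, assuming the `nbformat` package: compare two notebooks by their cell sources and text outputs while ignoring execution counts and metadata.

```python
# Sketch only: a notebook comparison based on cell sources and text outputs.
from typing import List

import nbformat

def notebook_text(path: str) -> List[str]:
    nb = nbformat.read(path, as_version=4)
    chunks = []
    for cell in nb.cells:
        chunks.append(cell.source)
        for output in cell.get("outputs", []):
            if "text" in output:
                chunks.append(output["text"])
    return chunks

def notebooks_match(expected: str, actual: str) -> bool:
    return notebook_text(expected) == notebook_text(actual)
```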

Currently, the baseline comparison configuration is logged at the beginning of the log file, each downloaded file is logged, and then the comparison tables are printed. Add extra logging...

Present behaviour: the loss is computed per GPU. Try out whether we can synchronize the tensors before computing the loss, so that it is computed over a larger effective batch size...
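
A rough sketch of that idea, assuming `torch.distributed` is initialised: gather logits and labels from all GPUs before computing the loss. Since `all_gather` does not propagate gradients, the local shard is re-inserted to keep this rank's contribution differentiable.

```python
# Sketch only: compute the loss over the full effective batch across GPUs.
import torch
import torch.distributed as dist
import torch.nn.functional as F

def global_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    world_size = dist.get_world_size()
    all_logits = [torch.zeros_like(logits) for _ in range(world_size)]
    all_labels = [torch.zeros_like(labels) for _ in range(world_size)]
    dist.all_gather(all_logits, logits)
    dist.all_gather(all_labels, labels)
    # all_gather returns detached tensors, so re-insert the local shard to keep
    # this rank's gradients flowing.
    all_logits[dist.get_rank()] = logits
    return F.cross_entropy(torch.cat(all_logits), torch.cat(all_labels))
```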

For example, supply a generic comparison method and a baseline run [AB#4106](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/4106)