InnerEye-DeepLearning issues

Clean up spurious module loading errors

Most builds show repeated errors saying "Failure while loading azureml_run_type_providers. Failed to load entrypoint hyperdrive = azureml.train.hyperdrive:HyperDriveRun._from_run_dto with exception cannot import name '_DistributedTraining' from 'azureml.train._distributed_training' (/home/jaalvare/miniconda3/envs/InnerEye/lib/python3.7/site-packages/azureml/train/_distributed_training.py)." Can those...
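One possible way to hide such messages is a filter on the log handler. The sketch below assumes the errors are emitted through Python's standard `logging` module, which the issue does not confirm; if they are printed directly to stderr, a filter like this has no effect. The names `DropMessagesContaining` and `suppress_on_handler` are hypothetical. Note that this only hides the symptom and does not fix the underlying import failure.

```python
import logging


class DropMessagesContaining(logging.Filter):
    """Drop log records whose formatted message contains a given substring."""

    def __init__(self, needle: str) -> None:
        super().__init__()
        self.needle = needle

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False tells the logging framework to discard the record.
        return self.needle not in record.getMessage()


def suppress_on_handler(handler: logging.Handler, needle: str) -> None:
    """Attach the filter to a handler, so it applies to every record the handler sees."""
    handler.addFilter(DropMessagesContaining(needle))
```

A build script could call `suppress_on_handler(handler, "Failure while loading azureml_run_type_providers")` on each handler it installs.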

ant0nsc

Enable monitor.py on local runs?

TensorBoard monitoring is presently hooked up to AzureML runs. It should be possible to point TensorBoard to a local folder and start the monitoring script for local runs. Upon job...
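Pointing TensorBoard at a local folder could look roughly like the sketch below. It relies only on the `tensorboard` command-line tool and its `--logdir` and `--port` options; the function names and the default port are illustrative assumptions and are not taken from `monitor.py`.

```python
import subprocess
from pathlib import Path
from typing import List


def tensorboard_command(log_dir: Path, port: int = 6006) -> List[str]:
    """Build the command line that starts TensorBoard on a local folder."""
    return ["tensorboard", "--logdir", str(log_dir), "--port", str(port)]


def start_local_monitoring(log_dir: Path, port: int = 6006) -> subprocess.Popen:
    """Start TensorBoard as a child process reading from a local run folder."""
    if not log_dir.is_dir():
        raise FileNotFoundError(f"Log folder does not exist: {log_dir}")
    # Popen returns immediately; the caller decides when to terminate the process.
    return subprocess.Popen(tensorboard_command(log_dir, port))
```

The caller would keep the returned process handle and call `terminate()` on it once the local job has finished.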

ant0nsc

Data augmentation

- GPU augmentations at training time should be configurable
- Document the benefits of this feature

[AB#3925](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/3925)

javier-alvarez

Modify package such that scripts are available as commandline tools

TensorBoard monitoring could be made available as a commandline tool, without having to invoke it via "python ...". The same applies to other helpers. See https://python-packaging.readthedocs.io/en/latest/command-line-scripts.html [AB#3924](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/3924)
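A minimal sketch of the `console_scripts` approach described at the linked page follows. The command name `innereye-monitor`, the module path `my_package.monitor_cli`, and the `--logdir` argument are all hypothetical placeholders, not names from the repository.

```python
import argparse
from typing import List, Optional


def main(argv: Optional[List[str]] = None) -> int:
    """Entry point function; the generated console script passes its return value to sys.exit."""
    parser = argparse.ArgumentParser(prog="innereye-monitor")
    parser.add_argument("--logdir", default="outputs", help="Folder with TensorBoard logs")
    args = parser.parse_args(argv)
    # Placeholder action: a real tool would start the monitoring here.
    print(f"Would start TensorBoard monitoring on {args.logdir}")
    return 0


# In setup.py, this dictionary would be passed as setup(..., entry_points=ENTRY_POINTS).
# The module path is a hypothetical location for the main() function above.
ENTRY_POINTS = {
    "console_scripts": [
        "innereye-monitor = my_package.monitor_cli:main",
    ],
}
```

After `pip install`, the package manager generates an executable called `innereye-monitor` that calls `main()`, so users no longer need to type "python ...".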

ant0nsc

Store number of trainable parameters in config for later use

At the moment, the model summary code in `generate_and_print_model_summary` does not store any of its results. The number of trainable parameters is logged to AzureML, but not anywhere else. It...
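As a rough illustration, the count could be computed once and kept on a config object. The sketch assumes parameter objects that expose `numel()` and `requires_grad`, as the tensors returned by `torch.nn.Module.parameters()` do. The `ModelSummaryFields` class and the field name `num_trainable_parameters` are hypothetical and do not reflect the actual InnerEye config classes.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


def count_trainable_parameters(parameters: Iterable[Any]) -> int:
    """Sum the element counts of all parameters that receive gradients."""
    return sum(p.numel() for p in parameters if p.requires_grad)


@dataclass
class ModelSummaryFields:
    """Hypothetical stand-in for the config fields that would hold summary results."""
    num_trainable_parameters: Optional[int] = None


def store_parameter_count(config: ModelSummaryFields, parameters: Iterable[Any]) -> None:
    """Compute the trainable parameter count and store it on the config for later use."""
    config.num_trainable_parameters = count_trainable_parameters(parameters)
```

With a PyTorch model, the call would be `store_parameter_count(config, model.parameters())`.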

ant0nsc

architecture

Try to increase crop size for Prostate and H&N model

After the Lightning refactoring, the average memory consumption graphs show that we have reduced memory consumption.
* Does that reflect reality, or is it an artefact of averaging?
* If it...

ant0nsc

Add test coverage for in-situ cross validation with multiple GPUs

It is possible, however, that this will not work at all, or only with `ddp_spawn` as the accelerator. [AB#3913](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/3913)

ant0nsc

architecture

Better diagnostics: Regression error plots

Regression models should, at rank zero, write a prediction/target plot, as in the old code:

    if self._should_save_regression_error_plot(self.current_epoch):
        error_plot_name = f"error_plot_{self.train_val_params.epoch}"
        path = str(self.config.outputs_folder / f"{error_plot_name}.png")
        plot_variation_error_prediction(epoch_metrics.get_labels(), epoch_metrics.get_predictions(), path)...

ant0nsc

feature parity

reporting and diagnostics

not urgent

Re-enable temperature scaling

- Enable the removed tests, like `test_rnn_classifier_via_config_2`
- Ensure that temperature values are logged.

[AB#3912](https://innereye.visualstudio.com/60ce1777-00d6-4015-82bc-488a0c00202f/_workitems/edit/3912)

ant0nsc

feature parity

Re-enable the mean teacher model

The mean teacher model fell victim to the PL refactoring and should be re-enabled.
- Add back all tests that use it (for example, `test_rnn_classifier_via_config_1`, `test_create_inference_pipeline`, `test_mean_teacher_model`, `test_recover_testing_from_run_recovery`).
- Constructor of `ScalarLightning`...

ant0nsc

feature parity
