
Add a "download files" script that recognizes patterns

Open ant0nsc opened this issue 3 years ago • 2 comments

Allow the user to specify a subset of results for easy download. At present I can navigate to an experiment’s azureml/ExperimentRun/dcid.my_run_id/outputs folder, but the options for bulk download are limited. As a specific use case, I’d like to be able to download segmentation.nii.gz for all patients in the test dataset. These files are all under outputs/epoch_xyz/Test, but they sit alongside the individual-organ segmentations. For head-and-neck patients, downloading only segmentation.nii.gz would reduce the data volume by about a factor of 20.

@kh296

AB#4218

ant0nsc avatar Jul 06 '21 10:07 ant0nsc

Example of how that could be invoked:

  • python download_files.py --run abc_123 --pattern segmentation.nii.gz would download all files that contain "segmentation.nii.gz"
  • --pattern "Test/*/segmentation.nii.gz" would restrict to the test folder.
  • Should we make pattern effectively a regex?
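
To make the proposed matching concrete, here is a minimal sketch of how the filter could behave. It treats the pattern as a shell-style glob that may match anywhere in the file path (via the standard library's fnmatch), not as a true regex; that choice is an assumption, as are the function name select_files and the sample file names.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_files(file_names: Iterable[str], pattern: str) -> List[str]:
    """Return the file names that contain a match for the glob pattern.

    Hypothetical helper: the pattern is wrapped in '*' on both sides so that
    "segmentation.nii.gz" matches any path containing that string. Note that
    fnmatch's '*' also matches across '/' separators.
    """
    wrapped = f"*{pattern}*"
    return [name for name in file_names if fnmatchcase(name, wrapped)]


# Made-up file names, loosely following the folder layout described above.
example_files = [
    "outputs/epoch_010/Test/001/segmentation.nii.gz",
    "outputs/epoch_010/Test/001/brainstem.nii.gz",
    "outputs/epoch_010/Val/002/segmentation.nii.gz",
    "outputs/epoch_010/Test/metrics.csv",
]

all_segmentations = select_files(example_files, "segmentation.nii.gz")
test_segmentations = select_files(example_files, "Test/*/segmentation.nii.gz")
```

If full regex support is preferred, the same function could call re.search on each file name instead of fnmatchcase; the rest of the logic would not change.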

ant0nsc avatar Jul 06 '21 10:07 ant0nsc

The syntax looks fine to me, and making pattern effectively a regex sounds like a good idea. It would also be nice to be able to specify a list of patterns, for example: --pattern "Test/*/segmentation.nii.gz Test/metrics.csv"

The folder hierarchy after download should probably be the same as before download. For example: --run abc_123 --pattern "Test/*/segmentation.nii.gz" would download to abc_123/outputs/epoch*/Test/*/segmentation.nii.gz; --run abc_123 --pattern "Test/metrics.csv" would download to abc_123/outputs/CrossValResults/*/epoch*/Test/metrics.csv.
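
A rough sketch of how a space-separated pattern list and the preserved folder hierarchy could fit together is below. It only plans the downloads by mapping each matching remote file to a local path under the run ID; the actual transfer is left out. The function name plan_downloads, the glob-style matching, and the sample file names are all assumptions for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Dict, Iterable


def plan_downloads(run_id: str, file_names: Iterable[str], patterns: str) -> Dict[str, str]:
    """Map each matching remote file to a local path that mirrors its folder.

    Hypothetical helper: 'patterns' is a single space-separated string of
    glob patterns, and a file is selected if any pattern matches anywhere
    in its path. The local path is simply <run_id>/<remote path>.
    """
    pattern_list = patterns.split()
    plan: Dict[str, str] = {}
    for name in file_names:
        if any(fnmatchcase(name, f"*{p}*") for p in pattern_list):
            plan[name] = str(PurePosixPath(run_id) / name)
    return plan


# Made-up file names for illustration only.
remote_files = [
    "outputs/epoch_010/Test/001/segmentation.nii.gz",
    "outputs/epoch_010/Test/001/brainstem.nii.gz",
    "outputs/epoch_010/Test/metrics.csv",
]

plan = plan_downloads(
    "abc_123", remote_files, "Test/*/segmentation.nii.gz Test/metrics.csv"
)
for remote, local in plan.items():
    print(f"{remote} -> {local}")
```

Splitting on whitespace keeps the command line simple but rules out patterns that contain spaces; accepting the flag several times would be an alternative design.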

kh296 avatar Jul 06 '21 10:07 kh296