Resource contention in tests that use DownloadImageSet
Multiple tests can run concurrently and try to download this resource to the same folder.
Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=954239
Build error leg or test failing: Microsoft.ML.AutoML.Test.AutoFeaturizerTests.AutoFeaturizer_image_test
Pull request: https://github.com/dotnet/machinelearning/pull/7388
Error Message
Fill the error message using step by step known issues guidance.
{
  "ErrorMessage": "flower_photos_tiny_set_for_unit_tests.zip' because it is being used by another process",
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": true
}
Known issue validation
Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=954239
Error message validated: [flower_photos_tiny_set_for_unit_tests.zip' because it is being used by another process]
Result validation: :white_check_mark: Known issue matched with the provided build.
Validation performed at: 2/20/2025 7:12:33 PM UTC
Report
Summary
| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 0 | 0 | 0 |
Should the test dataset be downloaded to its own folder for each test?
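As a rough illustration of the "own folder" idea, the sketch below gives each test a private download directory. It is written in Python for brevity, whereas the actual tests are .NET, and `make_private_dataset_dir` and its parameters are hypothetical names, not part of the repository.

```python
import tempfile
from pathlib import Path
from typing import Optional


def make_private_dataset_dir(test_name: str, root: Optional[Path] = None) -> Path:
    """Create and return a directory that no other test shares.

    Hypothetical helper: `test_name` is only used as a readable prefix.
    """
    base = root if root is not None else Path(tempfile.gettempdir())
    base.mkdir(parents=True, exist_ok=True)
    # mkdtemp appends a random suffix and creates the directory atomically,
    # so two concurrent callers never end up with the same path.
    return Path(tempfile.mkdtemp(prefix=f"{test_name}_", dir=base))
```

The trade-off is that every test downloads and extracts its own copy of the archive, which costs time and disk space but removes the shared file entirely.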
This file is used by both the TensorFlow tests and the AutoML tests. The TensorFlow tests run sequentially, so this issue should only happen during the AutoML tests.
I don't think we should run the AutoML tests serially (unless we really need to). I see three options: (1) have the download step check whether the file already exists and skip the download if it does; (2) have each test download to its own folder; or (3) depending on how you have structured the tests, move all tests that need this file into a single test class and run only that class sequentially. The third option is my preferred approach, but what are your thoughts @LittleLittleCloud?
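A plain "skip if the file exists" check can still race, because a second test may see a half-written archive. One possible way to close that gap is to download to a temporary name and rename it into place only when it is complete. The sketch below is in Python for brevity, whereas the actual tests are .NET; `ensure_downloaded` and the `fetch` callback are hypothetical names, not the repository's API.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_downloaded(target: Path, fetch: Callable[[Path], None]) -> bool:
    """Make sure `target` exists; return True if this call produced it.

    `fetch` is a hypothetical callback that writes the archive to the
    path it is given.
    """
    if target.exists():
        return False  # another test already finished the download
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a uniquely named temporary file in the same directory so
    # concurrent callers never write to the same path.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".part")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        fetch(tmp)
        try:
            # The rename publishes the file only once it is complete.
            os.replace(tmp, target)
        except OSError:
            # Assumption: if the rename fails because another caller won
            # the race and its file is in use, the existing copy is fine.
            if not target.exists():
                raise
            return False
        return True
    finally:
        tmp.unlink(missing_ok=True)  # clean up if the rename did not happen
```

With this approach two concurrent tests may both download the archive, but neither ever reads or writes a partially written file at the shared path.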