Enable explain integration tests for "test_engine_from_config"
Summary
How to test
Checklist
- [ ] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [ ] I have added e2e tests for validation.
- [ ] I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).
- [ ] I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
- [ ] I have linked related issues.
License
- [ ] I submit my code changes under the same Apache License that covers the project. Feel free to contact the maintainers if that's a concern.
- [ ] I have updated the license header for each file (see an example below).
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 64.18%. Comparing base (10f66e8) to head (c200873).
Additional details and impacted files
```diff
@@           Coverage Diff            @@
##           develop    #3164   +/-   ##
==========================================
- Coverage    64.18%   64.18%   -0.01%
==========================================
  Files          182      182
  Lines        15067    15067
==========================================
- Hits          9671     9670       -1
- Misses        5396     5397       +1
```
| Flag | Coverage Δ | |
|---|---|---|
| py310 | 64.18% <ø> (-0.01%) | :arrow_down: |
| py311 | 64.18% <ø> (-0.01%) | :arrow_down: |
Flags with carried forward coverage won't be shown.
:umbrella: View full report in Codecov by Sentry.
Here is an interesting behavior of the integration tests that shows the impact of one test on another. Ideally, test instances should run independently, so I'm afraid there is some deeper reason in the device settings.
After enabling test_engine_from_config for DETECTION models, it passes, but it makes ATSS model train and predict calls in subsequent tests fail with an error message showing that inputs and weights are not on the same device:
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.FloatTensor) should be the same.
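For context, this is the error PyTorch raises whenever an input tensor and a layer's weights differ in device or dtype. A minimal standalone snippet (not from OTX, requires a CUDA device) that triggers the same class of error:

```python
import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3)     # weights: float32, on CPU
x = torch.randn(1, 3, 32, 32).cuda().half()     # input: float16, on GPU

# Raises: RuntimeError: Input type (torch.cuda.HalfTensor) and weight type
# (torch.FloatTensor) should be the same
conv(x)
```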
Steps to reproduce:
- Running the tests independently: test_xai.py tests are passing
pytest tests/integration/api/test_xai.py --task=DETECTION
17 passed, 1 skipped, 208 warnings in 187.35s (0:03:07)
- Running the pipeline including test_engine_from_config causes tests in test_xai.py to fail during model inference
pytest /home/gzalessk/code/training_extensions/tests/integration/api --task=DETECTION
FAILED tests/integration/api/test_engine_api.py::test_engine_from_tile_recipe[gpu-/home/gzalessk/code/training_extensions/src/otx/recipe/detection/atss_mobilenetv2_tile.yaml] - RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
FAILED tests/integration/api/test_xai.py::test_forward_explain[gpu-/home/gzalessk/code/training_extensions/src/otx/recipe/detection/atss_mobilenetv2.yaml] - RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.FloatTensor) should be the same
FAILED tests/integration/api/test_xai.py::test_forward_explain[gpu-/home/gzalessk/code/training_extensions/src/otx/recipe/detection/atss_resnext101.yaml] - RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.FloatTensor) should be the same
FAILED tests/integration/api/test_xai.py::test_forward_explain[gpu-/home/gzalessk/code/training_extensions/src/otx/recipe/detection/atss_r50_fpn.yaml] - RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.FloatTensor) should be the same
4 failed, 21 passed, 1 skipped,
Such behavior happens only for ATSS (detection task) and Mask R-CNN-based (instance segmentation task) models.
I think it may be connected to the device and accelerator settings for inference in test_engine_from_config.
@harimkang Have you seen something like that?
@eugene123tw Hi. When applying this change to the latest commit, it seems to affect other tests, as Galina said. Currently the tiling-related tests are not working; could you please take a look?
@harimkang @eugene123tw It seems that the tiling tests are failing for the same reason as test_xai.py:
- they run after test_engine_from_config
- they fail during engine.train with the same error: RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.FloatTensor) should be the same
- commenting out the tiling tests doesn't fix the overall problem that one test affects the other.
So it seems to me that the failure of the tiling tests is a consequence of the problems with test_engine_from_config, not the root cause.
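One way to check whether test_engine_from_config leaks global torch state into later tests would be an autouse fixture in the integration conftest. This is purely a hypothetical debugging aid, not part of the repository; names and placement are illustrative:

```python
# Hypothetical addition to tests/integration/conftest.py, only to help locate
# the leak between tests.
import pytest
import torch


@pytest.fixture(autouse=True)
def check_torch_global_state():
    yield
    # If a previous test changed the global default dtype (e.g. via
    # torch.set_default_dtype) and did not restore it, later tests could see
    # half/float mismatches like the ones reported above.
    assert torch.get_default_dtype() == torch.float32, (
        f"default dtype leaked: {torch.get_default_dtype()}"
    )
```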
@GalyaZalesskaya It appears that the device configuration in the explainable model is not set up properly. The inputs indicate a CPU device, but a check within _forward_explain_detection shows a CUDA device.
A straightforward debugging step is to add an assertion such as assert inputs.device == next(self.buffers()).device. To address the issue directly, you can move the model to the input's device with self.to(inputs.device); however, I would recommend not altering the device during the forward pass. Unfortunately, I can't offer a better solution, but it is likely related to how the explainable model patches the models, or to the order in which the models are patched, leading to device mismatches.
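A rough sketch of the suggested check inside the explain forward path (the method name and the self.buffers()/self.to() calls are taken from the discussion above; treat this as illustrative, not the actual OTX implementation):

```python
import torch


def _forward_explain_detection(self, inputs: torch.Tensor):
    # Debugging aid: fail fast when inputs and model state live on different
    # devices, instead of failing deep inside a conv layer.
    model_device = next(self.buffers()).device
    assert inputs.device == model_device, (
        f"device mismatch: inputs on {inputs.device}, model on {model_device}"
    )

    # Possible workaround (not recommended, as noted above): move the model
    # to the input's device inside the forward pass.
    # self.to(inputs.device)

    ...  # the actual explain forward logic would follow here
```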