
ViT inference fails in FeatureVectorHook.

Open · JihwanEom opened this issue 1 year ago · 1 comment

Describe the bug

otx train throws AttributeError: 'list' object has no attribute 'size' when FeatureVectorHook runs during the inference step that follows training.

Steps to Reproduce

otx build --task classification --backbone mmcls.VisionTransformer
cd otx-workspace-CLASSIFICATION
(set data paths in data.yaml)
otx train

Expected result: training and inference finish successfully.

Current result:

2023-05-03 18:28:19,441 | INFO : Epoch(val) [18][1]     accuracy_top-1: 0.5600, 0 accuracy: 1.0000, 1 accuracy: 0.0000, mean accuracy: 0.5000, accuracy: 0.5600, current_iters: 18
2023-05-03 18:28:19,442 | INFO : MemCacheHandlerBase uses 0 / 0 (0.0%) memory pool and store 0 items.
2023-05-03 18:28:20,488 | INFO : called save_model
2023-05-03 18:28:21,010 | INFO : Final model performance: Performance(score: 0.76, dashboard: (14 metric groups))
2023-05-03 18:28:21,011 | INFO : train done.
2023-05-03 18:28:21,209 | INFO : infer()
2023-05-03 18:28:21,230 | INFO : Training seed was set to 5 w/ deterministic=False.
2023-05-03 18:28:21,232 | INFO : Try to create a 0 size memory pool.
2023-05-03 18:28:21,233 | INFO : configure!: training=False
load checkpoint from local path: outputs/20230503_182755_train/logs/best_epoch_10.pth
2023-05-03 18:28:21,643 | INFO : 'in_channels' config in model.head is updated from -1 to 768
2023-05-03 18:28:21,643 | INFO : configure_data()
2023-05-03 18:28:21,644 | INFO : task config!!!!: training=False
load checkpoint from local path: outputs/20230503_182755_train/logs/best_epoch_10.pth
2023-05-03 18:28:21,810 | INFO : infer!
load checkpoint from local path: outputs/20230503_182755_train/logs/best_epoch_10.pth
Traceback (most recent call last):
  File "/home/jeom/anaconda3/envs/otx-vit/bin/otx", line 8, in <module>
    sys.exit(main())
  File "/home/jeom/ws/otx-vit/otx/cli/tools/cli.py", line 77, in main
    results = globals()[f"otx_{name}"]()
  File "/home/jeom/ws/otx-vit/otx/cli/tools/train.py", line 176, in main
    return train(exit_stack)
  File "/home/jeom/ws/otx-vit/otx/cli/tools/train.py", line 272, in train
    predicted_validation_dataset = task.infer(
  File "/home/jeom/ws/otx-vit/otx/algorithms/classification/task.py", line 158, in infer
    results = self._infer_model(dataset, inference_parameters)
  File "/home/jeom/ws/otx-vit/otx/algorithms/classification/adapters/mmcls/task.py", line 300, in _infer_model
    result = model(return_loss=False, **data)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 142, in wrapped
    return module_call(self, *args, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1212, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/mmcv/parallel/data_parallel.py", line 51, in forward
    return super().forward(*inputs, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 142, in wrapped
    return module_call(self, *args, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 85, in forward
    return self.forward_test(img, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 67, in forward_test
    return self.simple_test(imgs[0], **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/mmcls/models/classifiers/image.py", line 152, in simple_test
    x = self.extract_feat(img)
  File "/home/jeom/ws/otx-vit/otx/algorithms/classification/adapters/mmcls/models/classifiers/sam_classifier.py", line 265, in extract_feat
    x = self.backbone(img)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 142, in wrapped
    return module_call(self, *args, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1215, in _call_impl
    hook_result = hook(self, input, result)
  File "/home/jeom/ws/otx-vit/otx/algorithms/common/adapters/mmcv/hooks/recording_forward_hook.py", line 71, in _recording_forward
    tensors = self.func(output)
  File "/home/jeom/ws/otx-vit/otx/core/patcher.py", line 157, in helper
    return wrapper(args[0], fn.__get__(args[0]), *args[1:], **kwargs)
  File "/home/jeom/ws/otx-vit/otx/algorithms/common/adapters/nncf/patches.py", line 59, in no_nncf_trace_wrapper
    return fn(*args, **kwargs)
  File "/home/jeom/ws/otx-vit/otx/algorithms/common/adapters/mmcv/hooks/recording_forward_hook.py", line 140, in func
    feature_vector = [torch.nn.functional.adaptive_avg_pool2d(f, (1, 1)) for f in feature_map]
  File "/home/jeom/ws/otx-vit/otx/algorithms/common/adapters/mmcv/hooks/recording_forward_hook.py", line 140, in <listcomp>
    feature_vector = [torch.nn.functional.adaptive_avg_pool2d(f, (1, 1)) for f in feature_map]
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 88, in wrapped
    op1 = operator(*args, **kwargs)
  File "/home/jeom/anaconda3/envs/otx-vit/lib/python3.10/site-packages/torch/nn/functional.py", line 1213, in adaptive_avg_pool2d
    _output_size = _list_with_default(output_size, input.size())
AttributeError: 'list' object has no attribute 'size'
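
The last frames show adaptive_avg_pool2d receiving a list instead of a tensor. One plausible cause, assumed here and not confirmed by this report, is that the ViT backbone returns a [patch_token, cls_token] pair per output stage, while the hook expects one 4-D tensor per stage. The sketch below reproduces the failure with stand-in tensors and shows one possible guard. The helper name pool_feature_vector and the tensor shapes are hypothetical, and this is not necessarily the fix adopted by the project.

```python
import torch
import torch.nn.functional as F


def pool_feature_vector(feature_map):
    """Hypothetical helper: build a feature vector from backbone outputs.

    Assumes each stage output is either a 4-D (B, C, H, W) tensor, as CNN
    backbones return, or a [patch_token, cls_token] list, as a ViT backbone
    may return when it also emits the class token.
    """
    if isinstance(feature_map, torch.Tensor):
        feature_map = [feature_map]
    vectors = []
    for f in feature_map:
        if isinstance(f, (list, tuple)):
            # Assumed ViT case: use the (B, C) class token as the vector.
            vectors.append(f[-1])
        else:
            # CNN case: global average pooling, then drop the 1x1 spatial dims.
            vectors.append(F.adaptive_avg_pool2d(f, (1, 1)).flatten(1))
    return torch.cat(vectors, dim=1)


# Stand-in tensors, not real model outputs.
cnn_out = (torch.randn(2, 8, 7, 7),)
vit_out = ([torch.randn(2, 768, 14, 14), torch.randn(2, 768)],)

# The unguarded list comprehension from the traceback fails on ViT-style output.
try:
    [F.adaptive_avg_pool2d(f, (1, 1)) for f in vit_out]
    raised = False
except (AttributeError, TypeError):
    raised = True

print(raised)
print(pool_feature_vector(cnn_out).shape)
print(pool_feature_vector(vit_out).shape)
```

Using the class token as the feature vector is only one option; pooling the patch tokens would be another, and which one is appropriate depends on how downstream consumers use the vector.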

Environment: Latest develop branch, commit hash: a1f098d1d33102708fa2a17078b9ebc76a052ac3

JihwanEom, May 03 '23 01:05

@negvet it seems we agreed to use a substitute for the feature vector in PR https://github.com/openvinotoolkit/training_extensions/pull/2403. Could you review this and re-enable the classification explain tests for DeiT?

sovrasov, Oct 04 '23 16:10