I have modified the GLEAN model. Training runs normally and inference is run periodically on the validation set, but as soon as a validation pass completes, the following error is reported:
"""
Traceback (most recent call last):
  File "./tools/train.py", line 114, in <module>
    main()
  File "./tools/train.py", line 107, in main
    runner.train()
  File "/home/qiuyuwei/anaconda3/envs/glean/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1706, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/qiuyuwei/anaconda3/envs/glean/lib/python3.8/site-packages/mmengine/runner/loops.py", line 284, in run
    self.runner.val_loop.run()
  File "/amax/Qyw/mmediting/mmedit/engine/runner/edit_loops.py", line 246, in run
    self._runner.call_hook('after_val_epoch', metrics=multi_metric)
  File "/home/qiuyuwei/anaconda3/envs/glean/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1768, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/home/qiuyuwei/anaconda3/envs/glean/lib/python3.8/site-packages/mmengine/hooks/checkpoint_hook.py", line 329, in after_val_epoch
    self._save_best_checkpoint(runner, metrics)
  File "/home/qiuyuwei/anaconda3/envs/glean/lib/python3.8/site-packages/mmengine/hooks/checkpoint_hook.py", line 473, in _save_best_checkpoint
    if key_score is None or not self.is_better_than[key_indicator](
  File "/home/qiuyuwei/anaconda3/envs/glean/lib/python3.8/site-packages/mmengine/hooks/checkpoint_hook.py", line 117, in <lambda>
    rule_map = {'greater': lambda x, y: x > y, 'less': lambda x, y: x < y}
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
"""
How should I solve the problem?
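
One possible reading of the traceback: the failing frame is the `x > y` comparison used by `_save_best_checkpoint`, so the metric value and the stored best score seem to reach the hook as tensors on different devices (`cuda:0` and `cpu`). If that is the case, converting metric values to plain Python numbers before they leave the metric might avoid the mismatch. The sketch below is only an illustration under that assumption. `to_python_scalar` and `sanitize_metrics` are hypothetical helper names, not part of mmengine or mmedit, and the place to call them (for example, on the dict returned by a custom metric's `compute_metrics`) is a guess about the modified code.

```python
def to_python_scalar(value):
    """Return a plain Python number for tensor-like scalars; pass other values through."""
    # torch.Tensor and NumPy scalars both expose .item(); for a one-element
    # CUDA tensor, .item() copies the value to the host as a Python number.
    item = getattr(value, "item", None)
    if callable(item):
        return item()
    return value


def sanitize_metrics(metrics):
    """Hypothetical helper: make every value in a metrics dict device-independent."""
    return {name: to_python_scalar(value) for name, value in metrics.items()}
```

With values converted this way, a later `x > y` comparison would operate on Python floats rather than on tensors, so no device would be involved. Whether the metric is really where the tensor originates needs to be confirmed, for instance by printing `type(v)` for each value in the metrics dict before validation returns.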