SAOT
loading dataset
Hi, could you give me some suggestions about the following error? Thanks.
pydev debugger: process 224361 is connecting
Connected to pydev debugger (build 202.7660.27)
Training: dimp saot
3612
2964
2380
1860
1404
2022-04-30 20:07:48.186097: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
No matching checkpoint file found
Training crashed at epoch 1
Traceback for the error!
Traceback (most recent call last):
File "/home/iccd/Documents/SAOT-main/ltr/trainers/base_trainer.py", line 70, in train
self.train_epoch()
File "/home/iccd/Documents/SAOT-main/ltr/trainers/ltr_trainer.py", line 80, in train_epoch
self.cycle_dataset(loader)
File "/home/iccd/Documents/SAOT-main/ltr/trainers/ltr_trainer.py", line 52, in cycle_dataset
for i, data in enumerate(loader, 1):
File "/home/iccd/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/iccd/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 475, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/iccd/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 105, in ltr_collate_stack1
return TensorDict({key: ltr_collate_stack1([d[key] for d in batch]) for key in batch[0]})
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 105, in <dictcomp>
return TensorDict({key: ltr_collate_stack1([d[key] for d in batch]) for key in batch[0]})
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 113, in ltr_collate_stack1
return [ltr_collate_stack1(samples) for samples in transposed]
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 113, in <listcomp>
return [ltr_collate_stack1(samples) for samples in transposed]
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 113, in ltr_collate_stack1
return [ltr_collate_stack1(samples) for samples in transposed]
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 113, in <listcomp>
return [ltr_collate_stack1(samples) for samples in transposed]
File "/home/iccd/Documents/SAOT-main/ltr/data/loader.py", line 91, in ltr_collate_stack1
if torch.utils.data.dataloader.re.search('[SaUO]', elem.dtype.str) is not None:
AttributeError: module 'torch.utils.data.dataloader' has no attribute 're'
Hi, thanks for your attention to our work. We have not run into this issue ourselves. Could you provide more info?
It may be caused by the environment.
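If your loader.py matches the repo, one workaround you could try (untested on my side) is to use Python's standard re module directly, since newer PyTorch versions no longer expose re through torch.utils.data.dataloader. A minimal sketch of what the dtype check on loader.py line 91 would look like:

```python
import re           # add near the top of ltr/data/loader.py
import numpy as np

# The numpy-dtype check in ltr_collate_stack1 only needs the standard re module:
#   if re.search('[SaUO]', elem.dtype.str) is not None:
# instead of torch.utils.data.dataloader.re.search(...).
print(re.search('[SaUO]', np.zeros(2, dtype=np.float32).dtype.str) is not None)  # False: numeric dtype
print(re.search('[SaUO]', np.array(['a', 'b']).dtype.str) is not None)           # True: string dtype
```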
Thank you for your reply! However, I've run into another error, and I don't know how to solve it. Could you give me some advice?
Restarting training from last epoch ...
No matching checkpoint file found
Training crashed at epoch 1
Traceback for the error!
Traceback (most recent call last):
File "/home/saot-main/ltr/trainers/base_trainer.py", line 70, in train
self.train_epoch()
File "/home/saot-main/ltr/trainers/ltr_trainer.py", line 80, in train_epoch
self.cycle_dataset(loader)
File "/home/saot-main/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset
loss, stats = self.actor(data)
File "/home/saot-main/ltr/actors/tracking.py", line 24, in __call__
target_scores, bboxes, cls = self.net(train_imgs=data['train_images'],
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/tracking/dimpnet.py", line 69, in forward
bboxes, cls = self.state_estimator(train_feat_se, test_feat_se,
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/glse/state_estimation.py", line 52, in forward
modulated_search = self.integrator(templates, subsearch_windows, graph_size)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/glse/integration/integration.py", line 123, in forward
modulated_search = self.fusiongcn(search, processed_xcorr_map, normed_saliency, peak_coords, graph_size)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/glse/integration/fusiongcn.py", line 127, in forward
coords_pair_kpoint, edge_weights = self.gen_kpoint_coords_pair(*graph_size, key_coords, saliency)
File "/home/saot-main/ltr/models/glse/integration/fusiongcn.py", line 225, in gen_kpoint_coords_pair
edge_weights[i, unique_key_index] = unique_key_saliency
IndexError: tensors used as indices must be long, byte or bool tensors
When I replace edge_weights[i, unique_key_index] = unique_key_saliency with edge_weights[i, unique_key_index.long()] = unique_key_saliency, it still does not work. A new error is reported, as follows (a minimal standalone snippet illustrating the index-dtype behaviour I am working around is included after the traceback):
Restarting training from last epoch ...
No matching checkpoint file found
Training crashed at epoch 1
Traceback for the error!
Traceback (most recent call last):
File "/home/saot-main/ltr/trainers/base_trainer.py", line 70, in train
self.train_epoch()
File "/home/saot-main/ltr/trainers/ltr_trainer.py", line 80, in train_epoch
self.cycle_dataset(loader)
File "/home/saot-main/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset
loss, stats = self.actor(data)
File "/home/saot-main/ltr/actors/tracking.py", line 24, in __call__
target_scores, bboxes, cls = self.net(train_imgs=data['train_images'],
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/tracking/dimpnet.py", line 69, in forward
bboxes, cls = self.state_estimator(train_feat_se, test_feat_se,
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/glse/state_estimation.py", line 52, in forward
modulated_search = self.integrator(templates, subsearch_windows, graph_size)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/glse/integration/integration.py", line 123, in forward
modulated_search = self.fusiongcn(search, processed_xcorr_map, normed_saliency, peak_coords, graph_size)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/saot-main/ltr/models/glse/integration/fusiongcn.py", line 127, in forward
coords_pair_kpoint, edge_weights = self.gen_kpoint_coords_pair(*graph_size, key_coords, saliency)
File "/home/saot-main/ltr/models/glse/integration/fusiongcn.py", line 225, in gen_kpoint_coords_pair
edge_weights[i, unique_key_index.long()] = unique_key_saliency
IndexError: index 328 is out of bounds for dimension 0 with size 324
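For reference, here is a minimal standalone snippet (not from the repo) that reproduces the first IndexError with plain PyTorch 1.7 tensors, which is why I added the .long() cast:

```python
import torch

edge_weights = torch.zeros(3, 324)       # stand-in tensor, same size as in my run
float_idx = torch.tensor([1.0, 5.0])     # indices arriving as a float tensor

try:
    edge_weights[0, float_idx] = 1.0     # PyTorch 1.7 rejects float tensor indices
except IndexError as e:
    print(e)                             # "tensors used as indices must be long, byte or bool tensors"

edge_weights[0, float_idx.long()] = 1.0  # casting to long makes the assignment legal
```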
Hi, I have not encountered this issue. Have you solved it already? Could you tell me your environment? I'll see whether I can reproduce it.
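While I look into it, one thing that might help narrow down the out-of-bounds error: a small hypothetical helper (not in the repo) that reports whether the indices produced in gen_kpoint_coords_pair exceed the last dimension of edge_weights under PyTorch 1.7, assuming edge_weights is indexed as [i, column] as in the traceback. It only locates where the oversized index (328 vs. size 324) comes from; it is not a fix.

```python
import torch

def check_index_bounds(unique_key_index, edge_weights):
    """Hypothetical debugging helper: report indices that would overflow
    the last dimension of edge_weights before the assignment is attempted."""
    idx = unique_key_index.long()
    limit = edge_weights.shape[-1]
    if idx.numel() > 0 and idx.max().item() >= limit:
        print('out-of-range index: max {} vs size {}'.format(idx.max().item(), limit))
    return idx
```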
The same problem also occurs with Python 3.7, CUDA 11.1, PyTorch 1.7, as shown in his report: IndexError: tensors used as indices must be long, byte or bool tensors
I trained SAOT with CUDA 11.1 and PyTorch 1.7.
Hi, our code is tested on CUDA 10 and PyTorch 1.1. I'll test it on a higher PyTorch version; if I can solve the issue, I'll update the code.
I have reproduced this issue with a higher PyTorch version; I will try to solve it this week.