TracKit
Ocean training problem
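Hello, I have a question. When I follow the tutorial and run python tracking/onekey.py (running train_ocean.py on its own gives the same error), I get the error below. Does anyone know what the problem is?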
Traceback (most recent call last):
  File "./tracking/train_ocean.py", line 259, in <module>
    main()
  File "./tracking/train_ocean.py", line 250, in main
    model, writer_dict = ocean_train(train_loader, model, optimizer, epoch + 1, curLR, config, writer_dict, logger, device=device)
  File "/data/code/siam/TracKit/tracking/../lib/core/function.py", line 54, in ocean_train
    loss.backward()
  File "/home/tm/anaconda3/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/tm/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
    Variable._execution_engine.run_backward(
  File "/home/tm/anaconda3/lib/python3.8/site-packages/torch/autograd/function.py", line 89, in apply
    return self._forward_cls.backward(self, *args)  # type: ignore
  File "/home/tm/anaconda3/lib/python3.8/site-packages/torch/autograd/function.py", line 210, in wrapper
    outputs = fn(ctx, *args)
  File "/data/code/siam/TracKit/tracking/../lib/models/dcn/deform_conv.py", line 85, in backward
    deform_conv_cuda.deform_conv_backward_parameters_cuda(
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
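It looks like deform conv was not compiled successfully. Check whether your environment matches install.sh. Alternatively, remove align and train the model without it.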
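As background for the RuntimeError in the traceback: the message is about how a tensor is laid out in memory. The following is a minimal pure-Python sketch, not PyTorch's actual implementation, and the helper names `contiguous_strides` and `is_contiguous` are made up for illustration. It shows why a transposed tensor no longer has a dense row-major layout. `.view()` can always reinterpret a dense layout, while other layouts may need `.reshape()` or `.contiguous()` first.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for a dense tensor of this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_contiguous(shape, strides):
    """True if the strides describe a dense row-major layout (size-1 dims are ignored)."""
    expected = contiguous_strides(shape)
    return all(d == 1 or s == e for d, s, e in zip(shape, strides, expected))


# A dense 2x3 tensor: moving one row down skips 3 elements, one column right skips 1.
dense_shape = (2, 3)
dense_strides = contiguous_strides(dense_shape)  # (3, 1)

# A transpose swaps sizes and strides without moving any data in memory.
t_shape = dense_shape[::-1]      # (3, 2)
t_strides = dense_strides[::-1]  # (1, 3)

print(is_contiguous(dense_shape, dense_strides))  # True
print(is_contiguous(t_shape, t_strides))          # False
```

Whether this is what happens inside `deform_conv_backward_parameters_cuda` in this case is not established in the thread; the sketch only illustrates what the error message refers to.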
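I recompiled deform_conv and it still fails. I went through the configuration in install.sh. Because I am using an RTX 30-series card, my CUDA is 11.1 and torch is 1.8.1. In addition, mpi4py failed to install. Everything else was installed as in install.sh, so I am not sure whether these differences matter.
I'll keep looking into it myself. Thanks for the reply.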
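Since the replies above hinge on whether the environment matches install.sh, a small sketch like the following can list which of the packages mentioned here are installed and at what version. It uses only the standard library (`importlib.metadata`, Python 3.8+). The function name `installed_versions` is hypothetical, and the package list is an assumption taken from this comment, not from install.sh.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# Packages named in this comment; extend the list to cover the rest of install.sh.
report = installed_versions(["torch", "mpi4py"])
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

The output can then be compared by hand against the versions pinned in install.sh.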
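@tm9161 Hello, have you solved this problem? I have run into the same one.
@JudasDie Hello, setting the align parameter to False makes the error go away. Does this parameter have a large impact on performance?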
No. I also just set it to False.
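@tm9161 Hello, I am using a 3080 Ti + cudatoolkit 11.1 + torch 1.8, and the build already fails at the python setup.py develop step. I suspect the CUDA version is too new. Did you run into this, and how did you solve it?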
Please refer to the new repo, JudasDie/SOTS. Thx.
--
From: Zhang Zhipeng
Institution: National Laboratory of Pattern Recognition
Address: 95 Zhongguancun East Road, 100190, BEIJING, CHINA
Email: @.***
Best Wishes