PyTorch-Spiking-YOLOv3

question about ann_to_snn generating snn_dag

Open shirleyatgithub opened this issue 3 years ago • 25 comments

Dear Author,

Thanks for sharing the code. I ran into a problem when executing ann_to_snn.py and wonder whether you have seen it too. The error message is as follows:

    File "/home/gss/PyTorch-Spiking-YOLOv3-main/ann_parser.py", line 102, in relu_wrapper
    in_nodes = [find_node_by_tensor(inp)]
    File "/home/gss/PyTorch-Spiking-YOLOv3-main/ann_parser.py", line 37, in find_node_by_tensor
    raise ValueError("cannot find tensor Size", tensor.size())
    ValueError: ('cannot find tensor Size', torch.Size([1, 16, 416, 416]))

In ann_parser.py, find_node_by_tensor requires "v is tensor", which in Python means both names must refer to the same object in memory. When the ReLU layer is added, its input does not meet this condition, so rst is empty. I printed the ids of the tensors in this function and got the following output:

    conv1 inp id 140221914107048
    find node by tensor dag_input0 torch.Size([1, 3, 416, 416]) torch.Size([1, 3, 416, 416]) 140221914107048 140221914107048
    add node conv1: ['dag_input0']->['conv1_out1']
    conv1 out id 140221914107336
    find node by tensor dag_input0 torch.Size([1, 16, 416, 416]) torch.Size([1, 3, 416, 416]) 140221914107336 140221914107048
    find node by tensor conv1_out1 torch.Size([1, 16, 416, 416]) torch.Size([1, 16, 416, 416]) 140221914107336 140221914107336
    batch_norm1 inp id 140221914107336
    batch_norm1 out id 140221914155120
    relu1 inp id 140221914106976
    find node by tensor dag_input0 torch.Size([1, 16, 416, 416]) torch.Size([1, 3, 416, 416]) 140221914106976 140221914107048
    find node by tensor conv1_out1 torch.Size([1, 16, 416, 416]) torch.Size([1, 16, 416, 416]) 140221914106976 140221914155120

I don't understand why the id changes along the way: the input of relu1 (140221914106976) is not the same object as the output of batch_norm1 (140221914155120). I only changed classes from 80 to 1 and, accordingly, filters from 255 to 18 in the config file "yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg". The ANN defined by this config file trains and tests successfully. Looking forward to your reply. @cwq159
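A minimal sketch of the failure mode described above, in plain Python so it runs without PyTorch. `FakeTensor`, the `nodes` dictionary and this version of `find_node_by_tensor` are hypothetical stand-ins, not the code in ann_parser.py; the sketch only assumes that the parser matches tensors by identity (`is`), as the comment reports. It shows that a lookup by identity fails as soon as the value arrives as a different object, even if the shape is the same. It does not explain why newer PyTorch versions hand relu a different object.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; it only carries a shape."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def size(self):
        return self.shape


def find_node_by_tensor(nodes, tensor):
    """Return the name of the node whose recorded output *is* `tensor`.

    Matching uses identity (`is`), not shape or value, which mirrors the
    "v is tensor" check mentioned in the issue (assumed behaviour).
    """
    matches = [name for name, out in nodes.items() if out is tensor]
    if not matches:
        raise ValueError("cannot find tensor Size", tensor.size())
    return matches[0]


# Record two nodes, as in the debug output: the DAG input and conv1's output.
nodes = {}
nodes["dag_input0"] = FakeTensor((1, 3, 416, 416))
conv_out = FakeTensor((1, 16, 416, 416))
nodes["conv1_out1"] = conv_out

# Looking up the very same object succeeds.
found = find_node_by_tensor(nodes, conv_out)

# A new object with an identical shape is not found, because id() differs.
rewrapped = FakeTensor(conv_out.size())
try:
    find_node_by_tensor(nodes, rewrapped)
    lookup_failed = False
except ValueError as err:
    lookup_failed = True
    error_args = err.args
```

Under these assumptions, any step that produces a fresh tensor object between two wrapped layers would break the chain of identity lookups in the same way.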

shirleyatgithub avatar Jan 06 '22 01:01 shirleyatgithub

@shirleyatgithub Hello, which version of PyTorch are you using?

$ python ann_to_snn.py --cfg cfg/yolov3-tiny.cfg --data data/coco.data --weights weights/best.pt --timesteps 128

I encountered this problem: "ValueError: ('cannot find tensor Size', torch.Size([16, 16, 320, 320]))". I also can't find a PyTorch 1.3.0 build to install. Do you have the same problem? Looking forward to your reply.

buaa-luzhi avatar Jan 06 '22 06:01 buaa-luzhi

@buaa-luzhi Yes, it seems to be the same problem. I use torch 1.7.1. Any idea how to solve it?

shirleyatgithub avatar Jan 06 '22 06:01 shirleyatgithub

@shirleyatgithub See https://github.com/cwq159/PyTorch-Spiking-YOLOv3/issues/5. However, I couldn't find a PyTorch 1.3 build.

buaa-luzhi avatar Jan 06 '22 06:01 buaa-luzhi

@buaa-luzhi Why use pytorch=1.3? The requirements.txt suggests torch>=1.6.0.

shirleyatgithub avatar Jan 06 '22 07:01 shirleyatgithub

@shirleyatgithub I don't know. https://github.com/cwq159/PyTorch-Spiking-YOLOv3/issues/5 I referred to this link.

buaa-luzhi avatar Jan 06 '22 09:01 buaa-luzhi

@shirleyatgithub I tried both PyTorch 1.7.1 and 1.4 and still had this problem.

buaa-luzhi avatar Jan 06 '22 09:01 buaa-luzhi

@buaa-luzhi I didn't find torch 1.3 either, so I tried torch 1.4 (CPU) with Python 3.7. The original problem no longer occurs, but another one appears:

    ann_parser.py", line 221, in parse_ann_model
    model(*warpped_input)
    File "/home/gss/anaconda3/envs/nlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
    File "ann_to_snn.py", line 65, in forward
    x = self.listi
    File "/home/gss/anaconda3/envs/nlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
    TypeError: forward() missing 1 required positional argument: 'out'

shirleyatgithub avatar Jan 06 '22 11:01 shirleyatgithub

@shirleyatgithub pip install torch==1.3.1+cu100 torchvision==0.4.2+cu100 -f https://download.pytorch.org/whl/torch_stable.html

buaa-luzhi avatar Jan 06 '22 11:01 buaa-luzhi

@shirleyatgithub I'm still testing.

buaa-luzhi avatar Jan 06 '22 11:01 buaa-luzhi

@shirleyatgithub I still get this error! I don't know what to modify.

buaa-luzhi avatar Jan 06 '22 11:01 buaa-luzhi

@shirleyatgithub pip install torch==1.3.1+cu100 torchvision==0.4.2+cu100 -f https://download.pytorch.org/whl/torch_stable.html

Thank you, I will try too

shirleyatgithub avatar Jan 06 '22 12:01 shirleyatgithub

Please use PyTorch 1.3 with Python 3.7 for this version. A new version supporting PyTorch 1.7+ will be released soon.
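As a rough illustration of checking an environment against this recommendation, the sketch below compares version strings with the Python 3.7 / PyTorch 1.3 pair named above. The helper names `major_minor` and `is_supported` are made up for this example; in a real environment the torch version string would come from `torch.__version__`, which is passed in here as a plain string so that the sketch runs without PyTorch installed.

```python
import sys


def major_minor(version):
    """Parse a version string such as '1.3.1+cu100' into (1, 3).

    Anything after '+' is a local build tag (e.g. the CUDA variant) and is
    ignored for the comparison.
    """
    core = version.split("+", 1)[0]
    parts = core.split(".")
    return int(parts[0]), int(parts[1])


def is_supported(python_version, torch_version):
    """True if the pair matches the combination recommended in this thread."""
    return (major_minor(python_version) == (3, 7)
            and major_minor(torch_version) == (1, 3))


# Check the running interpreter against the wheel suggested earlier in the
# thread (torch==1.3.1+cu100). The result depends on the local Python version.
current_python = "{}.{}".format(*sys.version_info[:2])
print(is_supported(current_python, "1.3.1+cu100"))
```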

cwq159 avatar Jan 06 '22 13:01 cwq159

@cwq159 @shirleyatgithub I used PyTorch 1.3 and Python 3.7 and still get this error. I wonder whether /cfg/yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg should be used during the training phase, because I couldn't find /cfg/yolov3-tiny-ours.cfg. Thanks so much, and looking forward to your reply!

buaa-luzhi avatar Jan 07 '22 01:01 buaa-luzhi

@cwq159 @shirleyatgithub

(1) Training stage:

python3 train.py --batch-size 32 --cfg cfg/yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg --data data/coco.data --weights ''

(2) Conversion:

python3 ann_to_snn.py --cfg cfg/yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg --data data/coco.data --weights weights/best.pt --timesteps 128

What's wrong with training this way? The error reappears: ValueError: ('cannot find tensor Size', torch.Size([16, 16, 640, 640]))

Thanks so much, and looking forward to your reply!

buaa-luzhi avatar Jan 07 '22 02:01 buaa-luzhi

That error no longer occurs. However, as timesteps gets larger, a memory error occurs. My GPU has only 6 GB of memory, and even with batch_size 1 and timesteps=32 I still get a GPU memory error.

buaa-luzhi avatar Jan 07 '22 02:01 buaa-luzhi

@cwq159 Hello, when will the version for PyTorch 1.7+ be released? Thanks!

buaa-luzhi avatar Jan 07 '22 03:01 buaa-luzhi

If you want to increase timesteps, you should use a single GPU with enough memory. In this version the input data is copied timesteps times, and the SNN then computes the output for every copy, so the GPU memory must be large enough to hold all of them. The new version will try to optimize the IF operation to reduce memory usage and will support PyTorch 1.7+. It should follow soon.
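A back-of-the-envelope sketch of why memory grows with timesteps under the behaviour described above. It assumes float32 tensors (4 bytes per element) and that each copy of the input yields its own activation at every layer; the function name `tensor_bytes` is made up for this example, and the shapes are taken from the 416x416 debug output earlier in the thread. It is not a measurement of the repository's actual usage, which also includes weights, neuron state and framework overhead.

```python
BYTES_PER_FLOAT32 = 4  # assumption: tensors are stored as float32


def tensor_bytes(timesteps, batch_size, channels, height, width):
    """Bytes needed for one tensor when the batch is replicated `timesteps` times."""
    effective_batch = timesteps * batch_size
    return effective_batch * channels * height * width * BYTES_PER_FLOAT32


# Replicated network input: timesteps=32, batch_size=1, 3 x 416 x 416.
input_bytes = tensor_bytes(32, 1, 3, 416, 416)

# Output of the first convolution for the same run: 16 x 416 x 416.
conv1_bytes = tensor_bytes(32, 1, 16, 416, 416)

print(input_bytes / 2**20, "MiB for the replicated input")
print(conv1_bytes / 2**20, "MiB for the first conv output alone")
```

Every term is linear in timesteps, so going from 32 to 128 timesteps multiplies each of these per-tensor figures by four.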

cwq159 avatar Jan 07 '22 03:01 cwq159

@cwq159 Hello, sorry to trouble you again! What type of GPU do you use? My GPU memory is small and I want to replace the card with a new one. Thanks again.

buaa-luzhi avatar Jan 10 '22 04:01 buaa-luzhi

RTX 8000 with 48 GB of memory

cwq159 avatar Jan 10 '22 12:01 cwq159

Thanks! That is great! How long will it take until the new code is released?

mengjingyouling avatar Jan 14 '22 03:01 mengjingyouling

@buaa-luzhi Excuse me, you wrote above that the error no longer occurs. How did you solve it? ValueError: ('cannot find tensor Size', torch.Size([16, 16, 640, 640]))

WuTi0525 avatar Nov 28 '22 03:11 WuTi0525

@buaa-luzhi ValueError: ('cannot find tensor Size', torch.Size([16, 16, 640, 640])). Hello, how did you solve this problem? Can you tell me?

jsckdon avatar Apr 25 '24 11:04 jsckdon

Sorry, I actually have no idea about it.

mengjingyouling avatar Apr 25 '24 11:04 mengjingyouling

Does this ANN-to-SNN code still have problems? I tried to run it today and hit this error. I see that someone recommended using Python 3.7 and torch 1.3, but I don't know whether that helps.

jsckdon avatar Apr 25 '24 11:04 jsckdon

You can try it.

mengjingyouling avatar Apr 25 '24 11:04 mengjingyouling