visual_token_matching

About a bug when reproducing the code

Open menghuaa opened this issue 1 year ago • 4 comments

https://github.com/GitGyun/visual_token_matching/blob/950bf8ef395a3e29aacb664bf4c7f2b3bd6b519d/dataset/taskonomy.py#L550 Hi, in this line the length of self.img_paths is 35, but the maximum of class_idxs is 381915, so an 'IndexError: list index out of range' is raised when the code runs. Is there an error in your code logic? Here is my data structure (screenshots of my directory layout attached). Could you check whether it is correct?
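To make the mismatch concrete, here is a tiny standalone sketch of what happens; the paths and index values below are placeholders for my setup, not taken from the repository:

```python
# Placeholder illustration of the index mismatch: class_idxs is used to index
# self.img_paths, so every index must be smaller than len(self.img_paths).
img_paths = [f'point_{i}_view_0_domain_rgb.png' for i in range(35)]  # 35 paths in my run
class_idxs = [0, 17, 381915]                                         # max index is 381915

out_of_range = [i for i in class_idxs if i >= len(img_paths)]
print(f'{len(out_of_range)} of {len(class_idxs)} indices exceed len(img_paths) = {len(img_paths)}')
# img_paths[381915]  # -> IndexError: list index out of range
```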

menghuaa · Mar 29 '23

@GitGyun Hi, I would also like to know the detailed structure of dataset/meta_info/class_dict.pth. I think this may help me solve the problem.
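In case it is useful, this is how I am trying to inspect the file; it assumes class_dict.pth was written with torch.save and holds ordinary Python containers, which is exactly the part I cannot confirm:

```python
import torch

# Peek at class_dict.pth; the structure printed here is whatever torch.load
# returns, since the actual format is not documented as far as I can tell.
class_dict = torch.load('dataset/meta_info/class_dict.pth')
print(type(class_dict))

if isinstance(class_dict, dict):
    for key in list(class_dict)[:5]:   # show only the first few entries
        value = class_dict[key]
        print(repr(key), type(value))
```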

menghuaa · Mar 29 '23

@GitGyun I corrected my data placement format. However, a new error occurred: the label size for semantic segmentation is 256 × 256, while the label size for the other tasks is 512 × 512. When running this line of code https://github.com/GitGyun/visual_token_matching/blob/950bf8ef395a3e29aacb664bf4c7f2b3bd6b519d/dataset/taskonomy.py#L453, the error 'RuntimeError: stack expects each tensor to be equal size, but got [8, 1, 512, 512] at entry 0 and [8, 1, 256, 256] at entry 1' is raised because of the size mismatch.
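A minimal reproduction of the stack error, plus a tensor-level sketch of the same resize idea described in the update below; using nearest interpolation for discrete labels is my own assumption, not something from the repository:

```python
import torch
import torch.nn.functional as F

# Two label batches with different spatial sizes: torch.stack refuses to combine them.
labels = [torch.zeros(8, 1, 512, 512), torch.zeros(8, 1, 256, 256)]
# torch.stack(labels)  # RuntimeError: stack expects each tensor to be equal size ...

# Workaround sketch: resize everything to 256x256 before stacking.
# Nearest-neighbour interpolation avoids blending discrete segmentation class ids.
labels = [F.interpolate(lab, size=(256, 256), mode='nearest') for lab in labels]
stacked = torch.stack(labels)
print(stacked.shape)  # torch.Size([2, 8, 1, 256, 256])
```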


Update: I solved the error 'RuntimeError: stack expects each tensor to be equal size, but got [8, 1, 512, 512] at entry 0 and [8, 1, 256, 256]' by resizing the images and labels to (256, 256) resolution. For images, after this line https://github.com/GitGyun/visual_token_matching/blob/950bf8ef395a3e29aacb664bf4c7f2b3bd6b519d/dataset/taskonomy.py#L61 I added `img = img.resize((256, 256))`. For labels, after this line https://github.com/GitGyun/visual_token_matching/blob/950bf8ef395a3e29aacb664bf4c7f2b3bd6b519d/dataset/taskonomy.py#L88 I added `label = label.resize((256, 256))`. But when I run the code, the loss is nan. I found that Y, Y_S_in, and Y_Q contain NaN, so Y_Q_pred also contains NaN. I don't know the reason or how to fix it; my preliminary guess is that the resizing is the problem. Can you share your resize code?

Update again: I found the resize code. However, the Taskonomy (tiny split) I downloaded contains damaged images, and downloading it again gives the same result. My download source is https://datasets.epfl.ch/taskonomy/links.txt. The files newfields_point_1070_view_8_domain_rgb.png, muleshoe_point_399_view_5_domain_rgb.png, woodbine_point_1096_view_6_domain_rgb.png, pinesdale__point_1519_view_2_domain_keypoints3d.png, and noxapater__point_21_view_0_domain_keypoints3d.png are damaged; the resize function raises an error on these five images. Is your copy the same? I want to confirm whether they are damaged. How did you handle them, by deleting them directly? If I delete the five rgb images (newfields_point_1070_view_8_domain_rgb.png, muleshoe_point_399_view_5_domain_rgb.png, woodbine_point_1096_view_6_domain_rgb.png, pinesdale__point_1519_view_2_domain_rgb.png, noxapater__point_21_view_0_domain_rgb.png), the IndexError happens again in this line:

`self.class_idxs = [class_idx for class_idx in class_idxs if self.img_paths[class_idx].split('_')[0] in buildings]`

I modified it to:

`self.class_idxs = [class_idx for class_idx in class_idxs if class_idx < len(self.img_paths) and self.img_paths[class_idx].split('_')[0] in buildings]`

and no errors are reported during training. But since I don't know the detailed structure of dataset/meta_info/class_dict.pth, I don't know whether this modification is appropriate.
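For anyone else hitting the same damaged files, this is the kind of check I would run to confirm which PNGs are actually corrupted before deleting anything; the verify-then-load pattern is just a standard PIL idiom, not something from this repository, and the file list is from my own download:

```python
from PIL import Image, UnidentifiedImageError

suspect_files = [
    'newfields_point_1070_view_8_domain_rgb.png',
    'muleshoe_point_399_view_5_domain_rgb.png',
    'woodbine_point_1096_view_6_domain_rgb.png',
]

for path in suspect_files:
    try:
        with Image.open(path) as img:
            img.verify()   # header/CRC check, does not decode pixel data
        with Image.open(path) as img:
            img.load()     # full decode, catches truncated files that verify() misses
        print(f'ok:      {path}')
    except (UnidentifiedImageError, OSError) as err:
        print(f'damaged: {path} ({err})')
```

As for the NaN loss, one thing worth ruling out is the resize itself: PIL's default filter interpolates pixel values, which is fine for RGB but can distort depth/keypoint/segmentation labels, so resizing labels with `label.resize((256, 256), Image.NEAREST)` might behave differently. This is only a guess on my side.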

menghuaa · Mar 31 '23

Were you able to solve that issue? I am also facing the same issue.

rajayarli · Aug 21 '23

Same error. Did you solve this?

Show-han · Sep 12 '23