LUO77123 issues

Results: 12 issues of LUO77123

With batch size 1 or 2, GPU memory usage is too high: the first epoch uses 2.5 GB, the second epoch reaches 7 GB, and training does not converge. What is the cause?

`python train.py --workers 8 --device 0 --batch-size 1 --data data/coco128.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights 'yolov7.pt' --name yolov7-custom` ![image](https://user-images.githubusercontent.com/87272337/181875538-a56a6d0c-cccf-48cb-968c-84fdc93914e2.png) The first epoch uses 2.59 GB, and in the second epoch usage jumps to 7 GB. Changing to `--batch-size 2` gives the same result: 2.59 GB in the first epoch, then a jump to 7 GB in the second. ![image](https://user-images.githubusercontent.com/87272337/181875591-1576d19e-7ce8-4bab-bdec-bf871c670c4c.png) What is the cause? Training does not converge, and mAP barely increases. ![image](https://user-images.githubusercontent.com/87272337/181875781-4c42443e-c174-44fc-a7da-77d22304da9d.png)

Question about GFLOPs when using SwinTransformer

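Hello. Following the SwinTransformer source code, I replaced the entire backbone of YOLOv5 v6.1 with the SwinTransformer structure. The half-precision training error I ran into went away after I added the '--nohalf' argument that I saw in your train file. However, my GFLOPs are very large: yolov5swin_tiny with depth_multiple: 0.33 and width_multiple: 0.75 already comes to 409.9 GFLOPs, and yolov5swin_base with depth_multiple: 1 and width_multiple: 1 reaches an absurd 1307 GFLOPs. Training is very slow even after loading and freezing the pretrained weights. Have you run into excessive GFLOPs, and is pruning needed when they are this large? (I have not worked with pruning before, so any advice is appreciated.) Thanks.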
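As a rough way to see where figures like these can come from, the sketch below evaluates the window-attention complexity estimate from the Swin Transformer paper, 4hwC² + 2M²hwC, for a single layer. The function name and the example sizes are illustrative assumptions, not values taken from the model above, and the MLP and patch-merging layers are not counted.

```python
def w_msa_flops(h, w, c, window):
    """Approximate cost of one window-attention layer (Swin paper estimate).

    h, w   : feature-map height and width in tokens
    c      : channel dimension
    window : window side length M
    """
    projections = 4 * h * w * c * c               # QKV and output projections
    attention = 2 * window * window * h * w * c   # attention inside each window
    return projections + attention


# Hypothetical comparison: the same layer on a 28x28 and an 80x80 token map
# (for example stride-8 features of a 224 and a 640 input), 192 channels.
small = w_msa_flops(28, 28, 192, 7)
large = w_msa_flops(80, 80, 192, 7)
ratio = large / small  # equals the token-count ratio (80 * 80) / (28 * 28)
```

Under this estimate the cost is linear in the number of tokens and quadratic in the channel width, so both a 640 input and a larger width_multiple raise it quickly.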

Error in the debug version

![image](https://user-images.githubusercontent.com/87272337/180907108-2a31a204-a7dc-437a-bc72-b71e27f2563d.png) I don't know where the error is.

warning: missing return statement at end of non-void function "compare_vertices"

Everything runs fine locally, but on the server I get this message: warning: missing return statement at end of non-void function "compare_vertices". The install still completes, but at runtime every value is NaN.

Can you load the pretrained weights with window size 16?

Modified from the source code. (When I load swinv2_tiny_patch4_window8_256.pth and use window size 8, the code runs normally; but when I load swinv2_tiny_patch4_window16_256.pth and use window size 16, the weights fail to match on loading. I don't know how to handle this; please advise. The error is as follows:) RuntimeError: Error(s) in loading state_dict for Model: size mismatch for model.7.blocks.0.attn.relative_coords_table: copying a param with shape torch.Size([1, 15, 15, 2]) from checkpoint, the shape in current model...
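The [1, 15, 15, 2] in the message matches a table side of 2M - 1 with M = 8. Below is a minimal sketch of that arithmetic, assuming the model follows the official Swin V2 code, where relative_coords_table has shape (1, 2M-1, 2M-1, 2) and a block shrinks its window to the feature-map size when the feature map is not larger than the window. The helper names and the example resolutions are hypothetical.

```python
def effective_window(input_resolution, window_size):
    """Window size a block actually uses (clamped to the feature map)."""
    return min(min(input_resolution), window_size)


def coords_table_shape(input_resolution, window_size):
    """Expected shape of relative_coords_table for a square window."""
    m = effective_window(input_resolution, window_size)
    side = 2 * m - 1
    return (1, side, side, 2)


# Assumed example: an 8x8 feature map with a requested window of 16
# is clamped to M = 8, while a 20x20 map keeps M = 16.
clamped = coords_table_shape((8, 8), 16)
unclamped = coords_table_shape((20, 20), 16)
```

If the last stage of the 256-pixel checkpoint ran on an 8x8 map, its table would be 15x15 even in the window-16 weights, while a larger detection input would keep M = 16 and expect 31x31. That is one possible source of the mismatch, not a confirmed diagnosis.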

Size mismatch and weight mismatch when loading swinv2_tiny_patch4_window16_256.pth into Swin-v2

When I load swinv2_tiny_patch4_window8_256.pth and use window size 8, the code runs normally. However, when I load swinv2_tiny_patch4_window16_256.pth and use window size 16, the import weights do not...

With the long-edge definition, why does rbox2poly compute the corner coordinates counter-clockwise?

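rbox2poly computes the corner coordinates counter-clockwise ![image](https://user-images.githubusercontent.com/87272337/189789264-c625f320-7ef4-4a41-b799-a9d381a23b77.png) but in the rotation definitions written up by Yang Xue (杨学), the long-edge method is mostly used clockwise ![image](https://user-images.githubusercontent.com/87272337/189789391-815a6868-96f3-4bab-b6d5-cfe5ca9ba3e3.png) After I changed it to clockwise, training and validation were fine, reaching up to 77, but the test submission only scored 47. I don't know why.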
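One check that may help here is to make the vertex order and the angle sign explicit. The sketch below is not the repository's rbox2poly; rbox_to_poly and signed_area are hypothetical helpers that build the four corners of a (cx, cy, w, h, theta) box and report the winding with the shoelace formula.

```python
import math


def rbox_to_poly(cx, cy, w, h, theta):
    """Corners of a rotated box; theta in radians, measured from the x axis."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Half-extent vectors along the box's width and height directions.
    wx, wy = (w / 2) * cos_t, (w / 2) * sin_t
    hx, hy = -(h / 2) * sin_t, (h / 2) * cos_t
    return [
        (cx + wx + hx, cy + wy + hy),
        (cx - wx + hx, cy - wy + hy),
        (cx - wx - hx, cy - wy - hy),
        (cx + wx - hx, cy + wy - hy),
    ]


def signed_area(poly):
    """Shoelace formula: positive for counter-clockwise order in a y-up frame."""
    total = 0.0
    for i, (x0, y0) in enumerate(poly):
        x1, y1 = poly[(i + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return total / 2


poly = rbox_to_poly(0.0, 0.0, 4.0, 2.0, 0.0)
area = signed_area(poly)
```

A positive signed area means counter-clockwise order in a y-up frame, which appears clockwise on an image whose y axis points down. Flipping the sign of theta produces a different (mirrored) box rather than reordering the corners, so the conversion used for training labels and the one used when writing test submissions need to agree.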

Help needed: CUDA out of memory

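When I compute rotated IoU with cal_iou, broadcasting makes B and N very large and memory blows up. This is on an A100, and lowering the batch size does not help. Is there a way to solve this? ![image](https://user-images.githubusercontent.com/87272337/185752203-cafce912-0fbd-4afd-a3c7-82bccaaedee3.png)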
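To get a feel for why broadcasting can exhaust memory, the sketch below estimates the size of a pairwise intermediate and shows one way to split the work into slices. The (B, N, M, K) layout, the number of values kept per box pair, and both helper names are assumptions for illustration; they are not taken from cal_iou.

```python
def broadcast_bytes(b, n, m, values_per_pair, bytes_per_value=4):
    """Size of a dense (b, n, m, values_per_pair) float32 intermediate."""
    return b * n * m * values_per_pair * bytes_per_value


def chunk_ranges(total, chunk):
    """Yield (start, stop) index pairs covering range(total) in slices."""
    for start in range(0, total, chunk):
        yield start, min(start + chunk, total)


# Hypothetical sizes: 16 images, 8400 x 8400 box pairs, 8 values per pair.
full = broadcast_bytes(16, 8400, 8400, 8)
per_chunk = broadcast_bytes(16, 512, 8400, 8)
slices = list(chunk_ranges(8400, 512))
```

Lowering the batch size only shrinks B. If the pairwise N x M term dominates, computing the IoU one slice of N at a time and concatenating the results may keep the peak allocation near per_chunk instead of full.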

Question about the KLD loss

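Hello, and thank you for sharing your work. I have a few questions from reading the source code. The KLD loss plot you provided, shown below, shows the classification loss staying at 0 while the angle loss keeps decreasing. ![image](https://user-images.githubusercontent.com/87272337/170900444-d7a36b70-88d6-40dd-98be-b58c2cb122b4.png) However, reading compute_loss_KLD, langle is initialized to 0, ![image](https://user-images.githubusercontent.com/87272337/170900499-50a0fc6b-8237-4fc2-b0c3-cad6a4212b9e.png) and it then takes part in no other computation before the final loss calculation, langle *= h['angle'] *S, so langle = 0. In other words, langle is always 0, and the angle only enters through the IoU computation in the box loss, as shown below. ![image](https://user-images.githubusercontent.com/87272337/170900903-5274fd12-0cee-4152-b328-1680d35261bd.png) I tried training on the DOTA dataset, and the results were poor. I would like to know what format your dataset labels are in: the 4 corner points of the poly, or the angle-bearing coordinates of the minimum-area bounding rectangle that OpenCV computes from the poly? Also, where is the code that converts the OpenCV definition to the long-edge definition? Thanks. PS: with CSL, after testing all 16 classes, I got ship: 0.8941434649109584, but the results with KLD are currently poor.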

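In the loss function, after simOTA, for the positive anchors selected on the 3 feature maps and their matched GTs, the value of GTxy - grid is not within [-0.5, 1.5] but is very large. Where is the problem?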

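I found that the GTs matched to the positive anchors chosen by minimizing the cost are not restricted to [-0.5, 1.5] as they are in yolov5. But the find_3_positive function looks for the 3 positive anchors near each GT, so how can the later steps end up with a far-away box predicting the current GT? ![image](https://user-images.githubusercontent.com/87272337/186642449-bf0b2790-3d75-4ac2-b4c4-58a2e2f15f5f.png) ![image](https://user-images.githubusercontent.com/87272337/186642711-f037a5b9-559d-4097-a3a3-efe38a39b4ff.png) In the figure, aaaaaa is a value I added: the difference between the GT box (in feature-map coordinates) and the top-left x, y of the positive sample. Surprisingly, it is not within [-0.5, 1.5]. The figure below shows the same thing at selected_tbox[:, :2] -= grid: pxy = ps[:, :2].sigmoid() * 2. - 0.5 has the range [-0.5, 1.5], while selected_tbox is very large. ![image](https://user-images.githubusercontent.com/87272337/186647574-1491f942-fffa-45c7-b41c-6b59faf441c3.png)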
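For reference, the expected range can be reproduced with a simplified, pure-Python rendering of the YOLOv5-style neighbor selection for a single GT center. This is a sketch under that assumption, not the find_3_positive implementation itself; candidate_cells and target_offsets are hypothetical names.

```python
import math


def candidate_cells(gx, gy, grid_w, grid_h, g=0.5):
    """Cell containing the GT center plus up to two adjacent cells."""
    cx, cy = math.floor(gx), math.floor(gy)
    cells = [(cx, cy)]
    if gx % 1.0 < g and gx > 1:                        # center in left half
        cells.append((cx - 1, cy))
    if gy % 1.0 < g and gy > 1:                        # center in upper half
        cells.append((cx, cy - 1))
    if (grid_w - gx) % 1.0 < g and grid_w - gx > 1:    # center in right half
        cells.append((cx + 1, cy))
    if (grid_h - gy) % 1.0 < g and grid_h - gy > 1:    # center in lower half
        cells.append((cx, cy + 1))
    return cells


def target_offsets(gx, gy, cells):
    """GT center minus each cell's top-left corner (the regression target)."""
    return [(gx - cx, gy - cy) for cx, cy in cells]


cells = candidate_cells(3.2, 4.7, 20, 20)
offsets = target_offsets(3.2, 4.7, cells)
```

With cells chosen this way, every offset stays inside [-0.5, 1.5], which matches the range of sigmoid() * 2 - 0.5. Running a check like this on a few matched pairs may help narrow down whether the GT rows and grid rows still line up one-to-one after the cost-based matching.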