
Welcome to SuperYOLO Discussions!

Open icey-zhang opened this issue 1 year ago • 13 comments

Discussed in https://github.com/icey-zhang/SuperYOLO/discussions/117

Originally posted by icey-zhang June 9, 2024

👋 Welcome!

We’re using Discussions as a place to connect with other members of our community. We hope that you:

  • Ask questions you’re wondering about.
  • Share ideas.
  • Engage with other community members.
  • Welcome others and are open-minded. Remember that this is a community we build together 💪.

To get started, comment below with an introduction of yourself and tell us about what you do with this community.

icey-zhang avatar Jun 08 '24 17:06 icey-zhang

Which Python version did you use?

jimvanoosten avatar Jun 13 '24 10:06 jimvanoosten

We run our code with Python 3.7, 3.8, and 3.9.

icey-zhang avatar Jun 19 '24 07:06 icey-zhang

Hello, I would like to ask why training from scratch does not reach your reported 80.9% performance. What should I do to reach that level? Also, why is the image size passed into MF during testing set to 544?

jacksonwu09 avatar Jun 27 '24 08:06 jacksonwu09

Hello, thanks for sharing the code. I got the following error:

AttributeError: module 'numpy' has no attribute 'int'. np.int was a deprecated alias for the builtin int. To avoid this error in existing code, use int by itself. Doing this will not modify any behavior and is safe. When replacing np.int, you may wish to use e.g. np.int64 or np.int32 to specify the precision. If you wish to review your current use, check the release note link for additional information. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

What is the best way to solve this problem? Should I change the environment? Can anybody give me the right environment versions for the libraries, or should I change np.int to np.int32?

Nadeen86 avatar Jul 07 '24 09:07 Nadeen86
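As a general note (not an answer from the thread): np.int was removed in NumPy 1.24, so the usual workarounds are either pinning NumPy below 1.24 or replacing np.int at the call sites. A minimal sketch of the replacement, assuming the error comes from dtype/astype uses somewhere in the repo:

# Option 1: pin NumPy to a version that still has the deprecated alias:
#   pip install "numpy<1.24"

# Option 2: replace np.int where it appears. It was a plain alias for the
# builtin int, so this substitution is behavior-preserving:
import numpy as np

a = np.zeros(5, dtype=int)              # instead of dtype=np.int
b = np.floor([1.5, 2.5]).astype(int)    # instead of .astype(np.int)
# Use np.int32 or np.int64 only if a specific width is actually required.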

Hello, thanks for sharing the code. I have a question about it: during the training phase, the --hr_input parameter does not seem to be used in the create_dataloader() method. Where is this parameter actually applied? Second question: if my data consists of non-square images, how should I adjust the parameters?

renfeiy avatar Jul 09 '24 07:07 renfeiy

Is mid-level fusion ready to use? I ran the code but got errors in the mid-level fusion files.

Nadeen86 avatar Jul 13 '24 11:07 Nadeen86

I met a problem in the file SuperYOLO/utils/datasets.py, with the function img2label_paths:

def img2label_paths(img_paths):
    sa, sb = os.sep + 'images' + os.sep, os.sep + 'labels' + os.sep  # /images/, /labels/ substrings
    return [x.replace(sa, sb, 1).replace('.' + x.split('.')[-1], '.txt') for x in img_paths]

On the Windows platform, however, this function does not work. I have been debugging for a long time... Maybe it works on Linux.

corkiyao avatar Aug 03 '24 07:08 corkiyao
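A sketch of a more Windows-tolerant variant (an assumption, not a fix from the repo): normalize the separators before substituting, so image lists written with forward slashes still match os.sep + 'images' + os.sep on Windows:

import os

def img2label_paths(img_paths):
    # Normalize each path to the platform separator first, so entries like
    # './dataset/VEDAI/images/x.png' read from a .txt list still contain
    # os.sep + 'images' + os.sep on Windows.
    sa, sb = os.sep + 'images' + os.sep, os.sep + 'labels' + os.sep
    paths = [os.path.normpath(x) for x in img_paths]
    return [os.path.splitext(p.replace(sa, sb, 1))[0] + '.txt' for p in paths]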

About the dataset used in SuperYOLO:
https://github.com/icey-zhang/GHOST/blob/main/data/transform_dota.py
https://github.com/icey-zhang/GHOST/blob/main/data/transform_dior.py
https://github.com/icey-zhang/GHOST/blob/main/data/transform_nwpu.py

icey-zhang avatar Aug 21 '24 03:08 icey-zhang

"SuperYOLO-main\dataset\VEDAI_1024\images.cache. Can not train without labels." Has anyone found a solution to this problem? The data path in the YAML file has already been modified.

wang-yt0801 avatar Sep 03 '24 04:09 wang-yt0801

Hello, your results are impressive. May I ask if the input data must be square? My current image size is 640x512.

ca1wenha0 avatar Sep 12 '24 08:09 ca1wenha0
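Neither non-square question gets a direct answer in the thread. For reference, YOLO-family pipelines usually handle rectangular inputs by letterbox padding rather than requiring square data; a minimal sketch of that idea (an assumption about the general convention, not SuperYOLO's exact code):

import cv2
import numpy as np

def letterbox_to_square(img, size=640, pad_value=114):
    # Resize so the longer side equals `size`, then pad the shorter side,
    # preserving the aspect ratio (e.g. 640x512 -> 640x640 with padding).
    h, w = img.shape[:2]
    r = size / max(h, w)
    resized = cv2.resize(img, (round(w * r), round(h * r)))
    out = np.full((size, size, img.shape[2]), pad_value, dtype=img.dtype)
    out[:resized.shape[0], :resized.shape[1]] = resized
    return out, r  # keep r to map predictions back to original coordinates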

My config file is:

train: ./dataset/VEDAI/fold01_write.txt
test: ./dataset/VEDAI/fold01test_write.txt
val: ./dataset/VEDAI/fold01test_write.txt

but training fails because the label files cannot be found: AssertionError: train: No labels in dataset\VEDAI_1024\images.cache. Can not train without labels. Why? I followed your code exactly.

123cjl123 avatar Oct 11 '24 10:10 123cjl123

(Replying to the AssertionError: No labels question above.) After debugging, I found that a path is loaded incorrectly in there. Step through it yourself and you will see.

corkiyao avatar Oct 11 '24 11:10 corkiyao

(Replying to the same missing-label question.) Hello, has this problem been solved?

1358028281 avatar Nov 23 '24 12:11 1358028281

I get WARNING: Dataset not found, nonexistent paths: ['E:\home\data\zhangjiaqing\dataset\VEDAI\fold01test_write.txt'] from train.py, but this path does not appear anywhere in train.py.

caijincc avatar Dec 07 '24 03:12 caijincc

(Quoting the same AssertionError: No labels question.) I ran into this problem too.

haattrick avatar Dec 09 '24 15:12 haattrick

For the dataset-loading problem, please refer to https://github.com/icey-zhang/SuperYOLO/issues/45.

icey-zhang avatar Dec 10 '24 04:12 icey-zhang
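For readers hitting the No labels error: it usually means the label .txt files are not where the images/ -> labels/ path substitution expects them. A quick sanity check along those lines (a sketch assuming the YOLO-style layout used by utils/datasets.py; check_pairs is a hypothetical helper, not part of the repo):

import os

def check_pairs(list_file):
    # For every image path in the dataset list, derive the label path by the
    # images/ -> labels/ convention and report whether it exists on disk.
    with open(list_file) as f:
        for img in (line.strip() for line in f if line.strip()):
            img = os.path.normpath(img)
            label = os.path.splitext(
                img.replace(os.sep + 'images' + os.sep,
                            os.sep + 'labels' + os.sep, 1))[0] + '.txt'
            print(img, '->', label, 'OK' if os.path.isfile(label) else 'MISSING')

check_pairs('./dataset/VEDAI/fold01_write.txt')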

I have not solved it and I don't know why.

123cjl123 avatar Dec 19 '24 02:12 123cjl123

class MF(nn.Module):  # stereo attention block
    def __init__(self, channels):
        super(MF, self).__init__()
        self.mask_map_r = nn.Conv2d(channels, 1, 1, 1, 0, bias=True)
        self.mask_map_i = nn.Conv2d(1, 1, 1, 1, 0, bias=True)
        self.softmax = nn.Softmax(-1)
        self.bottleneck1 = nn.Conv2d(1, 16, 3, 1, 1, bias=False)
        self.bottleneck2 = nn.Conv2d(channels, 48, 3, 1, 1, bias=False)
        self.se = SE_Block(64, 16)
        # self.se_r = SE_Block(3, 3)
        # self.se_i = SE_Block(1, 1)

    def forward(self, x):  # x = (RGB, IR), each B * C * H * W
        x_left_ori, x_right_ori = x[0], x[1]
        # x_left = self.se_r(x_left_ori)
        # x_right = self.se_i(x_right_ori)
        x_left = x_left_ori * 0.5
        x_right = x_right_ori * 0.5

        x_mask_left = torch.mul(self.mask_map_r(x_left).repeat(1, 3, 1, 1), x_left)
        x_mask_right = torch.mul(self.mask_map_i(x_right), x_right)

        out_IR = self.bottleneck1(x_mask_right + x_right_ori)
        out_RGB = self.bottleneck2(x_mask_left + x_left_ori)  # RGB branch
        out = self.se(torch.cat([out_RGB, out_IR], 1))
        return out

Hello, about the definition of the MF class in common.py: why was the SE step (as described in the paper) replaced in the code with a plain multiplication by 0.5? In my own tests, multiplying by 0.5 actually performs better than SE. This is the training command: python train.py --cfg models/SRyolo_MF.yaml --super --train_img_size 1024 --hr_input --data data/SRvedai.yaml --ch 64 --input_mode RGB+IR+MF --batch-size 2

Test results of the released code: [image] Test results with SE: [image] Training parameters were identical for both runs.

mhynbnb avatar Dec 31 '24 04:12 mhynbnb
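For context, SE_Block itself is not shown in this thread. A minimal squeeze-and-excitation block consistent with the SE_Block(64, 16) call above might look like the following (a sketch assuming the second argument is the channel-reduction ratio; the repo's actual implementation may differ):

import torch
import torch.nn as nn

class SE_Block(nn.Module):
    # Squeeze-and-excitation: pool to per-channel statistics, squeeze by
    # `reduction`, then re-weight the channels with a sigmoid gate.
    def __init__(self, channels, reduction):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w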

(Quoting the AttributeError: module 'numpy' has no attribute 'int' question above.)

Did you solve the problem? I ran into the same one.

LennoxWei avatar Jan 05 '25 08:01 LennoxWei

(Quoting the MF class and the multiply-by-0.5 question above.)

May I ask where in the code the change behind the two test results was made?

models/common.py, lines 197-202.

mhynbnb avatar Mar 04 '25 03:03 mhynbnb

(Quoting the same MF question and the follow-up asking where the change was made.)

The commented-out part of the code matches the structure described in the original paper; the author changed it to a plain multiplication by 0.5.

mhynbnb avatar Mar 04 '25 03:03 mhynbnb

(Quoting the MF discussion above.)

Hello, may I ask one more thing? I have searched for a long time but cannot find where the author actually calls the MF module to fuse the images. Did you find it?

Thomas-403 avatar Mar 07 '25 03:03 Thomas-403

(Quoting the question about where MF is called.)

In the backbone of the models/SRyolo_MF.yaml file.

mhynbnb avatar Mar 07 '25 03:03 mhynbnb
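For readers unfamiliar with the YOLOv5-style configs this repo builds on: the backbone list in models/SRyolo_MF.yaml names each layer by its class in common.py, and the model parser instantiates modules from those names, which is how MF enters the network. A hypothetical excerpt to show the shape of such an entry (illustrative only, not the actual file contents):

backbone:
  # [from, number, module, args]
  [[-1, 1, MF, [3]],          # fuse the (RGB, IR) input pair with the MF block
   [-1, 1, Focus, [64, 3]],   # then continue with the usual YOLOv5-style stem
  ]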

Regarding Table V in the paper: with the SRyolo_MF.yaml you provide, I cannot reach the best results. How should I modify the file to reach the top performance reported in Table V?

xilixilixilp avatar Aug 18 '25 04:08 xilixilixilp