CenterNet-plus

Using Grayscale (1 channel) images?

Open YashRunwal opened this issue 2 years ago • 5 comments

Hello,

great work! Thanks for sharing this with the community.

I would also like to use a ResNet-18 backbone with the CenterNet architecture. However, my dataset consists of grayscale images of shape [512, 1536]. So my questions are:

  1. Can I use grayscale images for training?
  2. Apart from the in_channels of the backbone's first layer, what else do I need to change?

Thank You.

YashRunwal avatar Jul 12 '21 12:07 YashRunwal

Yes, of course you can. You can use OpenCV to convert your grayscale images into RGB-style images (with 3 channels).
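A minimal sketch of that conversion, assuming an 8-bit image of shape [512, 1536] as in the question. The array here is random placeholder data, and NumPy is used so the snippet has no extra dependencies; `cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)` should give the same result.

```python
import numpy as np

# Placeholder grayscale image (H=512, W=1536); real data would be loaded from disk.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(512, 1536), dtype=np.uint8)

# Replicate the single channel three times along a new last axis: (H, W) -> (H, W, 3).
rgb_like = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

print(rgb_like.shape, rgb_like.dtype)
```

All three channels are identical copies, so this adds no information; it only makes the input shape match a 3-channel backbone.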

yjh0410 avatar Jul 12 '21 13:07 yjh0410

@YashRunwal @yjh0410 I recommend reading this issue on processing grayscale images.

developer0hye avatar Jul 12 '21 13:07 developer0hye

Wow, that was a quick reply @yjh0410 and @developer0hye :) Umm, No, I don't want to convert my Grayscale image to RGB. I want to use the 1 channel images for training.

For the pre-trained backbone, I can sum the weights of the first layer over the input-channel dimension (dim=1), so that no pre-trained information is discarded, and then I think I can just set the first layer's in_channels to 1, like below:

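# Collapse the pre-trained 3-channel first conv to a single input channel
conv1 = model.backbone.body.conv1
conv1.in_channels = 1
# Sum the kernels over the input-channel axis: [out, 3, k, k] -> [out, 1, k, k]
conv1.weight.data = conv1.weight.data.sum(dim=1, keepdim=True)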

Note: This is just an example and the model used here is the pre-trained Faster RCNN from torchvision.
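The idea above can be checked in isolation. The following is only a sketch: it uses a randomly initialised stand-in for a ResNet-style first convolution (3 -> 64 channels, 7x7, stride 2) rather than real pre-trained weights, and `to_single_channel` is a hypothetical helper name, not part of CenterNet-plus or torchvision. It assumes an ordinary convolution with groups=1.

```python
import torch
import torch.nn as nn


def to_single_channel(conv: nn.Conv2d) -> nn.Conv2d:
    """Build a 1-input-channel copy of `conv` (hypothetical helper, assumes groups=1).

    The new kernel is the sum of the original kernel over its input channels.
    """
    new_conv = nn.Conv2d(
        1,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # [out, 3, k, k] -> [out, 1, k, k]
        new_conv.weight.copy_(conv.weight.sum(dim=1, keepdim=True))
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


torch.manual_seed(0)
# Stand-in for a ResNet-style stem; real use would take the pre-trained layer instead.
rgb_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
gray_conv = to_single_channel(rgb_conv)

# Small grayscale batch with the same 1:3 aspect ratio as the [512, 1536] images.
gray = torch.rand(2, 1, 64, 192)
with torch.no_grad():
    out_rgb = rgb_conv(gray.repeat(1, 3, 1, 1))  # gray copied into 3 channels
    out_gray = gray_conv(gray)                   # summed-weight 1-channel conv

max_abs_diff = (out_rgb - out_gray).abs().max().item()
print(tuple(out_gray.shape), max_abs_diff)
```

Up to floating-point error, the two outputs agree, so the summed-weight layer behaves like the original layer fed a grayscale image copied into three channels. This holds only while the three copies are identical, so any per-channel input normalization would have to be replaced by a single-channel one.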

My question is, does this make sense for CenterNet? :)

YashRunwal avatar Jul 12 '21 13:07 YashRunwal

Sorry ~

I am not sure whether it will work. I have never tried the method you described.

yjh0410 avatar Jul 12 '21 15:07 yjh0410

@yjh0410 No problem. I will try it out and post the results here, so please don't close this thread; it might be helpful for someone else.

Also,

  1. I have modified the ResNet-18 backbone to suit my needs: it concatenates 2 images at a certain layer. Next, I want to use dilated convolutions, write a decoder, and then make the predictions.

I will post the results here but would need your help from time to time, if possible.

YashRunwal avatar Jul 13 '21 08:07 YashRunwal