TransUNet > > Hello, it seems that the code currently only works on grayscale images. I am interested in processing images with 3 channels (RGB). Has anyone already modified the code accordingly? What do I have to pay attention to?

Open lgc-china opened this issue 2 years ago • 3 comments

Hello, it seems that the code currently only works on grayscale images. I am interested in processing images with 3 channels (RGB). Has anyone already modified the code accordingly? What do I have to pay attention to?

@andife Hello, this repo also supports RGB images with 3 channels.

The network natively supports 3-channel input (see lines 386-387 in vit_seg_modeling.py): if x.size()[1] == 1: x = x.repeat(1,3,1,1)

@Beckschen I'm trying to use this model for RGB images. I removed the random rotations (they seemed buggy for RGB images), and instead now get an error on the lines you have mentioned (386-387 in vit_seg_modeling.py). The error is as follows: RuntimeError: Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor

Originally posted by @aneeshgupta42 in https://github.com/Beckschen/TransUNet/issues/31#issuecomment-825068576
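
For anyone hitting the same RuntimeError: PyTorch raises it when `repeat` receives fewer arguments than the tensor has dimensions, so the input reaching that line must have had more than four dimensions. The sketch below reproduces this under an assumed shape of (B, 1, H, W, 3), which could arise if an H x W x 3 image gets a channel axis prepended by the data loader. That shape and the helper name `repeat_error_message` are assumptions for illustration, not taken from the repository.

```python
import torch


def repeat_error_message(x):
    """Return the RuntimeError text raised by x.repeat(1, 3, 1, 1), or None if it succeeds."""
    try:
        x.repeat(1, 3, 1, 1)
    except RuntimeError as err:
        return str(err)
    return None


# Grayscale batch (B, 1, H, W): four dims, four repeat arguments, so repeat works.
gray_batch = torch.zeros(2, 1, 8, 8)

# Hypothetical RGB batch with a leftover singleton axis: (B, 1, H, W, 3).
# size()[1] is still 1, so the grayscale branch is taken, but the tensor
# now has five dims while repeat only receives four arguments.
rgb_5d = torch.zeros(2, 1, 8, 8, 3)

print(repeat_error_message(gray_batch))
print(repeat_error_message(rgb_5d))
```

Printing `x.shape` just before the `repeat` call in vit_seg_modeling.py should show whether this is what happens in a given setup.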

Mar 30 '22 15:03 lgc-china

Can you tell me where you got the dataset? There are many things I don't understand about the data processing. I don't know which dataset to download from this website: https://www.synapse.org/#!Synapse:syn3193805/files/

Mar 30 '22 15:03 lgc-china

Hello, I had the same problem when running test.py. Did you solve it? @lgc-china

Sep 09 '22 00:09 PatrickWilliams44

Maybe RGB images don't need to be repeated along the channel dimension, because they already have three channels?

Sep 19 '23 11:09 xinnvY
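
Following up on the point above, a minimal sketch of how the input could be normalized to (B, 3, H, W) before it reaches the model, so that only true single-channel inputs are repeated. The helper name `to_three_channel_batch` and the channels-last input layouts it accepts are assumptions for illustration; the repository itself only contains the `repeat(1,3,1,1)` branch quoted earlier.

```python
import torch


def to_three_channel_batch(x: torch.Tensor) -> torch.Tensor:
    """Return x as a (B, 3, H, W) tensor (hypothetical helper, not part of TransUNet)."""
    # Assumed case: a singleton axis was prepended to a channels-last RGB image,
    # giving (B, 1, H, W, 3). Drop that axis first.
    if x.dim() == 5 and x.size(1) == 1:
        x = x.squeeze(1)                      # -> (B, H, W, 3)

    # Channels-last RGB (B, H, W, 3) -> channels-first (B, 3, H, W).
    # Ambiguous if H is 1 or 3; such inputs are treated as channels-first here.
    if x.dim() == 4 and x.size(1) not in (1, 3) and x.size(-1) == 3:
        x = x.permute(0, 3, 1, 2)

    if x.dim() != 4:
        raise ValueError(f"expected a 4-D batch, got shape {tuple(x.shape)}")

    # Same idea as the repository's branch: replicate a single gray channel 3 times.
    if x.size(1) == 1:
        x = x.repeat(1, 3, 1, 1)

    if x.size(1) != 3:
        raise ValueError(f"expected 1 or 3 channels, got {x.size(1)}")
    return x.contiguous()


print(to_three_channel_batch(torch.zeros(2, 1, 8, 8)).shape)
print(to_three_channel_batch(torch.zeros(2, 8, 8, 3)).shape)
```

Inputs that are already (B, 3, H, W) pass through unchanged, which matches the observation that RGB images need no repeat. Any augmentation in the data pipeline that assumes 2-D arrays would still need to be checked separately.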
