latent-diffusion

How to fine-tune on images with only one channel?

Open aleksmirosh opened this issue 2 years ago • 6 comments

I set inchannels and outchannels to 1 in the config and converted the weights from 2 dim to 1 dim, but the prediction results are now always the same.

Is there a way to convert pre-trained weights from 3 channels to 1 channel?

aleksmirosh, Feb 24 '23 03:02

It might be easier to keep the pretrained weights and channel count and, instead of inputting single-channel greyscale, duplicate the greyscale values across 3 channels. You can then fine-tune from the pretrained weights without modifying the model.
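
A minimal sketch of that duplication step, assuming the greyscale image is already a NumPy array of shape (H, W) or (H, W, 1). The function name `grey_to_rgb` is hypothetical and not part of any repository mentioned here.

```python
import numpy as np


def grey_to_rgb(arr: np.ndarray) -> np.ndarray:
    """Copy a single-channel image into 3 identical channels (hypothetical helper)."""
    if arr.ndim == 2:
        # (H, W) -> (H, W, 1) so the channel axis exists
        arr = arr[..., np.newaxis]
    if arr.ndim != 3 or arr.shape[-1] != 1:
        raise ValueError(f"expected a single-channel image, got shape {arr.shape}")
    # Repeat the one channel three times along the last axis: (H, W, 1) -> (H, W, 3)
    return np.repeat(arr, 3, axis=-1)


# Toy 3x4 greyscale image, used only for illustration
grey = np.arange(12, dtype=np.uint8).reshape(3, 4)
rgb = grey_to_rgb(grey)
```

The result can be fed to the unmodified 3-channel model, at the cost of processing three redundant channels.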

akristoffersen, Mar 16 '23 20:03

Thank you for the advice. I have already run that kind of training, but I hoped that using 1 channel would speed up training.

aleksmirosh, Mar 20 '23 16:03

To train the model on 1 channel, go to the image_datasets.py file and comment out line 97: arr = np.array(pil_image.convert("RGB")). This line forces the input to be RGB.

Then add these two lines after the line you commented out:

    arr = np.array(pil_image)
    arr = arr.reshape((self.resolution, self.resolution, 1))

Note that you won't be able to use the pretrained weights for this, but it should train faster.
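
A hedged sketch of the same reshape with a shape check added, since a plain reshape raises a less obvious error if the image is not already greyscale and exactly resolution x resolution. The helper name `to_single_channel` is hypothetical; it assumes the array came from something like np.array(pil_image) on an image that was already cropped or resized. If the file is stored as RGB, calling pil_image.convert("L") in Pillow first would give a single-channel image.

```python
import numpy as np


def to_single_channel(arr, resolution: int) -> np.ndarray:
    """Add an explicit channel axis to a square greyscale image (hypothetical helper)."""
    arr = np.asarray(arr)
    if arr.shape != (resolution, resolution):
        # Catches RGB input (H, W, 3) and images that were not resized
        raise ValueError(
            f"expected shape ({resolution}, {resolution}), got {arr.shape}"
        )
    # (H, W) -> (H, W, 1), matching the reshape suggested above
    return arr.reshape((resolution, resolution, 1))


# Toy 64x64 greyscale image, used only for illustration
image = np.zeros((64, 64), dtype=np.uint8)
single = to_single_channel(image, resolution=64)
```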

tobi-ore, Jan 10 '24 18:01

What happens if I have images with 4 or 5 channels?

eapolo, May 08 '24 14:05

It’s similar. I have trained a 5-channel diffusion model. Don’t load the data with Pillow; use NumPy arrays or any file format that allows more than 3 channels. Remember to change the number of input channels in the Unet as well.
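
A rough sketch of loading such data with NumPy, assuming each sample is stored as a .npy array of shape (H, W, C). The helper name `load_multichannel` and the channels-first output layout are assumptions for illustration, not something taken from the repository. An in-memory buffer stands in for a file on disk.

```python
import io

import numpy as np


def load_multichannel(file, expected_channels: int) -> np.ndarray:
    """Load an (H, W, C) array and return it as float32 in (C, H, W) order (hypothetical helper)."""
    arr = np.load(file)
    if arr.ndim != 3 or arr.shape[-1] != expected_channels:
        raise ValueError(
            f"expected (H, W, {expected_channels}), got shape {arr.shape}"
        )
    # Assumption: the model expects channels first, as PyTorch conv layers do
    return np.transpose(arr, (2, 0, 1)).astype(np.float32)


# Write a toy 5-channel sample to an in-memory buffer, then read it back
buffer = io.BytesIO()
np.save(buffer, np.zeros((8, 8, 5), dtype=np.float32))
buffer.seek(0)
sample = load_multichannel(buffer, expected_channels=5)
```

The channel count checked here has to match whatever input channel setting the Unet is configured with.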

tobi-ore, May 08 '24 15:05