latent-diffusion
How to fine-tune on images with only one channel?
I set inchannels and outchannels to 1 in the config and converted the weights from 2 dim to 1 dim, but the predictions are now always the same.
Is there a way to convert the pre-trained weights from 3 channels to 1 channel?
It might be easier to keep the pretrained weights and channel count and, instead of feeding in single-channel greyscale, duplicate the greyscale values across 3 channels. Then you can fine-tune from the pretrained weights without modifying the model.
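For anyone following this suggestion, here is a minimal sketch of the channel duplication. It assumes the greyscale image is already a 2-D NumPy array of shape (H, W); `grey_to_rgb` is a hypothetical helper name, not something from the repository.

```python
import numpy as np


def grey_to_rgb(grey: np.ndarray) -> np.ndarray:
    """Copy a (H, W) greyscale array into three identical channels -> (H, W, 3).

    Hypothetical helper for illustration; not part of the repository.
    """
    if grey.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {grey.shape}")
    # Add a trailing channel axis, then repeat it three times along that axis.
    return np.repeat(grey[:, :, np.newaxis], 3, axis=2)


# Small synthetic example: a 3x4 "image".
grey = np.arange(12, dtype=np.uint8).reshape(3, 4)
rgb = grey_to_rgb(grey)
```

All three channels hold the same values, so the model sees a valid 3-channel input while the data stays greyscale.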
Thank you for the advice. I have already run that kind of training, but I was hoping that using 1 channel would speed up training.
To train the model on 1 channel, go to the image_datasets.py file and comment out line 97: arr = np.array(pil_image.convert("RGB")). This line forces the input to be RGB.
Then add these two lines after the line you commented out:
arr = np.array(pil_image)
arr = arr.reshape((self.resolution, self.resolution, 1))
Note that you won't be able to use the pretrained weights for this, but it should train faster.
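As a rough sketch of what that reshape expects: the array has to be exactly (resolution, resolution) already, otherwise the reshape raises. `to_single_channel` is a hypothetical stand-alone helper written for illustration, and it works on a plain NumPy array rather than on the dataset class itself.

```python
import numpy as np


def to_single_channel(arr: np.ndarray, resolution: int) -> np.ndarray:
    """Turn a (resolution, resolution) array into (resolution, resolution, 1).

    Hypothetical helper; mirrors the reshape suggested above, with an
    explicit shape check so a 3-channel input fails with a clear message.
    """
    if arr.shape != (resolution, resolution):
        raise ValueError(
            f"expected shape {(resolution, resolution)}, got {arr.shape}"
        )
    # Append a channel axis of size 1.
    return arr.reshape((resolution, resolution, 1))


# Synthetic 64x64 greyscale image standing in for np.array(pil_image).
grey = np.zeros((64, 64), dtype=np.uint8)
single = to_single_channel(grey, resolution=64)
```

If the files on disk are stored as RGB, np.array(pil_image) will come back with 3 channels and the reshape will fail. Converting first with pil_image.convert("L") is one way around that, assuming a plain greyscale conversion is what you want.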
What happens if I have images with 4 or 5 channels?
It's similar. I have trained a 5-channel diffusion model. Don't load the data with Pillow; use NumPy arrays or any file format that allows more than 3 channels. Remember to change the number of input channels in the UNet as well.
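A minimal sketch of loading such data, assuming each sample is stored as a .npy file of shape (H, W, C) and that the model wants channels-first float32 input. The file layout and the `load_multichannel` helper are assumptions for illustration, not the setup described above.

```python
import os
import tempfile

import numpy as np


def load_multichannel(path: str, expected_channels: int) -> np.ndarray:
    """Load a (H, W, C) array from a .npy file and return it as (C, H, W) float32.

    Hypothetical helper; the (H, W, C) on-disk layout is an assumption.
    """
    arr = np.load(path)
    if arr.ndim != 3 or arr.shape[2] != expected_channels:
        raise ValueError(
            f"expected (H, W, {expected_channels}), got shape {arr.shape}"
        )
    # Move the channel axis to the front, as PyTorch-style models expect.
    return np.transpose(arr.astype(np.float32), (2, 0, 1))


# Write a synthetic 5-channel sample to a temporary file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.npy")
    np.save(path, np.zeros((8, 8, 5), dtype=np.uint8))
    chw = load_multichannel(path, expected_channels=5)
```

The channel count checked here has to match whatever input channel setting the UNet is configured with.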