stable-diffusion
img2img.py issue
With current pushed code I get the following error:
- This IS expected if you are initializing CLIPTextModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CLIPTextModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
loaded input image of size (720, 960) from /home/chris/Pictures/test/b.jpg
scripts/img2img.py:53: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
image = image.resize((w, h), resample=PIL.Image.LANCZOS)
Traceback (most recent call last):
  File "scripts/img2img.py", line 297, in <module>
    main()
  File "scripts/img2img.py", line 239, in main
    init_latent = model.get_first_stage_encoding(model.encode_first_stage(init_image))  # move to latent space
  File "/home/chris/anaconda3/envs/dogg/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/chris/stable2/dogg/stable-diffusion/ldm/models/diffusion/ddpm.py", line 863, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "/home/chris/stable2/dogg/stable-diffusion/ldm/models/autoencoder.py", line 325, in encode
    h = self.encoder(x)
  File "/home/chris/anaconda3/envs/dogg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/chris/stable2/dogg/stable-diffusion/ldm/modules/diffusionmodules/model.py", line 492, in forward
    hs = [self.conv_in(x)]
  File "/home/chris/anaconda3/envs/dogg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/chris/anaconda3/envs/dogg/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 447, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/chris/anaconda3/envs/dogg/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
Getting the same issue. I can get rid of the error by removing the references to model half, but that only lets me go as high as 1024x1024 in img2img mode. Attempting 2048x2048 results in out of memory, and the model-half issue prevents me from getting around that. Would love a solution if you get a chance.
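The RuntimeError above is the usual symptom of mixed precision: model.half() converts the weights to float16 while the encoded init image is still float32, and F.conv2d refuses the mismatch. A minimal sketch of the general workaround (not the repo's actual code): cast the input to whatever dtype the model's parameters use before the forward pass.

```python
import torch

def cast_to_model_dtype(x: torch.Tensor, model: torch.nn.Module) -> torch.Tensor:
    """Cast an input tensor to the dtype of the model's weights."""
    return x.to(next(model.parameters()).dtype)

# Reproduce the setup with a tiny conv layer whose weights are float16:
conv = torch.nn.Conv2d(3, 8, kernel_size=3).half()
x = torch.randn(1, 3, 64, 64)       # float32, like the loaded init image
# conv(x) would raise: "Input type ... and weight type ... should be the same"
x = cast_to_model_dtype(x, conv)    # now float16, matching the weights
```

The same idea in the script would be casting init_image (or init_latent) with .half() before it reaches the first-stage encoder.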
@baaleos How did you remove references to 'model half'?
I commented out lines 200 and 204 of img2img.py, but I think in the end I just kept them in; I'm not sure what resolved it. In the end I just have to make sure the input image is a standard size. I typically work in 512x512 by default, but I can also get it working at 768x768, as well as 512x1024, 768x1024, etc.
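Those "standard" sizes all share a property the model needs: each side is a multiple of the VAE's stride. The stock load_img in img2img.py rounds each side down before the resize shown in the warning above (a multiple of 32, if I'm reading that part of the script right). A small sketch of that rounding:

```python
def snap_to_multiple(w: int, h: int, base: int = 32):
    """Round each dimension down to the nearest multiple of `base`,
    as the script's load_img does before resizing for the VAE."""
    return w - w % base, h - h % base

# The 720x960 input from the log above would be resized to 704x960:
print(snap_to_multiple(720, 960))  # -> (704, 960)
```

Sizes that are already multiples of 32 (512, 768, 1024) pass through unchanged, which is why they "just work".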
FWIW, I got the mountain example to run by modifying this line: https://github.com/Doggettx/stable-diffusion/blob/ab0bff6bc08c0ac55e08c596c999e5e5e0a7c111/scripts/img2img.py#L54 to make it load the input image as float16 instead of float32.
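For reference, a hedged sketch of that kind of change: the linked line converts the PIL image to a normalized array before building the tensor, and switching that conversion to float16 keeps the image in the same precision as a half() model. The function below is an illustration of the idea, not the exact diff against the script.

```python
import numpy as np

def image_to_half_array(pixels: np.ndarray) -> np.ndarray:
    """Convert uint8 HWC pixels to a float16 NCHW array scaled to [-1, 1]
    (float16 here, where the stock script uses float32)."""
    img = pixels.astype(np.float16) / 255.0
    img = img[None].transpose(0, 3, 1, 2)  # HWC -> NCHW batch of one
    return 2.0 * img - 1.0
```

Wrapping the result in torch.from_numpy then yields a float16 tensor, so the later conv weights (also float16 after model.half()) match.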