txt2imghd
Suggestion: include memory optimizations based on code from another fork
https://github.com/basujindal/stable-diffusion has a version of SD with optimized memory usage. For some 8GB VRAM card owners (such as myself) this can mean the difference in being able to generate 512x512 images at all (this script, like the native SD code, crashes with an "out of memory" error). If it is possible to implement these or similar optimizations within this script, it would be highly appreciated.
Try it with half precision:
Add model.half() right after model = instantiate_from_config(config.model), and init_image = init_image.half() right after init_image = repeat(init_image, '1 ... -> b ...', b=batch_size).
Works on my 8GB GPU.
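To illustrate what those two one-line patches do, here is a minimal, self-contained sketch with a toy layer (not the actual txt2imghd code): .half() converts the weights, and then the input, to float16, halving their memory footprint.

```python
import torch
import torch.nn as nn

# Toy stand-in for the model: .half() converts parameters in place
# from float32 (4 bytes/element) to float16 (2 bytes/element).
model = nn.Conv2d(3, 8, kernel_size=3)
assert model.weight.dtype == torch.float32
model.half()  # analogous to calling model.half() after instantiate_from_config
assert model.weight.dtype == torch.float16
assert model.weight.element_size() == 2  # half the memory per weight

# The input tensor must be converted as well, analogous to
# init_image = init_image.half() in the script.
init_image = torch.randn(1, 3, 64, 64)
init_image = init_image.half()
assert init_image.dtype == torch.float16
```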
I can't get it to work for the chunk part in half mode @nickRJ. Solved: I had --strength set to 1.0 :^)
I'm rewriting my message since I get this error:
Traceback (most recent call last):
File ".\scripts\txt2imghd.py", line 551, in <module>
main()
File ".\scripts\txt2imghd.py", line 366, in main
text2img2(opt)
File ".\scripts\txt2imghd.py", line 490, in text2img2
init_latent = model.get_first_stage_encoding(model.encode_first_stage(init_image)) # move to latent space
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "d:\stable_diffusion\stable-diffusion-main\ldm\models\diffusion\ddpm.py", line 863, in encode_first_stage
return self.first_stage_model.encode(x)
File "d:\stable_diffusion\stable-diffusion-main\ldm\models\autoencoder.py", line 325, in encode
h = self.encoder(x)
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "d:\stable_diffusion\stable-diffusion-main\ldm\modules\diffusionmodules\model.py", line 439, in forward
hs = [self.conv_in(x)]
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
The command is: python .\scripts\txt2imghd.py --prompt "a photograph of an astronaut riding a horse" --strength=1.0 --ddim
What am I missing?
Got the same error; for me it was the indentation before init_image = init_image.half().
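The RuntimeError in the traceback above is exactly this dtype mismatch, and it can be reproduced in isolation with a toy layer (not the SD code): half-precision weights fed a float32 input.

```python
import torch
import torch.nn as nn

# Reproduce the dtype-mismatch error from the traceback: the layer's
# weights are float16, but the input was left in float32.
conv = nn.Conv2d(3, 8, kernel_size=3).half()
x = torch.randn(1, 3, 16, 16)  # still float32

try:
    conv(x)
    raised = False
except RuntimeError as e:
    raised = True
    print(e)  # input type and weight type should be the same

# The fix discussed above: convert the input to match the weights
# (and make sure the .half() line is indented so it actually executes
# on the code path being run).
x = x.half()
assert raised
assert x.dtype == conv.weight.dtype == torch.float16
```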
The aforementioned fork allows generating 512x512 images on 4GB VRAM cards, which should be the baseline to compare against, imo.
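I haven't read that fork's code closely, but as I understand it the core VRAM-saving idea is to keep the model split into stages and hold only the active stage on the GPU at any moment. A toy sketch of that pattern, with made-up stage modules (not the fork's actual code):

```python
import torch
import torch.nn as nn

# Hypothetical on-demand placement: keep every stage on the CPU and
# move each one to the GPU only for its own forward pass, so peak
# VRAM usage is one stage's weights plus activations, not the whole model.
device = "cuda" if torch.cuda.is_available() else "cpu"

stages = nn.ModuleList([nn.Linear(64, 64) for _ in range(4)])

def run_stages(x):
    x = x.to(device)
    for stage in stages:
        stage.to(device)   # load this stage's weights into VRAM
        x = stage(x)
        stage.to("cpu")    # evict the weights before the next stage
    return x.cpu()

out = run_stages(torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 64])
```

The trade-off is extra host-to-device transfer time per step, which is why forks using this approach are slower per image.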