txt2imghd
Suggestion: include memory optimizations based on code from another fork
https://github.com/basujindal/stable-diffusion has a version of SD with optimized memory usage. For some 8GB VRAM card owners (such as myself) this can mean the difference in being able to generate 512x512 images at all (this script, like the native SD code, crashes with an "out of memory" error). If it is possible to implement these or similar optimizations within this script, it would be highly appreciated.
Try it with half precision:
Add model.half() right after model = instantiate_from_config(config.model), and init_image = init_image.half() right after init_image = repeat(init_image, '1 ... -> b ...', b=batch_size).
Works on my 8GB GPU.
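To illustrate what those two one-line patches do, here is a minimal, self-contained sketch with a toy layer (not the actual txt2imghd code): .half() converts the weights, and then the input, to float16, halving their memory footprint.

```python
import torch
import torch.nn as nn

# Toy stand-in for the model: .half() converts parameters in place
# from float32 (4 bytes/element) to float16 (2 bytes/element).
model = nn.Conv2d(3, 8, kernel_size=3)
assert model.weight.dtype == torch.float32
model.half()  # analogous to calling model.half() after instantiate_from_config
assert model.weight.dtype == torch.float16
assert model.weight.element_size() == 2  # half the memory per weight

# The input tensor must be converted as well, analogous to
# init_image = init_image.half() in the script.
init_image = torch.randn(1, 3, 64, 64)
init_image = init_image.half()
assert init_image.dtype == torch.float16
```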
I can't get it to work for the chunk part in half mode @nickRJ. Solved: I had --strength set to 1.0 :^)
I'm rewriting my message since I get this error:
Traceback (most recent call last):
File ".\scripts\txt2imghd.py", line 551, in <module>
main()
File ".\scripts\txt2imghd.py", line 366, in main
text2img2(opt)
File ".\scripts\txt2imghd.py", line 490, in text2img2
init_latent = model.get_first_stage_encoding(model.encode_first_stage(init_image)) # move to latent space
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "d:\stable_diffusion\stable-diffusion-main\ldm\models\diffusion\ddpm.py", line 863, in encode_first_stage
return self.first_stage_model.encode(x)
File "d:\stable_diffusion\stable-diffusion-main\ldm\models\autoencoder.py", line 325, in encode
h = self.encoder(x)
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "d:\stable_diffusion\stable-diffusion-main\ldm\modules\diffusionmodules\model.py", line 439, in forward
hs = [self.conv_in(x)]
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
The command is: python .\scripts\txt2imghd.py --prompt "a photograph of an astronaut riding a horse" --strength=1.0 --ddim
What am I missing?
Got the same error; for me it was the indentation before init_image = init_image.half().
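The RuntimeError in the traceback above is exactly this dtype mismatch, and it can be reproduced in isolation with a toy layer (not the SD code): half-precision weights fed a float32 input.

```python
import torch
import torch.nn as nn

# Reproduce the dtype-mismatch error from the traceback: the layer's
# weights are float16, but the input was left in float32.
conv = nn.Conv2d(3, 8, kernel_size=3).half()
x = torch.randn(1, 3, 16, 16)  # still float32

try:
    conv(x)
    raised = False
except RuntimeError as e:
    raised = True
    print(e)  # input type and weight type should be the same

# The fix discussed above: convert the input to match the weights
# (and make sure the .half() line is indented so it actually executes
# on the code path being run).
x = x.half()
assert raised
assert x.dtype == conv.weight.dtype == torch.float16
```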
The aforementioned fork allows generating 512x512 images on 4GB VRAM cards, which should be the baseline to compare against, imo.
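I haven't read that fork's code closely, but as I understand it the core VRAM-saving idea is to keep the model split into stages and hold only the active stage on the GPU at any moment. A toy sketch of that pattern, with made-up stage modules (not the fork's actual code):

```python
import torch
import torch.nn as nn

# Hypothetical on-demand placement: keep every stage on the CPU and
# move each one to the GPU only for its own forward pass, so peak
# VRAM usage is one stage's weights plus activations, not the whole model.
device = "cuda" if torch.cuda.is_available() else "cpu"

stages = nn.ModuleList([nn.Linear(64, 64) for _ in range(4)])

def run_stages(x):
    x = x.to(device)
    for stage in stages:
        stage.to(device)   # load this stage's weights into VRAM
        x = stage(x)
        stage.to("cpu")    # evict the weights before the next stage
    return x.cpu()

out = run_stages(torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 64])
```

The trade-off is extra host-to-device transfer time per step, which is why forks using this approach are slower per image.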