stable-diffusion-webui [Bug]: Interrogate CLIP broken

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits

What happened?

I use Interrogate CLIP in img2img to generate image descriptions on an RTX 3070 Laptop GPU with 8 GB of video memory. After running the "depthmap2mask" extension by @Extraltodeus, interrogation ends in a "CUDA out of memory" error. I confirmed that it works perfectly before the extension is run. Afterwards CLIP seems to work only halfway: it still produces some tags, but the artist is missing and an error appears in its place.

Steps to reproduce the problem

Run webui-user.bat
Open the web UI
Drag an image into img2img
Run the extension script "depthmap2mask" until it finishes
Press the Interrogate CLIP button
It outputs a partial description (without the artist)
CUDA runs out of memory

What should have happened?

It should output a complete description, including the artist. The "depthmap2mask" extension by @Extraltodeus should also release its VRAM once it has finished.
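For illustration, releasing VRAM after a model has finished usually follows the pattern sketched below. This is a generic PyTorch idiom, not the extension's actual code: `release_vram` is a hypothetical helper name, and how depthmap2mask holds its model is not shown anywhere in this report.

```python
import gc


def release_vram(model):
    """Move a finished model off the GPU and hand cached memory back to the driver.

    Hypothetical helper for illustration; `model` is anything with a
    torch-style `.to(device)` method.
    """
    # Weights stop occupying VRAM once the module lives on the CPU.
    model.to("cpu")
    # Drop unreachable Python objects that may still hold GPU tensors.
    gc.collect()
    try:
        import torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        # Return cached, unused blocks from PyTorch's allocator to the driver.
        torch.cuda.empty_cache()
        return True
    return False
```

Note that `torch.cuda.empty_cache()` only frees blocks that no live tensor uses, so any remaining reference to the model or its outputs on the GPU would still keep that memory allocated.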

Commit where the problem happens

ce049c471b4a1d22f5a8fe8f527788edcf934eda

What platforms do you use to access UI ?

Windows

What browsers do you use to access the UI ?

Microsoft Edge

Command Line Arguments

load checkpoint from https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth
Error interrogating
Traceback (most recent call last):
  File "D:\AI6\stable-diffusion-webui\modules\interrogate.py", line 157, in interrogate
    artist = self.rank(image_features, ["by " + artist.name for artist in shared.artist_db.artists])[0]
  File "D:\AI6\stable-diffusion-webui\modules\interrogate.py", line 109, in rank
    text_features = self.clip_model.encode_text(text_tokens).type(self.dtype)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 348, in encode_text
    x = self.transformer(x)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 203, in forward
    return self.resblocks(x)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
    input = module(input)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 191, in forward
    x = x + self.mlp(self.ln_2(x))
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
    input = module(input)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\AI6\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 168, in forward
    return x * torch.sigmoid(1.702 * x)
RuntimeError: CUDA out of memory. Tried to allocate 678.00 MiB (GPU 0; 8.00 GiB total capacity; 5.85 GiB already allocated; 0 bytes free; 6.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
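The figures in the message above can be checked with a little arithmetic. The sketch below parses the message and compares the request against the memory PyTorch has reserved but not allocated; `parse_oom` is a hypothetical helper written for this message format only.

```python
import re

# Conversion factors to MiB for the units that appear in the message.
UNIT_TO_MIB = {"bytes": 1 / (1024 * 1024), "KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
SIZE = r"([\d.]+) (bytes|KiB|MiB|GiB)"


def parse_oom(message):
    """Extract the memory figures (in MiB) from a PyTorch CUDA OOM message."""
    def grab(pattern):
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"pattern not found: {pattern}")
        value, unit = match.groups()
        return float(value) * UNIT_TO_MIB[unit]

    return {
        "requested": grab(r"Tried to allocate " + SIZE),
        "total": grab(SIZE + r" total capacity"),
        "allocated": grab(SIZE + r" already allocated"),
        "free": grab(SIZE + r" free"),
        "reserved": grab(SIZE + r" reserved"),
    }


MESSAGE = (
    "CUDA out of memory. Tried to allocate 678.00 MiB (GPU 0; 8.00 GiB total "
    "capacity; 5.85 GiB already allocated; 0 bytes free; 6.22 GiB reserved in "
    "total by PyTorch)"
)

figures = parse_oom(MESSAGE)
# Memory PyTorch holds in its cache but that no tensor currently uses.
cached_slack = figures["reserved"] - figures["allocated"]
print(f"requested {figures['requested']:.2f} MiB, cached slack {cached_slack:.2f} MiB")
```

The cached slack comes to roughly 379 MiB, which is less than the 678 MiB request, and the device reports 0 bytes free, so the allocation cannot be served. Because reserved memory is only slightly larger than allocated memory here, the `max_split_size_mb` hint in the message probably would not help much; the memory appears to be genuinely in use.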

Additional information, context and logs

Here are the screenshots:
before using "depthmap2mask": [screenshot: 网页捕获_4-12-2022_121257_127 0 0 1]
after using "depthmap2mask": [screenshot: 网页捕获_4-12-2022_12849_127 0 0 1]

Dec 04 '22 04:12 LieDeath

Just wanted to bump this. I get a different error, which I think is not related to memory.

Error interrogating
Traceback (most recent call last):
  File "/content/gdrive/.shortcut-targets-by-id/1in9dwXQDzAxLTNEepIKC2q3BCFIrQWEu/sd/stable-diffusion-webui/modules/interrogate.py", line 148, in interrogate
    caption = self.generate_caption(pil_image)
  File "/content/gdrive/.shortcut-targets-by-id/1in9dwXQDzAxLTNEepIKC2q3BCFIrQWEu/sd/stable-diffusion-webui/modules/interrogate.py", line 133, in generate_caption
    caption = self.blip_model.generate(gpu_image, sample=False, num_beams=shared.opts.interrogate_clip_num_beams, min_length=shared.opts.interrogate_clip_min_length, max_length=shared.opts.interrogate_clip_max_length)
  File "/content/gdrive/MyDrive/sd/stablediffusion/src/blip/models/blip.py", line 156, in generate
    outputs = self.text_decoder.generate(input_ids=input_ids,
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/generation_utils.py", line 1268, in generate
    self._validate_model_kwargs(model_kwargs.copy())
  File "/usr/local/lib/python3.8/dist-packages/transformers/generation_utils.py", line 964, in _validate_model_kwargs
    raise ValueError(
ValueError: The following `model_kwargs` are not used by the model: ['encoder_hidden_states', 'encoder_attention_mask'] (note: typos in the generate arguments will also show up in this list)

Traceback (most recent call last):
  File "/content/gdrive/.shortcut-targets-by-id/1in9dwXQDzAxLTNEepIKC2q3BCFIrQWEu/sd/stable-diffusion-webui/modules/interrogate.py", line 148, in interrogate
    caption = self.generate_caption(pil_image)
  File "/content/gdrive/.shortcut-targets-by-id/1in9dwXQDzAxLTNEepIKC2q3BCFIrQWEu/sd/stable-diffusion-webui/modules/interrogate.py", line 133, in generate_caption
    caption = self.blip_model.generate(gpu_image, sample=False, num_beams=shared.opts.interrogate_clip_num_beams, min_length=shared.opts.interrogate_clip_min_length, max_length=shared.opts.interrogate_clip_max_length)
  File "/content/gdrive/MyDrive/sd/stablediffusion/src/blip/models/blip.py", line 156, in generate
    outputs = self.text_decoder.generate(input_ids=input_ids,
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/generation_utils.py", line 1268, in generate
    self._validate_model_kwargs(model_kwargs.copy())
  File "/usr/local/lib/python3.8/dist-packages/transformers/generation_utils.py", line 964, in _validate_model_kwargs
    raise ValueError(
ValueError: The following `model_kwargs` are not used by the model: ['encoder_hidden_states', 'encoder_attention_mask'] (note: typos in the generate arguments will also show up in this list)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/gradio/routes.py", line 284, in run_predict
    output = await app.blocks.process_api(
  File "/usr/local/lib/python3.8/dist-packages/gradio/blocks.py", line 982, in process_api
    result = await self.call_function(fn_index, inputs, iterator)
  File "/usr/local/lib/python3.8/dist-packages/gradio/blocks.py", line 824, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.8/dist-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.8/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.8/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/content/gdrive/.shortcut-targets-by-id/1in9dwXQDzAxLTNEepIKC2q3BCFIrQWEu/sd/stable-diffusion-webui/modules/ui.py", line 269, in interrogate
    prompt = shared.interrogator.interrogate(image)
  File "/content/gdrive/.shortcut-targets-by-id/1in9dwXQDzAxLTNEepIKC2q3BCFIrQWEu/sd/stable-diffusion-webui/modules/interrogate.py", line 177, in interrogate
    res += "<error>"
TypeError: unsupported operand type(s) for +=: 'NoneType' and 'str'
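The final TypeError looks like a secondary bug in the error handler: `res` is apparently still `None` when `"<error>"` is appended, so the handler itself fails and hides the original ValueError from the UI. A minimal sketch of a safer pattern follows, assuming the result starts out as `None` in the real code; the function and parameter names are hypothetical and this is not the actual `interrogate.py`.

```python
import traceback


def interrogate(generate_caption, rank_artist, image):
    """Build a description step by step, appending "<error>" if any step fails."""
    # Start from an empty string (not None) so the except branch can append safely.
    res = ""
    try:
        res = generate_caption(image)
        res += ", " + rank_artist(image)
    except Exception:
        # Log the real failure, then mark the partial result.
        traceback.print_exc()
        res += "<error>"
    return res
```

With this shape, a failure in the caption step returns just `<error>`, while a failure in a later step (such as the out-of-memory error during artist ranking reported at the top of this issue) returns the partial description followed by `<error>`.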

Dec 10 '22 17:12 createperhaps

It seems you are running the webui on Google Colab, which provides a 16GB Tesla T4 compute card and a much older Python environment. Could that cause the difference? I am not sure.
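One way to compare the two environments is to print the Python version and the versions of the packages that appear in the tracebacks. The sketch below uses only the standard library; `environment_report` is a hypothetical helper, and the default package list is an assumption based on the paths in the logs above.

```python
import sys
from importlib import metadata


def environment_report(packages=("torch", "transformers", "gradio")):
    """Return the Python version and the installed version of each package."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version}")
```

Running this in both the local venv and the Colab notebook would show whether the `transformers` versions differ, which is one plausible source of the `model_kwargs` ValueError.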

Dec 11 '22 05:12 LieDeath

Yes, I'm using TheLastBen's implementation. I didn't consider that it could be the Colab. I see.

Dec 11 '22 12:12 createperhaps

😂 I am not sure what you mean. Perhaps I misunderstood your running environment.

Dec 11 '22 14:12 LieDeath
