
Bug: Describe feature can terminate image generation of other sessions

KKuehlem opened this issue 1 year ago · 3 comments

Read Troubleshoot

[x] I admit that I have read the Troubleshoot before making this issue.

Describe the problem Sometimes using the "Describe this image into Prompt" feature will terminate the image generation of other sessions. I can reproduce this bug using two sessions, A and B:

  1. Session A starts image generation
  2. Session B starts image generation and then uses the "Describe this image into Prompt" feature (all while A is still generating)

This results in session B getting the prompt and then starting to generate, while A's generation is terminated with an error. As expected, B generates from its original prompt and not the one described from the image. Although these steps to reproduce are not what a typical user would do, I have also encountered the error in more realistic situations. And while this bug can happen when a single user uses multiple sessions, the steps above are also a reliable way to "steal" the first queue position from other users.

Full Console Log log.txt

KKuehlem avatar Dec 26 '23 18:12 KKuehlem

@Minekonst thank you for reporting, can confirm the issue.

It works fine when using --always-gpu.

After some debugging: it doesn't help to set the offload device (or load_device) directly in https://github.com/lllyasviel/Fooocus/blob/main/extras/interrogate.py#L38-L39 to get_torch_device(), so I assume that when BLIP is done and offloaded, the image model is also moved to the offload device while it is still needed in VRAM.
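To illustrate the suspected mechanism, here is a minimal, hypothetical sketch (not Fooocus code): a single-slot "VRAM" where loading a new model implicitly offloads the resident one, the way load_model_gpu behaves. The class and function names are invented for this example.

```python
class GpuSlot:
    """Hypothetical single-slot 'VRAM': loading a new model
    implicitly offloads whatever was resident before."""
    def __init__(self):
        self.resident = None

    def load(self, name):
        self.resident = name  # previous model is silently evicted


def sample_step(slot, model):
    """One diffusion step; fails if the model was offloaded."""
    if slot.resident != model:
        raise RuntimeError(model + " was offloaded mid-generation")


slot = GpuSlot()
slot.load("sdxl")          # session A's image model
sample_step(slot, "sdxl")  # sampling runs fine
slot.load("blip")          # session B's Describe click loads BLIP, evicting sdxl
try:
    sample_step(slot, "sdxl")
except RuntimeError as e:
    print(e)  # session A's next step fails, matching the traceback below
```

This reproduces the observed pattern: A's sampling works until B's BLIP load, then A's very next step errors out.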

The debug logs in https://github.com/lllyasviel/Fooocus/blob/main/extras/interrogate.py#L50 and below also confirm this behavior:

        print('loading BLIP')
        model_management.load_model_gpu(self.blip_model)
        print('loaded BLIP')

        gpu_image = transforms.Compose([
            transforms.ToTensor(),
            transforms.Resize((blip_image_eval_size, blip_image_eval_size), interpolation=InterpolationMode.BICUBIC),
            transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711))
        ])(img_rgb).unsqueeze(0).to(device=self.load_device, dtype=self.dtype)

        print('captioning with BLIP')
        caption = self.blip_model.model.generate(gpu_image, sample=True, num_beams=1, max_length=75)[0]
        print('captioned with BLIP')
 20%|█████▏                    | 12/60 [00:05<00:19,  2.46it/s]load checkpoint from C:\Fooocus_win64_2-1-60\Fooocus\models\clip_vision\model_base_caption_capfilt_large.pth
 23%|██████                    | 14/60 [00:05<00:19,  2.37it/s]loading BLIP
Requested to load BLIP_Decoder
Loading 1 new model
loaded BLIP
captioning with BLIP
captioned with BLIP
 23%|██████                    | 14/60 [00:06<00:20,  2.22it/s]
Traceback (most recent call last):
  File "C:\Fooocus_win64_2-1-60\Fooocus\modules\async_worker.py", line 824, in worker
    handler(task)
...

A potential (temporary or permanent) solution would be to route the describe button's click action through the queue, so the models don't collide. Still, it would be better to process both in parallel (image generation and describe, if resources permit), so users don't have to wait until an already running task is finished.
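The serialization idea can be sketched with a plain lock guarding GPU model use; this is a hypothetical illustration of the approach, not the actual fix (which instead routes the action through Fooocus's existing task queue). The function names and placeholder bodies are invented for this example.

```python
import threading

# Hypothetical global lock: whoever holds it owns the GPU models.
gpu_lock = threading.Lock()

def generate_images(steps):
    """Hold the GPU for the whole sampling run so nothing evicts the model."""
    done = 0
    with gpu_lock:
        for _ in range(steps):
            done += 1  # placeholder for one diffusion step
    return done

def describe_image():
    """BLIP load + captioning waits until any running generation finishes."""
    with gpu_lock:
        return "placeholder caption"  # placeholder for BLIP inference
```

With this, a "Describe" click issued mid-generation blocks instead of evicting the sampler's model, at the cost of the sequential waiting the comment above mentions.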

mashb1t avatar Dec 26 '23 20:12 mashb1t

@Minekonst fix in https://github.com/lllyasviel/Fooocus/pull/1608

mashb1t avatar Dec 26 '23 21:12 mashb1t

@mashb1t Wow, thanks for the very fast fix

KKuehlem avatar Dec 27 '23 08:12 KKuehlem

@Minekonst fix has just been implemented in https://github.com/lllyasviel/Fooocus/commit/aae5bba48830606f33a7013bc19fb3c8784c1dbc. Please close this issue.

mashb1t avatar Dec 28 '23 16:12 mashb1t