Álvaro Somoza

Results 91 comments of Álvaro Somoza

I'm curious why you want to do this instead of using existing solutions like `SimpleTuner` or `sd-scripts`; if you want a GUI, you can use `OneTrainer`....

Thanks for your explanation, I appreciate it and it helps me understand and take note for the future. As a side note, so that you know, this is your specific...

I think you're expecting too much of this. Most of the examples I see aren't that great, and yours looks like most of them; you probably need to keep...

It is called ControlNet Tile, but it doesn't actually do anything related to tiles; it just adds details, changes them, or fixes blurry images, as you can see in my...

Since the original user didn't post any more questions, we can assume yes.

I'm testing it with ControlNet, and if I enable `guess_mode` I get this error: ```python down_block_res_sample = down_block_res_sample + down_block_additional_residual ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ RuntimeError: The size of tensor a (3) must match...

I changed the `output_type` to "np" instead of "latents" but this means I have to validate against a numpy array that has been decoded by the vae. This value changes...

PAG keeps surprising me. I tested it with ControlNet and even without `guess_mode`, the difference is really impressive.

|preprocessed|without pag|with pag|
|---|---|---|
|![20240613161704](https://github.com/huggingface/diffusers/assets/5442875/a44b4ce1-98d3-4443-a6c4-27222e1e1aa8)|![20240613162602_1471724984](https://github.com/huggingface/diffusers/assets/5442875/1727ccd7-2945-44aa-bc74-54e90325a1cb)|![20240613163558_1471724984](https://github.com/huggingface/diffusers/assets/5442875/865b1a48-31a2-4f37-a937-187b3aa54eb8)|

@yiyixuxu can you add also the img2img...

When I tried to use PAG with guess mode, I got this error: ``` models/controlnet.py", line 791, in forward add_embeds = torch.concat([text_embeds, time_embeds], dim=-1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: Expected all tensors...

> should we remove the guess_mode

IMO yes, if not, it could give the users the wrong impression that it works with PAG.