a workflow for using this lora in comfyui

Open judian17 opened this issue 7 months ago • 44 comments

Thank you for this amazing project! Looking forward to the powerful model! I have set up a workflow in ComfyUI to use this LoRA, making it convenient for others to test it in ComfyUI.

icedit.json

judian17 avatar Apr 30 '25 18:04 judian17

Thank you so much for your kind words and the effort you put into setting up the workflow in ComfyUI! This is incredibly helpful for the community. I really appreciate your contribution. I'll definitely reference the icedit.json in the README to make it easier for others to utilize and test this LoRA.

River-Zhang avatar May 01 '25 04:05 River-Zhang

@River-Zhang I'm glad that my workflow could be helpful. However, today I discovered some potential issues. When loading this LoRA with ComfyUI, there are a lot of warnings saying "lora key not loaded." Although editing is still possible, the quality doesn't meet the level shown in the project demonstration. I think official support for ComfyUI is still necessary. Below are the prompts displayed when loading the LoRA and the resulting image output.

log.txt

Image

judian17 avatar May 01 '25 10:05 judian17

Thank you so much for bringing this to our attention and for your valuable feedback! We've actually already developed a ComfyUI integration internally. Right now, we're in the final stages of writing the documentation and doing some last-minute debugging. We're aiming to release it very soon, so stay tuned! We appreciate your patience and look forward to resolving these issues with the upcoming release.

River-Zhang avatar May 01 '25 10:05 River-Zhang

I used AI to analyze the project's inference code, and it seems that the "lora key not loaded" issue comes from the four additional expert layers introduced by the MoE structure, right? I applied a simple weighted average to these four layers using code provided by AI, which resulted in a model that appears to load normally. However, this model couldn't be used with the "nunchaku" project: https://github.com/mit-han-lab/ComfyUI-nunchaku

So, I first merged the LoRA with the flux fill model and then re-extracted it (this is the only simple way I know how), resulting in a LoRA that's compatible with nunchaku. From my testing, this approach performs slightly better than directly loading the original LoRA with the UNet loader — although it loses dynamic weighting, at least it works.

I've built a more complex workflow that includes adding high-definition refinement, and the current results feel quite good. Additionally, I found that combining this LoRA with Redux can achieve outfit changes to some extent.

Test images, the new workflow, and the weight fusion code are as follows. I hope they can help this project and others! (2025.05.07: Updated the workflow. Please note that it currently loads the fill model directly using UNetLoader, and you can use it by simply chaining the normal LoRA. If you'd like to try Nunchaku, you'll need to use the processed icedit-normal_extracted_lora instead, but this comes with a noticeable drop in image quality.)

icedit-3.json

Image

average_moe_lora.zip

I like this project, and I'm waiting for the more powerful model.

judian17 avatar May 02 '25 07:05 judian17

Thank you so much for your incredible efforts! Your work has significantly enhanced the integration of our model with the ComfyUI ecosystem, uncovering a wealth of new features and possibilities. Your innovative approach not only bridges the gap between our model and ComfyUI but also expands its functionality, making it more accessible and versatile for users.

We will definitely reference your work in our repository to highlight your achievements and make it easier for others to benefit from your insights. Once again, thank you for your dedication and for helping to push the boundaries of what our model can do within the ComfyUI environment!

River-Zhang avatar May 02 '25 09:05 River-Zhang

@judian17 Can you share the AI-generated code that you used to merge that LoRA into Flux Fill? Also, what did you use to extract the LoRA (the ComfyUI LoRA extract node)?

I'm very interested in this.

YarvixPA avatar May 02 '25 16:05 YarvixPA

@YarvixPA This section of code was not generated by AI. I used code from ComfyUI-FluxTrainer (https://github.com/kijai/ComfyUI-FluxTrainer). However, since there were some bugs in that project's code, I modified it by referring to comments in its issue discussions. Additionally, the original project was designed for Flux training, with LoRA extraction being just one of its features. To avoid installing unnecessary dependencies, I extracted only the relevant code for LoRA extraction and integrated it into a ComfyUI node. It is here. Regarding the "comfyui lora extract node" you mentioned, that node can indeed be used as well, and the processing time of both methods is almost identical. However, for some unknown reason, the LoRA extracted using that node is not compatible with Nunchaku, so I still prefer to use my own method.

judian17 avatar May 02 '25 16:05 judian17

Image

Well, I said that because HuggingFace mentioned it. It's not bad, but I wanted to see a bit of the code because I also got the error when applying a LoRA to Flux Fill in ComfyUI. How exactly was the LoRA merged with the Flux Fill model?

YarvixPA avatar May 02 '25 16:05 YarvixPA

I apologize for my vague explanation earlier. To be precise, I used AI-generated code to perform a weighted average on the expert layers of the ICEdit Lora. The subsequent model merging and Lora extraction were done using the ComfyUI node from the aforementioned project. Of course, if you need the code for the weighted averaging, it is here:

average_moe_lora.zip

judian17 avatar May 02 '25 16:05 judian17

@judian17 Do you have a Discord where I can ask you questions? I'm trying to understand, but I have questions

YarvixPA avatar May 02 '25 17:05 YarvixPA

I noticed that the workflow posted above doesn't add the prefix to the instructions, like the original code does: https://github.com/River-Zhang/ICEdit/blob/a6355873ce8c3d47c0711172851abefa7801bd5e/scripts/inference.py#L52 For some editing tasks this doesn't seem to make much of a difference, but at least for some of my text editing tests it worked much better if I included the prefix.
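A minimal sketch of adding that prefix before the text reaches the text encoder in a workflow. The `DIPTYCH_PREFIX` wording below is an assumption written from memory of the linked script, and `build_prompt` is a hypothetical helper name; copy the exact string from the linked `inference.py` line before relying on it.

```python
# Assumed prefix wording: verify against the linked inference.py line.
DIPTYCH_PREFIX = (
    "A diptych with two side-by-side images of the same scene. "
    "On the right, the scene is exactly the same as the left but "
)


def build_prompt(instruction: str, prefix: str = DIPTYCH_PREFIX) -> str:
    """Prepend the prefix to an edit instruction (hypothetical helper).

    If the instruction already starts with the prefix, it is returned
    unchanged so the prefix is never applied twice.
    """
    instruction = instruction.strip()
    if instruction.startswith(prefix):
        return instruction
    return prefix + instruction


if __name__ == "__main__":
    print(build_prompt("change the text on the sign to OPEN"))
```

In ComfyUI the resulting string would simply be pasted into the positive prompt node; nothing else in the workflow needs to change for this.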

IcyIntuition avatar May 02 '25 19:05 IcyIntuition

@YarvixPA Sorry, I'm just an amateur ComfyUI user who uses AI to analyze code, and I haven't even written any code myself. I'm afraid I can't provide valuable answers 🥲

judian17 avatar May 03 '25 01:05 judian17

@IcyIntuition Yes, this is very helpful for improving quality! Thank you for pointing that out.

judian17 avatar May 03 '25 01:05 judian17

@YarvixPA Sorry, I'm just an amateur ComfyUI user who uses AI to analyze code, and I haven't even written any code myself. I'm afraid I can't provide valuable answers 🥲

It's fine, I don't write code for myself either. I leave that job to the AI but I investigate what I do.

I just want to know the step-by-step process. From what I understand, you:

  1. Average the MOE lora
  2. Merge into flux fill
  3. Extract the lora

Am I correct?

YarvixPA avatar May 03 '25 12:05 YarvixPA

Thank you @judian17

I ran the Python script and fixed the LoRA file (https://huggingface.co/alexgenovese/loras/blob/main/FLUX/ICEdit-MoE-LoRA-Fixed.safetensors). Now, do I have to merge it into Flux and extract the LoRA?

alexgenovese avatar May 03 '25 13:05 alexgenovese

@YarvixPA Yes, that's what I did.

judian17 avatar May 03 '25 13:05 judian17

@alexgenovese If you don't plan to use it together with nunchaku, then there is no need for fusion and re-extraction. I do this solely for compatibility with nunchaku, as it can provide nearly a 3x speed improvement.

judian17 avatar May 03 '25 13:05 judian17

@YarvixPA @alexgenovese In addition, this is my modified version of the code, which allows for more flexible weight adjustments. For example, running python weighted_average_moe_lora.py 5 15 30 50 means fusing experts 0 to 3 with weights of 5%, 15%, 30%, and 50% respectively. Maybe it is helpful.
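As a rough illustration of what percentage weights like `5 15 30 50` mean, here is a minimal sketch. It is not the contents of the attached script: the function names are hypothetical, the experts are assumed to be same-shaped arrays, and NumPy stands in for whatever tensor library the real script uses.

```python
import numpy as np


def parse_percentages(args):
    """Turn CLI-style percentages such as ["5", "15", "30", "50"]
    into weights that sum to 1 (hypothetical helper)."""
    values = [float(a) for a in args]
    total = sum(values)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [v / total for v in values]


def weighted_average(expert_tensors, weights):
    """Weighted sum of same-shaped expert tensors, one weight per expert."""
    if len(expert_tensors) != len(weights):
        raise ValueError("need exactly one weight per expert")
    return sum(w * t for w, t in zip(weights, expert_tensors))


if __name__ == "__main__":
    weights = parse_percentages(["5", "15", "30", "50"])
    # Four dummy experts filled with 0.0, 1.0, 2.0 and 3.0.
    experts = [np.full((2, 3), float(i)) for i in range(4)]
    print(weighted_average(experts, weights))
```

One caveat: averaging the A and B factors separately is not the same as averaging the products B_i @ A_i, so the result depends on which of the two the script actually does.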

weighted_average_moe_lora.zip

judian17 avatar May 03 '25 14:05 judian17

@alexgenovese I did a quick test on your LoRA, and the generated results appear slightly clearer than mine. Did you apply any additional fixes or techniques? If so, would you mind sharing them?

judian17 avatar May 03 '25 16:05 judian17

@judian17 I ran the LoRA model without nunchaku (icedit.json) and I got that issue...

About the clearer model: I only ran your code (average_moe_lora.zip)

alexgenovese avatar May 03 '25 19:05 alexgenovese

@alexgenovese Well, then it might just be a coincidence, or perhaps there's a slight difference in the way we're using it.

judian17 avatar May 04 '25 04:05 judian17

Have you tried "concatenating" instead of "averaging" the experts?

YarvixPA avatar May 04 '25 08:05 YarvixPA

@judian17 Would you please send me an email? My email is [email protected]. I want to add you to our contributor list and discuss things together!

HorizonWind2004 avatar May 04 '25 11:05 HorizonWind2004

Have you tried "concatenating" instead of "averaging" the experts?

It seems that when using this LoRA, it dynamically selects some of the four experts and assigns certain weights. Therefore, I think "concatenating" won't be effective here.

Additionally, I came up with a new method for averaging today. The code is attached at the end of this comment.

The steps are:

  • Load the MoE-LoRA weights.
  • Calculate the average weight increment (mean(B_i @ A_i)) across all experts in each MoE layer.
  • Perform SVD decomposition on the averaged increments and take a low-rank approximation at the specified rank.
  • Reconstruct the new standard lora_A and lora_B matrices from the SVD results.
  • Save a new file containing these new standard LoRA weights along with other non-MoE weights.
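The steps above can be sketched for a single layer as follows. This is an illustration under stated assumptions, not the attached script: the function name and tensor shapes are hypothetical, every expert gets equal weight (the router is ignored), any alpha/rank scaling is left out, and NumPy replaces loading and saving the safetensors file.

```python
import numpy as np


def moe_to_standard_lora(expert_A, expert_B, rank):
    """Collapse the experts of one layer into a single LoRA pair.

    Assumed shapes: expert_A[i] is (r, in_features) and
    expert_B[i] is (out_features, r).
    """
    # Average the full weight increments, not the factors themselves.
    delta = np.mean([B @ A for A, B in zip(expert_A, expert_B)], axis=0)

    # Truncated SVD gives the best rank-k approximation of delta
    # in the Frobenius norm.
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    k = min(rank, S.shape[0])

    # Split the singular values evenly between the two factors.
    sqrt_s = np.sqrt(S[:k])
    lora_B = U[:, :k] * sqrt_s            # (out_features, k)
    lora_A = sqrt_s[:, None] * Vt[:k]     # (k, in_features)
    return lora_A, lora_B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = [rng.standard_normal((4, 16)) for _ in range(4)]
    B = [rng.standard_normal((8, 4)) for _ in range(4)]
    lora_A, lora_B = moe_to_standard_lora(A, B, rank=4)
    print(lora_A.shape, lora_B.shape)
```

The mean of four rank-r increments can have rank up to 4r, so keeping the original rank discards some information; choosing a larger target rank trades file size for fidelity.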

This approach seems more aligned with the project's code logic. However, after a careful comparison, I noticed there’s not much difference between this method and directly loading the original LoRA. If you're only using a flux model with fp16 or fp8 precision, such averaging might not even be necessary.

Moreover, fusing and then extracting the LoRA clearly reduces quality. Unless you intend to use it together with "Nunchaku," this step may not be worth doing.

convert_moe_to_svd_lora.zip

judian17 avatar May 04 '25 12:05 judian17

Hi, thanks for your great contributions! We have released a normal LoRA that is compatible with ComfyUI workflows. You can try this checkpoint.

River-Zhang avatar May 04 '25 13:05 River-Zhang

@River-Zhang Thank you for your contribution! I'll try it out. By the way, this is my GitHub repository, where I've also shared a few versions that I've modified. Feel free to share them with others for testing if needed! https://github.com/judian17/ICEdit/releases/tag/1.0

judian17 avatar May 04 '25 13:05 judian17

So is there a properly working ComfyUI version of this yet, or is it still just in testing? I would like to start playing around with it as well. Right now it seems to just remove or replace the background and let you change the color of clothes and so on, but it keeps the person in the same pose, which limits its use quite a bit.

Will there be any updates that allow completely changing the person's pose and what they're doing? For example, I load a headshot or other image of someone and then get a picture of them riding a horse or doing anything else I describe, like what I'm doing now with ChatGPT, which works very well. I'd love to be able to do this on my local system in ComfyUI. Also, can the images come out larger? On the test site, the largest I've been able to get is 512 by 768. With ChatGPT, I'm getting outputs of 1536 by 1024, and with a little upscaling I could basically print posters from any of the images.

Goactivemedia avatar May 04 '25 19:05 Goactivemedia

Can this be run without using Flux 1 so it can be used commercially? Otherwise, we can't use it for anything within our business.

Goactivemedia avatar May 04 '25 19:05 Goactivemedia

The current model, due to its MoE structure, may experience a slight loss in quality when used directly in ComfyUI. Perhaps the "normal LoRA" I posted above could help, but it still seems to have a small gap compared to the MoE LoRA. Moreover, there appears to be no way to avoid image blurring when using the current model in ComfyUI, and it also doesn't perform very well for anime-style transfers. Hopefully, future stronger models will be able to address these issues.

As for your idea, I think what you're looking for are functionalities like those found in InfiniteYou, UNO, InstantCharacter, Ace plus, etc., which offer excellent consistency. ICEdit, being designed for editing purposes, only modifies specific areas of an image rather than generating entirely new ones.

Regarding commercial use, my workflow is completely usable. Unfortunately, though, this LoRA likely needs to be used together with the fill model, and Black Forest Labs has restricted its commercial usage.

judian17 avatar May 05 '25 01:05 judian17

What's the minimum VRAM for this? Hopefully it's less than the 24 GB required by the standalone version.

yuesam avatar May 05 '25 04:05 yuesam