sd-scripts
Is there any guide for training a Flux Fill LoRA, or fine-tuning Flux Fill?
It's not exactly an answer to your question, but I've noticed that when I use Flux Fill in ComfyUI, I can apply the LoRA I trained with sd-scripts to it, and it worked okay. I'm guessing Flux Fill is similar enough to Flux Dev that LoRAs generally work for both of them.
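For anyone who wants to try the same thing outside ComfyUI, a minimal diffusers sketch would look like the following. The LoRA filename is a placeholder, and it assumes the file is in a key format diffusers' Flux LoRA loader can read; the Fill repo is gated, so you need to be logged in to the Hub.

```python
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

# "my_flux_dev_lora.safetensors" is a placeholder for a LoRA trained against
# Flux Dev (e.g. with sd-scripts), loaded here into the Fill pipeline.
pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("my_flux_dev_lora.safetensors")

image = load_image("photo.png")  # placeholder input image
mask = load_image("mask.png")    # white = region to repaint

result = pipe(
    prompt="clean background, no text",
    image=image,
    mask_image=mask,
    num_inference_steps=30,
    guidance_scale=30.0,
).images[0]
result.save("out.png")
```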
Thanks, I want to train a LoRA only to improve text removal.
Are you sure? What kind of LoRA are you using, a character/style LoRA or something else? Flux Fill is a full finetune of Dev and has a different input embedding/input linear layer. It's possible that a Dev LoRA somehow works for Fill as well, but it would be better if we developed a way to train LoRAs with Fill directly.
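One way to see the input-layer difference is to compare the shape of the input projection in the two BFL checkpoints. A rough sketch, assuming the usual file names and the img_in.weight key used in the BFL checkpoint format (both repos are gated):

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Assumed file names for the BFL-format checkpoints.
checkpoints = {
    "dev": ("black-forest-labs/FLUX.1-dev", "flux1-dev.safetensors"),
    "fill": ("black-forest-labs/FLUX.1-Fill-dev", "flux1-fill-dev.safetensors"),
}

for name, (repo, filename) in checkpoints.items():
    path = hf_hub_download(repo, filename)
    with safe_open(path, framework="pt") as f:
        # img_in.weight is the input linear layer in BFL's key naming.
        print(name, f.get_slice("img_in.weight").get_shape())
```

If the checkpoints are laid out as expected, Fill's input layer takes 384 input channels versus Dev's 64, the extra channels being the masked-image/mask conditioning.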
I have the same question. Did you solve it? @CaiNiaoCYY
I saw this issue and also wanted to understand the differences between the Flux, Flux Fill, and Flux Kontext models, so I dove into the diffusers code.
The short answer is that Flux, Flux Fill, and Flux Kontext all have the same model structure, but there are differences in the denoising process, which means the training code has to be different.
Flux, Flux Fill, and Flux Kontext all use FluxTransformer2DModel:
https://github.com/huggingface/diffusers/blob/3d2f8ae99b88c50a340dd879ec7f44981ffd12fe/src/diffusers/pipelines/flux/pipeline_flux.py#L159
https://github.com/huggingface/diffusers/blob/3d2f8ae99b88c50a340dd879ec7f44981ffd12fe/src/diffusers/pipelines/flux/pipeline_flux_fill.py#L179
https://github.com/huggingface/diffusers/blob/3d2f8ae99b88c50a340dd879ec7f44981ffd12fe/src/diffusers/pipelines/flux/pipeline_flux_kontext.py#L190
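A quick way to confirm this without downloading the weights is to load just the transformer configs; all three repos resolve to FluxTransformer2DModel, and the structural difference shows up only in in_channels (a sketch, assuming the official gated repos):

```python
from diffusers import FluxTransformer2DModel

# All three official repos ship a FluxTransformer2DModel; only the config differs.
for repo in (
    "black-forest-labs/FLUX.1-dev",
    "black-forest-labs/FLUX.1-Fill-dev",
    "black-forest-labs/FLUX.1-Kontext-dev",
):
    cfg = FluxTransformer2DModel.load_config(repo, subfolder="transformer")
    print(repo, "in_channels =", cfg["in_channels"])
```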
At inference time, the Fill pipeline concatenates masked_image_latents onto the transformer input along the channel dimension, while the Kontext pipeline concatenates image_latents onto the latents along the sequence dimension:
https://github.com/huggingface/diffusers/blob/3d2f8ae99b88c50a340dd879ec7f44981ffd12fe/src/diffusers/pipelines/flux/pipeline_flux.py#L919
https://github.com/huggingface/diffusers/blob/3d2f8ae99b88c50a340dd879ec7f44981ffd12fe/src/diffusers/pipelines/flux/pipeline_flux_fill.py#L984
https://github.com/huggingface/diffusers/blob/3d2f8ae99b88c50a340dd879ec7f44981ffd12fe/src/diffusers/pipelines/flux/pipeline_flux_kontext.py#L1060
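So the per-step transformer input differs only in how the conditioning is attached. A shape-level sketch (illustrative sizes, following the packed-latent layout used in the linked pipelines):

```python
import torch

B, L, L_ref = 1, 4096, 4096                    # batch, latent tokens, reference tokens

latents = torch.randn(B, L, 64)                # packed noisy latents
masked_image_latents = torch.randn(B, L, 320)  # masked image + packed mask (Fill condition)
image_latents = torch.randn(B, L_ref, 64)      # packed reference latents (Kontext condition)

# Flux dev: the transformer only sees the noisy latents.
dev_input = latents                                              # (B, L, 64)

# Flux Fill: the condition is concatenated on the channel dimension,
# which is why the Fill transformer expects 384 input channels.
fill_input = torch.cat((latents, masked_image_latents), dim=2)   # (B, L, 384)

# Flux Kontext: reference latents are appended on the sequence dimension,
# so the channel count stays 64 but the token sequence gets longer.
kontext_input = torch.cat([latents, image_latents], dim=1)       # (B, L + L_ref, 64)

print(dev_input.shape, fill_input.shape, kontext_input.shape)
```

This is the part a Fill training script would have to reproduce: the dataset needs to provide the mask and masked-image latents so the 384-channel input can be built during training.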