DiffSynth-Studio
Enjoy the magic of Diffusion models!
Hi, a question: during inference, the Qwen-Image 2509 model scales images to 1024×1024 for the VAE by default. When fine-tuning a LoRA, input images come in varying sizes, yet the VAE input is still scaled to 1024. Would it be more appropriate to scale to a resolution close to the input image instead?
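The alternative the question proposes (keeping the training resolution near the input size rather than forcing 1024×1024) could be sketched as follows. This is a minimal illustration, not DiffSynth-Studio's actual resizing logic; the rounding multiple of 32 is an assumption about what the VAE can accept.

```python
def nearest_valid_size(width: int, height: int, multiple: int = 32) -> tuple[int, int]:
    # Round each side to the nearest multiple the VAE can handle,
    # instead of resizing every image to a fixed 1024x1024.
    round_to = lambda v: max(multiple, round(v / multiple) * multiple)
    return round_to(width), round_to(height)

print(nearest_valid_size(900, 1350))  # (896, 1344)
```

This keeps the aspect ratio roughly intact during LoRA training, at the cost of variable sequence lengths per batch.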
There are two main errors: 1. the number of return values does not match: `patchify` returns only `x`, while the caller expects `x, grid_size` (containing f, h, w); 2. the shape of the `x` returned by `patchify` is incorrect.
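The mismatch described above can be illustrated with a minimal, hypothetical sketch (function name aside, the shapes and patch size here are assumptions, not the repository's actual code): `patchify` must return the grid size alongside the token tensor so the caller's `x, grid_size = patchify(...)` unpacking works.

```python
import torch

def patchify(x: torch.Tensor, patch_size: int = 2):
    # x: (B, C, F, H, W) video latent; split spatial dims into patches.
    b, c, f, h, w = x.shape
    gf, gh, gw = f, h // patch_size, w // patch_size
    x = x.reshape(b, c, gf, gh, patch_size, gw, patch_size)
    # -> (B, gf*gh*gw, C*patch_size*patch_size) token sequence
    x = x.permute(0, 2, 3, 5, 1, 4, 6).reshape(b, gf * gh * gw, c * patch_size ** 2)
    # Returning the grid size alongside x lets the caller unpatchify later.
    return x, (gf, gh, gw)

x = torch.randn(1, 16, 4, 8, 8)
tokens, (gf, gh, gw) = patchify(x)
print(tokens.shape)  # torch.Size([1, 64, 64])
```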
As the title asks: is there any way to view this?
I trained with the default configuration for Flux Kontext full-parameter training (2× H20), and an OOM occurred less than a minute after training started....
This approach downloads the model first; it cannot be used with a model that has already been downloaded to a local path:

    pipe = QwenImagePipeline.from_pretrained(
        torch_dtype=torch.bfloat16,
        device="cuda",
        model_configs=[
            ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
            ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
            ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
        ],
        tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
    )

`ModelConfig(path="models/xxx.safetensors")` is provided, but `path` loads only a single file at a time and cannot load a whole directory. `local_model_path` is also provided, but using it raises an error. What is the correct, complete code for loading a local model?
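One possible workaround is sketched below, assuming `path` also accepts a list of shard files so multi-file checkpoints can be loaded locally. Both that assumption and the local paths are placeholders to verify against the installed DiffSynth-Studio version, not confirmed API behaviour.

```python
import torch
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig

# Sketch: point each ModelConfig at local files instead of a model_id.
# Whether `path` accepts a list of shards is an assumption to verify;
# the directory layout below is a placeholder.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(path=[
            "models/Qwen-Image/transformer/diffusion_pytorch_model-00001.safetensors",
            # ...list every remaining transformer shard here...
        ]),
        ModelConfig(path=[
            "models/Qwen-Image/text_encoder/model-00001.safetensors",
            # ...list every remaining text-encoder shard here...
        ]),
        ModelConfig(path="models/Qwen-Image/vae/diffusion_pytorch_model.safetensors"),
    ],
    tokenizer_config=ModelConfig(path="models/Qwen-Image/tokenizer/"),
)
```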
DiffSynth is great. I am trying to train Qwen-Image-Edit-2509. I have a dataset of 5.4k examples. I'm curious if full parameter fine-tuning could be feasible with this dataset size, and...
For example, if I want to modify a character's eyelids, will the background, hairstyle, and so on change as well?