ldtgodlike
I think `max_steps` is too small. This parameter counts steps, not epochs, so with that value training won't even get through one pass over the data.
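A quick back-of-the-envelope check of the point above (the numbers here are hypothetical, not taken from the issue): `max_steps` counts optimizer steps, so one full pass over the dataset needs roughly `num_samples / (batch_size * grad_accum)` steps.

```python
import math

# Hypothetical values for illustration only
num_samples = 1000   # size of the instance dataset
batch_size = 1       # --train_batch_size
grad_accum = 4       # --gradient_accumulation_steps

# Optimizer steps needed to see every sample once (one "epoch")
steps_per_epoch = math.ceil(num_samples / (batch_size * grad_accum))
print(steps_per_epoch)  # 250 -- any max_steps below this never covers the data
```

If `max_steps` is well below this number, the run ends before the model has seen the whole dataset once.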
@hjing100 I also ran into the loss being 0, and inference additionally raised `RuntimeError: probability tensor contains either `inf`, `nan` or element < 0`. Some issues suggest this is caused by a mismatch between the code version and the weight version. Have you solved it?
> Refer to [huggingface/accelerate#2787](https://github.com/huggingface/accelerate/issues/2787) to get an idea of the adjustments needed to make it work. Replacing `if isinstance(model, type(unwrap_model(transformer)))` with `if isinstance(unwrap_model(model), type(unwrap_model(transformer)))` makes it possible to save the checkpoint. However, this...
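To illustrate why that one-line change matters, here is a self-contained sketch (the `Wrapper`/`Transformer` classes below are stand-ins for the DeepSpeed engine and the FLUX transformer, not real diffusers classes): under DeepSpeed the `model` handed to the save hook is still wrapped, so an `isinstance` check against the unwrapped transformer class fails and the checkpoint is skipped; unwrapping `model` first restores the match.

```python
class Wrapper:
    """Stand-in for a DeepSpeed/accelerate wrapper around a module."""
    def __init__(self, module):
        self.module = module

class Transformer:
    """Stand-in for the actual transformer model class."""
    pass

def unwrap_model(m):
    # Mimics accelerate's unwrapping: peel wrapper layers to reach the real module
    while hasattr(m, "module"):
        m = m.module
    return m

transformer = Wrapper(Transformer())
model = Wrapper(Transformer())  # what the save hook receives under DeepSpeed

# Original check: False, because `model` is still wrapped
print(isinstance(model, type(unwrap_model(transformer))))
# Fixed check: True, because both sides are unwrapped before comparing
print(isinstance(unwrap_model(model), type(unwrap_model(transformer))))
```

With the original check the hook's body never runs under DeepSpeed, which matches the "checkpoint not saved" symptom in the linked issue.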
It seems that `--train_text_encoder` is not covered in: [diffusers/examples/dreambooth/test_dreambooth_flux.py](https://github.com/huggingface/diffusers/blob/b9e2f886cd6e9182f1bf1bf7421c6363956f94c5/examples/dreambooth/test_dreambooth_flux.py#L65). My script is as follows:
```
accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --instance_prompt="bedroom, YF_CN style" \
  ...
```
> Okay so, it fails for `train_text_encoder` or does it fail without `train_text_encoder` as well?

It only fails with `train_text_encoder`.
Errors caused by accelerate and deepspeed, like:
> Okay that is helpful. > > The error you posted in [#9393 (comment)](https://github.com/huggingface/diffusers/issues/9393#issuecomment-2342750651), seems easy to solve. We should just filter out the "module" keys in the state dict...
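For reference, here is what "filter out the `module` keys in the state dict" could look like in practice. This is a hedged sketch, not the actual fix from the PR: the helper name `strip_module_prefix` and the sample state dict are made up. DeepSpeed-saved checkpoints prefix every parameter name with `module.`, so stripping that prefix restores the keys the unwrapped model expects.

```python
def strip_module_prefix(state_dict):
    """Remove the 'module.' prefix DeepSpeed adds to parameter names."""
    prefix = "module."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Toy example: keys as a DeepSpeed-wrapped model would save them
sd = {"module.proj.weight": 1, "module.proj.bias": 2, "alpha": 3}
print(strip_module_prefix(sd))
# {'proj.weight': 1, 'proj.bias': 2, 'alpha': 3}
```

The cleaned dict can then be passed to the unwrapped model's `load_state_dict` without key-mismatch errors.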