KeyError: 'pixel_values' in Qwen-Image-Edit-2509

Open xilai0715 opened this issue 3 months ago • 29 comments

Thank you very much for your work. I encountered this issue while training Qwen-Image-Edit-2509.

Image

Initially, I suspected a problem with the dataset, but the same dataset works for training on Kontext. I also tried using the same dataset for both the control dataset and the target dataset to rule out any mismatch, but the same error still occurred. Looking forward to your reply.

xilai0715 avatar Sep 28 '25 03:09 xilai0715

I ran into the same issue for training a Qwen-Image-Edit-2509 LoRA.

My dataset has only one set of control images (one control image per target image). I used the same dataset to successfully train a Qwen-Image-Edit LoRA.

AI-Toolkit is updated to the latest state of the main branch, and all dependencies have been updated.

...
1248x1728: 777 files
1728x1248: 117 files
1248x1824: 301 files
1824x1248: 111 files
4 buckets made
Generating baseline samples before training
Error running job:

========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "D:\ai\LoRa-Trainer\ai-toolkit\venv\Lib\site-packages\transformers\feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
           ~~~~~~~~~^^^^^^
KeyError: 'pixel_values'

JanGerritsen avatar Sep 28 '25 15:09 JanGerritsen

Thanks! The PR resolves the issue.

codexq123 avatar Sep 28 '25 15:09 codexq123

I added https://github.com/ojasaar/ai-toolkit.git as a second remote repository and did a checkout of fix-qwen-multi-control

This time the job continued, and the initial set of the sample images were created, but when the actual training started the job broke with the message: Error running job: tuple index out of range

1248x1728: 777 files
1728x1248: 117 files
1248x1824: 301 files
1824x1248: 111 files
4 buckets made
Generating baseline samples before training
Qwen-Image-Edit-2509-LoRA-v1:   0%|                                                                                                                                                                             | 0/10000 [00:00<?, ?it/s]Error running job: tuple index out of range

========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "D:\ai\LoRa-Trainer\ai-toolkit\run.py", line 120, in <module>
    main()
  File "D:\ai\LoRa-Trainer\ai-toolkit\run.py", line 108, in main
    raise e
  File "D:\ai\LoRa-Trainer\ai-toolkit\run.py", line 96, in main
    job.run()
  File "D:\ai\LoRa-Trainer\ai-toolkit\jobs\ExtensionJob.py", line 22, in run
    process.run()
  File "D:\ai\LoRa-Trainer\ai-toolkit\jobs\process\BaseSDTrainProcess.py", line 2157, in run
    loss_dict = self.hook_train_loop(batch_list)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\extensions_built_in\sd_trainer\SDTrainer.py", line 2039, in hook_train_loop
    loss = self.train_single_accumulation(batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\extensions_built_in\sd_trainer\SDTrainer.py", line 1565, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
                         ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\toolkit\models\base_model.py", line 1049, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\extensions_built_in\diffusion_models\qwen_image\qwen_image_edit_plus.py", line 168, in get_prompt_embeds
    ratio = control_images[i].shape[2] / control_images[i].shape[3]
                                         ~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: tuple index out of range
Qwen-Image-Edit-2509-LoRA-v1:   0%|  

I don't know if the new error is related.
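A note on the traceback above: `shape[3]` only exists on an object with at least four dimensions, so this `IndexError` suggests that whatever reached `get_prompt_embeds` as a control image had fewer than four (for example an empty placeholder where a control image was expected). Below is a minimal sketch of a guard that turns this into a readable error. It assumes a (batch, channels, height, width) layout; `FakeTensor` and `control_aspect_ratio` are hypothetical names used only for illustration and are not part of ai-toolkit.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; only the .shape attribute matters here."""
    shape: tuple


def control_aspect_ratio(control, index=0):
    """Return shape[2] / shape[3], as in the failing line, after an explicit check.

    Assumes a 4-D (batch, channels, height, width) layout.
    """
    if len(control.shape) < 4:
        raise ValueError(
            f"control image {index} has shape {control.shape}; expected 4 dimensions. "
            "Is a control image missing for this item?"
        )
    return control.shape[2] / control.shape[3]


# A well-formed control image from the 1248x1728 bucket.
print(control_aspect_ratio(FakeTensor(shape=(1, 3, 1728, 1248))))

# A malformed placeholder now fails with an explanatory message.
try:
    control_aspect_ratio(FakeTensor(shape=(0,)))
except ValueError as err:
    print(err)
```

The check does not fix the underlying cause; it only reports which control entry is malformed instead of failing on the index lookup.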

JanGerritsen avatar Sep 28 '25 19:09 JanGerritsen

I ran into the same problem. How should I fix it?

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 1996, in run
    self.sample(self.step_num)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/DiffusionTrainer.py", line 288, in sample
    super().sample(step, is_first)
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 369, in sample
    self.sd.generate_images(gen_img_config_list, sampler=sample_config.sampler)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 521, in generate_images
    conditional_embeds = self.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1029, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 179, in get_prompt_embeds
    prompt_embeds, prompt_embeds_mask = self.pipeline.encode_prompt(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 318, in encode_prompt
    prompt_embeds, prompt_embeds_mask = self._get_qwen_prompt_embeds(prompt, image, device)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 265, in _get_qwen_prompt_embeds
    pixel_values=model_inputs.pixel_values,
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 94, in __getattr__
    raise AttributeError
AttributeError

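GPT's reply, in summary:

Root cause: ai-toolkit cannot get pixel_values when it calls the Qwen-Image pipeline during the baseline sampling stage.

The two most likely causes:

The input data (image) was not passed through the Qwen-Image processor first;

The transformers/diffusers versions you have installed do not match what ai-toolkit expects, so the field names do not line up.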
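The two tracebacks describe one failure seen at two levels: the `__getattr__` frame at line 92 of `feature_extraction_utils.py` looks the attribute up in an internal dict, and line 94 re-raises the resulting `KeyError` as an `AttributeError`. Below is a minimal sketch of that lookup pattern. `DictBackedOutput` and `fake_processor` are hypothetical stand-ins, not the real transformers classes, and the sketch assumes the processor output only carries `pixel_values` when an image was actually passed in.

```python
class DictBackedOutput:
    """Hypothetical stand-in mimicking the lookup pattern shown in the traceback."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        # Only called when normal attribute lookup fails.
        try:
            return self.data[item]
        except KeyError:
            raise AttributeError(item)


def fake_processor(text, images=None):
    """Hypothetical processor: adds pixel_values only if images were supplied."""
    data = {"input_ids": [len(word) for word in text.split()]}
    if images:
        data["pixel_values"] = list(images)
    return DictBackedOutput(data)


with_image = fake_processor("make it blue", images=["control.jpg"])
print(with_image.pixel_values)

without_image = fake_processor("make it blue")
try:
    without_image.pixel_values
except AttributeError as err:
    print("missing attribute:", err)
```

Under that assumption, a sample or training item without a control image would produce exactly this pair of exceptions, which fits the first cause listed above better than a version mismatch.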

hero8152 avatar Sep 30 '25 05:09 hero8152

I added https://github.com/ojasaar/ai-toolkit.git as a second remote repository and did a checkout of fix-qwen-multi-control

This time the job continued, and the initial set of the sample images were created, but when the actual training started the job broke with the message: Error running job: tuple index out of range

I don't know if the new error is related.

I think it's a bug. You should make sure that none of the three control datasets is empty, otherwise this bug will occur. I used the same dataset as input for all three control datasets, like this:

Image

It worked, but I'm not sure whether this will affect the final training result. I'm testing that now.

xilai0715 avatar Sep 30 '25 06:09 xilai0715

It worked, but I'm not sure if this will affect the final training effect, I'm trying.

Thank you. My guess would be that the model sees each control-image three times in each step, which would influence the LoRA.

For reference, there is another issue where this bug has been reported #440

JanGerritsen avatar Sep 30 '25 06:09 JanGerritsen

I added https://github.com/ojasaar/ai-toolkit.git as a second remote repository and did a checkout of fix-qwen-multi-control This time the job continued, and the initial set of the sample images were created, but when the actual training started the job broke with the message: Error running job: tuple index out of range

I don't know if the new error is related.

I think it's a bug. You should make sure that none of the three control datasets is empty, otherwise this bug will occur. I used the same dataset as input for all three control datasets, like this:

Image

It worked, but I'm not sure whether this will affect the final training result. I'm testing that now.

Image

hero8152 avatar Sep 30 '25 09:09 hero8152

I added https://github.com/ojasaar/ai-toolkit.git as a second remote repository and did a checkout of fix-qwen-multi-control This time the job continued, and the initial set of the sample images were created, but when the actual training started the job broke with the message: Error running job: tuple index out of range

I don't know if the new error is related.

I think it's a bug. You should make sure that none of the three control datasets is empty, otherwise this bug will occur. I used the same dataset as input for all three control datasets, like this:

Image

It worked, but I'm not sure whether this will affect the final training result. I'm testing that now.

I tested it and still got an error.

hero8152 avatar Sep 30 '25 12:09 hero8152

Same error here. All datasets have matching file names, ai-toolkit is updated, and the requirements are in place.

lucusmax avatar Oct 01 '25 13:10 lucusmax

ostris commented on this in his Discord and said:

"This will happen if there is not a control image. Either in your dataset there is not a matching control image, or not one for sampling"

I then messaged him and asked: "Seeing this as a solution to the 'pixel_values' error, do we have to have all 3 input dataset options selected, or will we get an error? I only have 2 inputs but am still getting this error."

He replied: "You only need one dataset. If you see this error, it means it could not find the matching control images. Keep in mind that it cannot find them in subfolders. I have ran into that. Or a single image is missing a control, that can also cause it. I’ll try to add a better error."

However, when I double-checked my datasets, all my images are named correctly and identically across folders, and my folder structure is just basic folders with images in them in the dataset section, so I'm still not sure what's going on.

Tristan-mc-q avatar Oct 02 '25 03:10 Tristan-mc-q

I double checked.

~/ai/LoRa-Trainer/ai-toolkit/datasets/target # for item in *.jpg ; do test -f ../src/"$item" || echo nope "$item"; done
~/ai/LoRa-Trainer/ai-toolkit/datasets/target # cd ../src
~/ai/LoRa-Trainer/ai-toolkit/datasets/src # for item in *.jpg ; do test -f ../target/"$item" || echo nope "$item"; done
~/ai/LoRa-Trainer/ai-toolkit/datasets/src #

All src and target images match up.

The only other files are the ".txt" files for the captions and the ".aitk_size.json" file in the target directory.
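The shell loops above only compare `.jpg` files with identical names. Below is a slightly broader sketch of the same check in Python, covering the two conditions ostris mentioned: control images inside subfolders are not found, and a single target without a control is enough to trigger the error. It assumes that pairs are matched by file stem and that the listed extensions are the relevant ones; both are assumptions for illustration, not confirmed ai-toolkit behavior, and `unmatched_images` is a hypothetical helper.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def image_stems(folder):
    """Map file stem -> file name for images directly inside folder (no subfolders)."""
    return {
        p.stem: p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }


def unmatched_images(target_dir, control_dir):
    """Return (targets without a control, controls without a target), sorted."""
    targets = image_stems(target_dir)
    controls = image_stems(control_dir)
    missing_control = sorted(targets[s] for s in targets.keys() - controls.keys())
    missing_target = sorted(controls[s] for s in controls.keys() - targets.keys())
    return missing_control, missing_target


if __name__ == "__main__":
    import sys

    # Usage: python check_pairs.py <target_dir> <control_dir>
    if len(sys.argv) == 3:
        no_control, no_target = unmatched_images(sys.argv[1], sys.argv[2])
        print("targets without control:", no_control)
        print("controls without target:", no_target)
```

Two empty lists mean every top-level image has a partner, which is the situation reported here, so in this case the missing-file explanation would not account for the error.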

I still get the same error:

Error running job: tuple index out of range

========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "D:\ai\LoRa-Trainer\ai-toolkit\run.py", line 120, in <module>
    main()
  File "D:\ai\LoRa-Trainer\ai-toolkit\run.py", line 108, in main
    raise e
  File "D:\ai\LoRa-Trainer\ai-toolkit\run.py", line 96, in main
    job.run()
  File "D:\ai\LoRa-Trainer\ai-toolkit\jobs\ExtensionJob.py", line 22, in run
    process.run()
  File "D:\ai\LoRa-Trainer\ai-toolkit\jobs\process\BaseSDTrainProcess.py", line 2157, in run
    loss_dict = self.hook_train_loop(batch_list)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\extensions_built_in\sd_trainer\SDTrainer.py", line 2039, in hook_train_loop
    loss = self.train_single_accumulation(batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\extensions_built_in\sd_trainer\SDTrainer.py", line 1565, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
                         ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\toolkit\models\base_model.py", line 1049, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ai\LoRa-Trainer\ai-toolkit\extensions_built_in\diffusion_models\qwen_image\qwen_image_edit_plus.py", line 168, in get_prompt_embeds
    ratio = control_images[i].shape[2] / control_images[i].shape[3]
                                         ~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: tuple index out of range

JanGerritsen avatar Oct 02 '25 10:10 JanGerritsen

I got it to work by remembering to add control images to the sample prompts; I had forgotten about that. Adding one control image to each of the sample prompts allowed the job to go through. It is running now and seems to be training well. Thanks for all your help.
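For anyone who wants to check their own job for this before starting it, here is a minimal sketch that lists sample prompts without a control image. It assumes the job config is available as a dict with the `sample` → `samples` layout and the `ctrl_img_1` key seen in the job config posted elsewhere in this thread; `samples_missing_control` is a hypothetical helper, not an ai-toolkit function.

```python
import os


def samples_missing_control(job_config, key="ctrl_img_1", check_exists=False):
    """Return indices of sample prompts that have no usable control image.

    With check_exists=True, a path that does not point to a file also counts.
    """
    samples = job_config.get("sample", {}).get("samples", [])
    problems = []
    for index, sample in enumerate(samples):
        path = sample.get(key)
        if not path or (check_exists and not os.path.isfile(path)):
            problems.append(index)
    return problems


# Illustrative config: the first prompt has no control image.
config = {
    "sample": {
        "samples": [
            {"prompt": "a prompt without a control image"},
            {"prompt": "a prompt with one", "ctrl_img_1": "/path/to/control.jpg"},
        ]
    }
}
print(samples_missing_control(config))
```

An empty result only says that every sample prompt names a control image; it says nothing about the training dataset pairs themselves.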

lucusmax avatar Oct 02 '25 16:10 lucusmax

Well, that's interesting. I'm personally still having errors. I even tried removing all samples and disabling sampling completely. @lucusmax, how many input control images were you using in your training?

Tristan-mc-q avatar Oct 02 '25 18:10 Tristan-mc-q

Same here, I tested all options. I hope there is an update on the way.

Image Image Image

Running 1 job
{
  "type": "diffusion_trainer",
  "training_folder": "/app/ai-toolkit/output",
  "sqlite_db_path": "/app/ai-toolkit/aitk_db.db",
  "device": "cuda",
  "trigger_word": null,
  "performance_log_every": 10,
  "network": {
    "type": "lora",
    "linear": 32,
    "linear_alpha": 32,
    "conv": 16,
    "conv_alpha": 16,
    "lokr_full_rank": true,
    "lokr_factor": -1,
    "network_kwargs": {
      "ignore_if_contains": []
    }
  },
  "save": {
    "dtype": "bf16",
    "save_every": 250,
    "max_step_saves_to_keep": 4,
    "save_format": "diffusers",
    "push_to_hub": false
  },
  "datasets": [
    {
      "folder_path": "/app/ai-toolkit/datasets/control",
      "mask_path": null,
      "mask_min_value": 0.1,
      "default_caption": "",
      "caption_ext": "txt",
      "caption_dropout_rate": 0.05,
      "cache_latents_to_disk": false,
      "is_reg": false,
      "network_weight": 1,
      "resolution": [ 512, 768, 1024 ],
      "controls": [],
      "shrink_video_to_frames": true,
      "num_frames": 1,
      "do_i2v": true,
      "flip_x": false,
      "flip_y": false,
      "control_path_1": "/app/ai-toolkit/datasets/dataset",
      "control_path_2": null,
      "control_path_3": null
    }
  ],
  "train": {
    "batch_size": 1,
    "bypass_guidance_embedding": false,
    "steps": 4000,
    "gradient_accumulation": 1,
    "train_unet": true,
    "train_text_encoder": false,
    "gradient_checkpointing": true,
    "noise_scheduler": "flowmatch",
    "optimizer": "adamw8bit",
    "timestep_type": "weighted",
    "content_or_style": "balanced",
    "optimizer_params": {
      "weight_decay": 0.0001
    },
    "unload_text_encoder": false,
    "cache_text_embeddings": false,
    "lr": 0.0001,
    "ema_config": {
      "use_ema": false,
      "ema_decay": 0.99
    },
    "skip_first_sample": false,
    "force_first_sample": false,
    "disable_sampling": false,
    "dtype": "bf16",
    "diff_output_preservation": false,
    "diff_output_preservation_multiplier": 1,
    "diff_output_preservation_class": "person",
    "switch_boundary_every": 1,
    "loss_type": "mse"
  },
  "model": {
    "name_or_path": "Qwen/Qwen-Image-Edit-2509",
    "quantize": true,
    "qtype": "qfloat8",
    "quantize_te": true,
    "qtype_te": "qfloat8",
    "arch": "qwen_image_edit_plus",
    "low_vram": false,
    "model_kwargs": {}
  },
  "sample": {
    "sampler": "flowmatch",
    "sample_every": 250,
    "width": 1024,
    "height": 1024,
    "samples": [
      {
        "prompt": "Blondyvisibelv01, blonde wavy hair, standing on sand dune, desert sunrise, golden hour lighting, rippled sand patterns, vast horizon, warm desert tones, flowing fabric, dramatic shadows, expansive landscape, adventure styling",
        "ctrl_img_1": "/app/ai-toolkit/data/images/0dbc53af-e965-4319-b04b-dae48d4cb0ce.jpg",
        "seed": 123,
        "network_multiplier": "1.0"
      },
      {
        "prompt": "Blondyvisibelv01, blonde wavy hair, standing at kitchen island, modern minimalist kitchen, bright even lighting, marble countertops, sleek appliances, contemporary design, casual chic outfit, clean lines, functional elegance",
        "ctrl_img_1": "/app/ai-toolkit/data/images/1a0eb221-ce00-47b4-b45b-d2795e1bbb71.jpg",
        "seed": 123,
        "network_multiplier": "1.0"
      },
      {
        "prompt": "Blondyvisibelv01, blonde wavy hair, sitting on luxurious bed, ornate bedroom, warm interior lighting, crystal chandelier, damask wallpaper, silk bedding, elegant nightwear, intimate setting, golden accents, plush textures",
        "ctrl_img_1": "/app/ai-toolkit/data/images/1bafede2-f39a-4071-9cda-34ddb4f7f5a4.jpg",
        "seed": 123,
        "network_multiplier": "1.0"
      },
      {
        "prompt": "Blondyvisibelv01, blonde wavy hair, elegant pose near ornate fountain, manicured palace gardens, classical architecture, daylight, formal garden setting, sophisticated dress, graceful stance, baroque fountain, hedge maze background, refined atmosphere",
        "ctrl_img_1": "/app/ai-toolkit/data/images/062a1aca-1d0c-4086-8373-d82a42dda927.jpg",
        "seed": 123,
        "network_multiplier": "1.0"
      },
      {
        "prompt": "Blondyvisibelv01, blonde wavy hair, confident runway walk, bright spotlights, fashion show setting, elevated catwalk, dramatic lighting, professional pose, high fashion outfit, model stance, studio lighting, sleek modern backdrop",
        "ctrl_img_1": "/app/ai-toolkit/data/images/88b6f45a-2824-4e4a-a403-7e9efeaae56e.jpg",
        "seed": 123,
        "network_multiplier": "1.0"
      },
      {
        "prompt": "Blondyvisibelv01, blonde wavy hair, walking along beach shoreline, sunset lighting, golden hour, ocean waves, wet sand, peaceful expression, flowing summer dress, bare feet, warm orange and pink sky, serene coastal atmosphere, natural lighting",
        "ctrl_img_1": "/app/ai-toolkit/data/images/e0067bff-ffab-43ed-9baa-04d05f6a1357.jpg",
        "seed": 123,
        "network_multiplier": "1.0"
      }
    ],
    "neg": "",
    "seed": 42,
    "walk_seed": true,
    "guidance_scale": 4,
    "sample_steps": 25,
    "num_frames": 1,
    "fps": 1
  }
}
Using SQLite database at /app/ai-toolkit/aitk_db.db
Job ID: "965d0607-ac7c-41cc-9b88-583877d1e100"
#############################################

Running job: Blondy-Qwen2509-V01

#############################################
Running 1 process
Loading Qwen Image model
Loading transformer
config.json: 100%|##########| 339/339 [00:00<00:00, 3.08MB/s]
(…)ion_pytorch_model.safetensors.index.json: 199kB [00:00, 339MB/s]
transformer/diffusion_pytorch_model-0000(…): 100%|##########| 9.97G/9.97G [00:04<00:00, 2.23GB/s]
transformer/diffusion_pytorch_model-0000(…): 100%|##########| 9.99G/9.99G [00:04<00:00, 2.21GB/s]
transformer/diffusion_pytorch_model-0000(…): 100%|##########| 9.99G/9.99G [00:05<00:00, 1.94GB/s]
transformer/diffusion_pytorch_model-0000(…): 100%|##########| 9.93G/9.93G [00:04<00:00, 2.14GB/s]
transformer/diffusion_pytorch_model-0000(…): 100%|##########| 982M/982M [00:00<00:00, 1.03GB/s]
Loading checkpoint shards: 100%|##########| 5/5 [00:00<00:00, 42.61it/s]
Quantizing Transformer
 - quantizing 60 transformer blocks
100%|##########| 60/60 [00:22<00:00, 2.64it/s]
 - quantizing extras
Text Encoder
tokenizer_config.json: 4.69kB [00:00, 30.6MB/s]
vocab.json: 3.38MB [00:00, 226MB/s]
merges.txt: 1.67MB [00:00, 203MB/s]
added_tokens.json: 100%|##########| 605/605 [00:00<00:00, 9.80MB/s]
special_tokens_map.json: 100%|##########| 613/613 [00:00<00:00, 6.99MB/s]
chat_template.jinja: 2.43kB [00:00, 20.4MB/s]
config.json: 3.22kB [00:00, 23.8MB/s]
model.safetensors.index.json: 57.7kB [00:00, 240MB/s]
text_encoder/model-00001-of-00004.safete(…): 100%|##########| 4.97G/4.97G [00:02<00:00, 2.03GB/s]
text_encoder/model-00002-of-00004.safete(…): 100%|##########| 4.99G/4.99G [00:02<00:00, 1.87GB/s]
text_encoder/model-00003-of-00004.safete(…): 100%|##########| 4.93G/4.93G [00:02<00:00, 1.92GB/s]
text_encoder/model-00004-of-00004.safete(…): 100%|##########| 1.69G/1.69G [00:01<00:00, 1.42GB/s]
Loading checkpoint shards: 100%|##########| 4/4 [00:00<00:00, 43.42it/s]
generation_config.json: 100%|##########| 244/244 [00:00<00:00, 3.66MB/s]
Quantizing Text Encoder
Loading VAE
config.json: 100%|##########| 730/730 [00:00<00:00, 7.50MB/s]
vae/diffusion_pytorch_model.safetensors: 100%|##########| 254M/254M [00:00<00:00, 381MB/s]
Making pipe
preprocessor_config.json: 100%|##########| 788/788 [00:00<00:00, 7.89MB/s]
tokenizer_config.json: 4.73kB [00:00, 25.8MB/s]
vocab.json: 2.78MB [00:00, 196MB/s]
processor/tokenizer.json: 100%|##########| 11.4M/11.4M [00:00<00:00, 112MB/s]
chat_template.jinja: 1.02kB [00:00, 9.59MB/s]
video_preprocessor_config.json: 100%|##########| 904/904 [00:00<00:00, 11.0MB/s]
Preparing Model
Model Loaded
create LoRA network. base dim (rank): 32, alpha: 32
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
apply LoRA to Conv2d with kernel size (3,3). dim (rank): 16, alpha: 16
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 840 modules.
enable LoRA for U-Net
Dataset: /app/ai-toolkit/datasets/control
 - Preprocessing image dimensions
100%|##########| 33/33 [00:00<00:00, 37.90it/s]
 - Found 33 images
Bucket sizes for /app/ai-toolkit/datasets/control:
512x512: 33 files
1 buckets made
Dataset: /app/ai-toolkit/datasets/control
 - Preprocessing image dimensions
100%|##########| 33/33 [00:00<00:00, 27643.71it/s]
 - Found 33 images
Bucket sizes for /app/ai-toolkit/datasets/control:
768x768: 33 files
1 buckets made
Dataset: /app/ai-toolkit/datasets/control
 - Preprocessing image dimensions
100%|##########| 33/33 [00:00<00:00, 30588.29it/s]
 - Found 33 images
Bucket sizes for /app/ai-toolkit/datasets/control:
1024x1024: 33 files
1 buckets made
Generating baseline samples before training
Blondy-Qwen2509-V01: 0%| | 0/4000 [00:00<?, ?it/s]Error running job: tuple index out of range

========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 172, in get_prompt_embeds
    ratio = control_images[i].shape[2] / control_images[i].shape[3]
IndexError: tuple index out of range
Blondy-Qwen2509-V01: 0%| | 0/4000 [00:01<?, ?it/s]

Astroburner avatar Oct 03 '25 16:10 Astroburner

Update: Could it be that the sampler needs 3 images as input, and that it has problems when I give it just 1?

[Two screenshots attached]

Maybe it helps to find the issue
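
For anyone reading the traceback above: the failing line indexes `shape[2]` and `shape[3]`, so it appears to expect a 4-D control image tensor. Below is a minimal sketch, assuming a (batch, channels, height, width) layout, of how a shape with fewer dimensions produces exactly this message. `control_ratio` is a hypothetical helper for illustration and not part of ai-toolkit, and the logs do not show which shape the tensor actually had in these runs.

```python
from typing import Sequence


def control_ratio(shape: Sequence[int]) -> float:
    """Return shape[2] / shape[3], as the failing line in the traceback does.

    Hypothetical helper; assumes a (batch, channels, height, width) layout.
    """
    if len(shape) != 4:
        raise ValueError(
            f"expected a 4-D control image shape, got {len(shape)} dims: {tuple(shape)}"
        )
    return shape[2] / shape[3]


# A 4-D shape works.
print(control_ratio((1, 3, 1728, 1248)))

# A 3-D shape has no index 3, which is what raises
# "IndexError: tuple index out of range" in the unguarded expression.
try:
    shape = (3, 1728, 1248)
    shape[2] / shape[3]
except IndexError as err:
    print("IndexError:", err)
```

A guard like the `ValueError` above would at least name the offending shape instead of failing on the index.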

Astroburner avatar Oct 06 '25 12:10 Astroburner

ostris commented on this in his Discord and said:

"This will happen if there is not a control image. Either in your dataset there is not a matching control image, or not one for sampling"

He said more after I messaged him:

I asked: "Seeing this as a solution to the 'pixel_values' error, do we have to have all 3 input dataset options selected or we will get an error? I only have 2 inputs but am still getting this error."

He replied: "You only need one dataset. If you see this error, it means it could not find the matching control images. Keep in mind that it cannot find them in subfolders. I have run into that. Or a single image is missing a control, that can also cause it. I’ll try to add a better error."

However, when I double-checked my datasets, all my images are named correctly and identically, and my folder structure is just basic folders with images in them in the dataset section, so I'm still not sure what's going on.
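
The check described in that reply (every target image needs a matching control image, and subfolders are not searched) can be scripted rather than done by eye. Below is a minimal sketch under stated assumptions: images are matched by file name without extension, only files directly inside each folder count, and the extension list is a guess. `image_stems` and `missing_controls` are hypothetical helpers, not ai-toolkit functions, and the toolkit's own matching rule may differ.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to what the dataset actually contains.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def image_stems(folder: Path) -> set[str]:
    """Collect file stems of images directly inside `folder` (subfolders are ignored)."""
    return {
        p.stem
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }


def missing_controls(target_dir: str, control_dir: str) -> list[str]:
    """Return target image stems that have no same-named image in the control folder."""
    targets = image_stems(Path(target_dir))
    controls = image_stems(Path(control_dir))
    return sorted(targets - controls)
```

Calling `missing_controls` with the target and control folder paths returns an empty list when every target has a counterpart. Any name it returns is a candidate for the "single image is missing a control" case mentioned in the reply.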

Did you solve that problem in the meantime? I could avoid that error by using 3 datasets with different names (but the exact same PNGs in them). But I am not sure this is ideal... Anyway, I get errors a little later in the pipeline, so I cannot report any results.

DieserBobby avatar Oct 06 '25 17:10 DieserBobby

Well, that's interesting. I'm personally still having errors. I even tried removing all samples and disabling sampling completely. @lucusmax, how many input control images were you using in your training?

Did anybody get it to run? It's also difficult to know whether the template has been updated. Before trying again, I would like to know what to change... or do we just have to wait?

DieserBobby avatar Oct 07 '25 13:10 DieserBobby

Since the new update that was supposed to fix the issue, I am still getting errors when trying to run, both with and without the text encoder being cached. This is with 2 control input images. We may need to use a different training script for now. I'm going to look today to see which other trainers support multi-input training.

Tristan-mc-q avatar Oct 07 '25 14:10 Tristan-mc-q

In China, they already have it on bilibili... but I don't understand a word :-) You said "the new update": where do you check for new updates? I suppose that Ostris changes the template on RunPod as soon as he has a new version... but where can we check whether anything has been updated?

You are trying to make a LoRA that reacts to 2 control_images? What is the idea behind it, if I may ask? I am trying to input only 1 control... did you already try that and get it to run?

DieserBobby avatar Oct 07 '25 16:10 DieserBobby

I think this is related. I'm seeing

========================================
Result:

  • 0 completed jobs
  • 1 failure

========================================
Traceback (most recent call last):
  File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'
Traceback (most recent call last):
  File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'

yic03685 avatar Oct 08 '25 06:10 yic03685

Encountering the same errors (tuple index out of range and / or KeyError pixel_values) when trying to train a qwen-image-edit-2509 LoRA. I tried different scenarios.

I have a dataset of 31 images, all in 1024x1024 resolution:

  • control dataset has 31 original images, with an accompanying txt file "make it a woven texture"
  • target dataset has 31 images with the same name as the control dataset. No txt file. They are the "texturized" versions of the control dataset

I am running ai-toolkit on runpod using your template, on an RTX 6000 Pro. Here are the scenarios I tried:

  • Qwen-Image-Edit-2509, adjusted learning rate to 0.0003, save every 500 steps, 5000 steps. Linked the target dataset and control dataset (control 1) in the dataset section. Disabled all sampling. All other settings are defaults. This gives me the following error (tuple index out of range):
========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 172, in get_prompt_embeds
    ratio = control_images[i].shape[2] / control_images[i].shape[3]
IndexError: tuple index out of range
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 172, in get_prompt_embeds
    ratio = control_images[i].shape[2] / control_images[i].shape[3]
IndexError: tuple index out of range
  • I gave it another go, and added one sample image - while keeping sampling disabled: same error
  • enabling sampling: same error
  • disabled sampling, removed all sample prompts: same error
  • I added the same control dataset to control2 and control3: different error:
Dataset: /app/ai-toolkit/datasets/woven_fabric_target
  -  Preprocessing image dimensions
100%|##########| 21/21 [00:00<00:00, 14956.76it/s]
  -  Found 21 images
Bucket sizes for /app/ai-toolkit/datasets/woven_fabric_target:
512x512: 21 files
1 buckets made
Dataset: /app/ai-toolkit/datasets/woven_fabric_target
  -  Preprocessing image dimensions
100%|##########| 21/21 [00:00<00:00, 16014.62it/s]
  -  Found 21 images
Bucket sizes for /app/ai-toolkit/datasets/woven_fabric_target:
768x768: 21 files
1 buckets made
Dataset: /app/ai-toolkit/datasets/woven_fabric_target
  -  Preprocessing image dimensions
100%|##########| 21/21 [00:00<00:00, 17490.15it/s]
  -  Found 21 images
Bucket sizes for /app/ai-toolkit/datasets/woven_fabric_target:
1024x1024: 21 files
1 buckets made
Skipping first sample due to config setting
qwen_image_edit_2509_woven_fabric_01:   0%|          | 0/5000 [00:00<?, ?it/s]Error running job: 
========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 183, in get_prompt_embeds
    prompt_embeds, prompt_embeds_mask = self.pipeline.encode_prompt(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 318, in encode_prompt
    prompt_embeds, prompt_embeds_mask = self._get_qwen_prompt_embeds(prompt, image, device)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 265, in _get_qwen_prompt_embeds
    pixel_values=model_inputs.pixel_values,
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 94, in __getattr__
    raise AttributeError
AttributeError
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 183, in get_prompt_embeds
    prompt_embeds, prompt_embeds_mask = self.pipeline.encode_prompt(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 318, in encode_prompt
    prompt_embeds, prompt_embeds_mask = self._get_qwen_prompt_embeds(prompt, image, device)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 265, in _get_qwen_prompt_embeds
    pixel_values=model_inputs.pixel_values,
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 94, in __getattr__
    raise AttributeError
AttributeError
qwen_image_edit_2509_woven_fabric_01:   0%|          | 0/5000 [00:01<?, ?it/s]
  • Removed the control2 and control3 again, and only checked 1024 resolution: tuple index error from before
  • Added the control2 and control3 again with only 1024 resolution checked: attribute error from before:
Running 1 job
{
    "type": "diffusion_trainer",
    "training_folder": "/app/ai-toolkit/output",
    "sqlite_db_path": "/app/ai-toolkit/aitk_db.db",
    "device": "cuda",
    "trigger_word": null,
    "performance_log_every": 10,
    "network": {
        "type": "lora",
        "linear": 32,
        "linear_alpha": 32,
        "conv": 16,
        "conv_alpha": 16,
        "lokr_full_rank": true,
        "lokr_factor": -1,
        "network_kwargs": {
            "ignore_if_contains": []
        }
    },
    "save": {
        "dtype": "bf16",
        "save_every": 500,
        "max_step_saves_to_keep": 4,
        "save_format": "diffusers",
        "push_to_hub": false
    },
    "datasets": [
        {
            "folder_path": "/app/ai-toolkit/datasets/woven_fabric_target",
            "mask_path": null,
            "mask_min_value": 0.1,
            "default_caption": "",
            "caption_ext": "txt",
            "caption_dropout_rate": 0.05,
            "cache_latents_to_disk": false,
            "is_reg": false,
            "network_weight": 1,
            "resolution": [
                1024
            ],
            "controls": [],
            "shrink_video_to_frames": true,
            "num_frames": 1,
            "do_i2v": true,
            "flip_x": false,
            "flip_y": false,
            "control_path_1": "/app/ai-toolkit/datasets/woven_fabric_control",
            "control_path_2": "/app/ai-toolkit/datasets/woven_fabric_control",
            "control_path_3": "/app/ai-toolkit/datasets/woven_fabric_control"
        }
    ],
    "train": {
        "batch_size": 1,
        "bypass_guidance_embedding": false,
        "steps": 5000,
        "gradient_accumulation": 1,
        "train_unet": true,
        "train_text_encoder": false,
        "gradient_checkpointing": true,
        "noise_scheduler": "flowmatch",
        "optimizer": "adamw8bit",
        "timestep_type": "weighted",
        "content_or_style": "balanced",
        "optimizer_params": {
            "weight_decay": 0.0001
        },
        "unload_text_encoder": false,
        "cache_text_embeddings": false,
        "lr": 0.0003,
        "ema_config": {
            "use_ema": false,
            "ema_decay": 0.99
        },
        "skip_first_sample": true,
        "force_first_sample": false,
        "disable_sampling": true,
        "dtype": "bf16",
        "diff_output_preservation": false,
        "diff_output_preservation_multiplier": 1,
        "diff_output_preservation_class": "person",
        "switch_boundary_every": 1,
        "loss_type": "mse"
    },
    "model": {
        "name_or_path": "Qwen/Qwen-Image-Edit-2509",
        "quantize": true,
        "qtype": "qfloat8",
        "quantize_te": true,
        "qtype_te": "qfloat8",
        "arch": "qwen_image_edit_plus",
        "low_vram": true,
        "model_kwargs": {}
    },
    "sample": {
        "sampler": "flowmatch",
        "sample_every": 250,
        "width": 1024,
        "height": 1024,
        "samples": [],
        "neg": "",
        "seed": 42,
        "walk_seed": true,
        "guidance_scale": 4,
        "sample_steps": 25,
        "num_frames": 1,
        "fps": 1
    }
}
Using SQLite database at /app/ai-toolkit/aitk_db.db
Job ID: "ed79a1a1-2987-4646-a681-8d802e17b6ac"
#############################################
# Running job: qwen_image_edit_2509_woven_fabric_01
#############################################
Running  1 process
Loading Qwen Image model
Loading transformer
Loading checkpoint shards: 100%|##########| 5/5 [00:00<00:00, 45.33it/s]
Quantizing Transformer
 - quantizing 60 transformer blocks
100%|##########| 60/60 [00:29<00:00,  2.04it/s]
 - quantizing extras
Moving transformer to CPU
Text Encoder
Loading checkpoint shards: 100%|##########| 4/4 [00:00<00:00, 41.87it/s]
Quantizing Text Encoder
Loading VAE
Making pipe
Preparing Model
Model Loaded
create LoRA network. base dim (rank): 32, alpha: 32
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
apply LoRA to Conv2d with kernel size (3,3). dim (rank): 16, alpha: 16
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 840 modules.
enable LoRA for U-Net
Dataset: /app/ai-toolkit/datasets/woven_fabric_target
  -  Preprocessing image dimensions
100%|##########| 21/21 [00:00<00:00, 14855.86it/s]
  -  Found 21 images
Bucket sizes for /app/ai-toolkit/datasets/woven_fabric_target:
1024x1024: 21 files
1 buckets made
Skipping first sample due to config setting
qwen_image_edit_2509_woven_fabric_01:   0%|          | 0/5000 [00:00<?, ?it/s]Error running job: 
========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 183, in get_prompt_embeds
    prompt_embeds, prompt_embeds_mask = self.pipeline.encode_prompt(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 318, in encode_prompt
    prompt_embeds, prompt_embeds_mask = self._get_qwen_prompt_embeds(prompt, image, device)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 265, in _get_qwen_prompt_embeds
    pixel_values=model_inputs.pixel_values,
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 94, in __getattr__
    raise AttributeError
AttributeError
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 92, in __getattr__
    return self.data[item]
KeyError: 'pixel_values'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/ai-toolkit/run.py", line 120, in <module>
    main()
  File "/app/ai-toolkit/run.py", line 108, in main
    raise e
  File "/app/ai-toolkit/run.py", line 96, in main
    job.run()
  File "/app/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 2154, in run
    loss_dict = self.hook_train_loop(batch_list)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 2023, in hook_train_loop
    loss = self.train_single_accumulation(batch)
  File "/app/ai-toolkit/extensions_built_in/sd_trainer/SDTrainer.py", line 1549, in train_single_accumulation
    conditional_embeds = self.sd.encode_prompt(
  File "/app/ai-toolkit/toolkit/models/base_model.py", line 1069, in encode_prompt
    return self.get_prompt_embeds(prompt, control_images=control_images)
  File "/app/ai-toolkit/extensions_built_in/diffusion_models/qwen_image/qwen_image_edit_plus.py", line 183, in get_prompt_embeds
    prompt_embeds, prompt_embeds_mask = self.pipeline.encode_prompt(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 318, in encode_prompt
    prompt_embeds, prompt_embeds_mask = self._get_qwen_prompt_embeds(prompt, image, device)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 265, in _get_qwen_prompt_embeds
    pixel_values=model_inputs.pixel_values,
  File "/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py", line 94, in __getattr__
    raise AttributeError
AttributeError
qwen_image_edit_2509_woven_fabric_01:   0%|          | 0/5000 [00:01<?, ?it/s]
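
The `KeyError` followed by a bare `AttributeError` in the traceback above can be reproduced in isolation. The traceback shows `__getattr__` doing `return self.data[item]` and then `raise AttributeError`, so reading `.pixel_values` fails whenever the processor output has no such entry. One plausible reading, consistent with the maintainer's remark about missing control images, is that no image reached the processor. `FeatureBag` is a hypothetical stand-in written from those two traceback lines, not the real transformers class.

```python
class FeatureBag:
    """Hypothetical stand-in for the processor output object seen in the traceback.

    Attribute access looks the name up in `self.data` and turns a KeyError
    into a bare AttributeError, mirroring the two lines the traceback shows.
    """

    def __init__(self, data: dict):
        self.data = data

    def __getattr__(self, item: str):
        try:
            return self.data[item]
        except KeyError:
            raise AttributeError


# Text-only inputs: no image was processed, so there is no 'pixel_values' entry.
text_only = FeatureBag({"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1]})

try:
    text_only.pixel_values
except AttributeError as err:
    # The original KeyError is kept as the implicit context of the AttributeError,
    # which is why the log prints "During handling of the above exception...".
    print(type(err.__context__).__name__, err.__context__)
```

Under that reading, the thing to inspect is what gets passed as the image argument, not the text prompt.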

wouterverweirder avatar Oct 08 '25 11:10 wouterverweirder

@wouterverweirder As I understand it, no txt files are needed with qwen-image-edit-2509. If you must use them (instead of filling in the caption fields), I would try putting them with the target set, not the control set. In the tutorials I have seen so far, the caption text goes into the fields under the target images. Maybe give it a try? But keep in mind that although I got past the "IndexError: tuple index out of range", I am also stuck on an error that seems to be related to remaining issues in AI-toolkit with that model.

DieserBobby avatar Oct 09 '25 11:10 DieserBobby

In my case, adding dummy images to the sample's “Control Image 2” and “Control Image 3” resolved the “IndexError: tuple index out of range” issue. “Control Dataset 2” and “Control Dataset 3” remain empty.

dtakura avatar Oct 13 '25 20:10 dtakura

In my case, adding dummy images to the sample's “Control Image 2” and “Control Image 3” resolved the “IndexError: tuple index out of range” issue. “Control Dataset 2” and “Control Dataset 3” remain empty.

@dtakura Oh, that's interesting. So you are saying that you just add a dataset folder that has no images in it?

Tristan-mc-q avatar Oct 13 '25 20:10 Tristan-mc-q

You can leave them without datasets, no need for dummies.

DieserBobby avatar Oct 13 '25 22:10 DieserBobby

@dtakura Oh, that's interesting. So you are saying that you just add a dataset folder that has no images in it?

@Tristan-mc-q Sorry, I tried again and realized I was mistaken. The sample's “Control Image 2” and “Control Image 3” were unrelated.

What made the difference was enabling “Cache Text Embeddings”. After enabling it, no errors occurred. No dummy datasets are required.
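
For anyone editing the job config by hand instead of using the UI, the job dump earlier in this thread contains a `train.cache_text_embeddings` key. Below is a minimal sketch, assuming that key is what the “Cache Text Embeddings” option controls (the thread does not confirm the mapping). `enable_text_embedding_cache` is a hypothetical helper, and the config is a trimmed-down example, not a complete job.

```python
import copy
import json


def enable_text_embedding_cache(job_config: dict) -> dict:
    """Return a copy of a job config with train.cache_text_embeddings set to True."""
    updated = copy.deepcopy(job_config)
    updated.setdefault("train", {})["cache_text_embeddings"] = True
    return updated


# Trimmed-down config using key names from the job dump earlier in the thread.
config = {
    "type": "diffusion_trainer",
    "train": {"unload_text_encoder": False, "cache_text_embeddings": False},
}

print(json.dumps(enable_text_embedding_cache(config)["train"], indent=4))
```

The helper returns a copy, so the original config stays available for comparing runs with and without caching.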

dtakura avatar Oct 13 '25 23:10 dtakura

Still getting this error 2 weeks later on the runpod image here. Matched 100% on the uploaded control and target images. Unclear from the above whether there is an actual resolution.

wordbrew avatar Oct 14 '25 21:10 wordbrew

Still getting this error 2 weeks later on the runpod image here. Matched 100% on the uploaded control and target images. Unclear from the above whether there is an actual resolution.

Did you check the text encoder option? You should! No extra text files, no dummy datasets. I put the captions into the fields of the target pictures in the target dataset.

DieserBobby avatar Oct 14 '25 23:10 DieserBobby

Still getting this error 2 weeks later on the runpod image here. Matched 100% on the uploaded control and target images. Unclear from the above whether there is an actual resolution.

Did you check the text encoder option? You should! No extra text files, no dummy datasets. I put the captions into the fields of the target pictures in the target dataset.

When I enabled Text Encoder Caching, I got: "KeyError: 'pixel_values'". So out of the frying pan and into the fire, it seems.

wordbrew avatar Oct 15 '25 00:10 wordbrew