Lumina2Transformer2DModel.forward() got an unexpected keyword argument 'use_mask_in_transformer'
on RunPod
Probably related to this error: https://github.com/huggingface/diffusers/pull/10776#discussion_r1953806298, hence I won't be using sample prompts for now.
Generating baseline samples before training
Error running job: Lumina2Transformer2DModel.forward() got an unexpected keyword argument 'use_mask_in_transformer'
========================================
Result:
- 0 completed jobs
- 1 failure
========================================
Traceback (most recent call last):
File "/workspace/ai-toolkit/run.py", line 97, in <module>
main()
File "/workspace/ai-toolkit/run.py", line 93, in main
raise e
File "/workspace/ai-toolkit/run.py", line 85, in main
job.run()
File "/workspace/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
process.run()
File "/workspace/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 1827, in run
self.sample(self.step_num)
File "/workspace/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 321, in sample
self.sd.generate_images(gen_img_config_list, sampler=sample_config.sampler)
File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/workspace/ai-toolkit/toolkit/stable_diffusion_model.py", line 1502, in generate_images
img = pipeline(
^^^^^^^^^
File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/diffusers/pipelines/lumina2/pipeline_lumina2.py", line 703, in __call__
noise_pred_cond = self.transformer(
^^^^^^^^^^^^^^^^^
File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/ai-toolkit/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Lumina2Transformer2DModel.forward() got an unexpected keyword argument 'use_mask_in_transformer'
same here
Same, did you find a solution?
I just disabled the sample prompts. The training also isn't working as intended: no changes in the output in my ComfyUI.
!pip uninstall diffusers -y
!python -m pip cache purge
!pip install git+https://github.com/huggingface/diffusers.git
Solved the problem for me.
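(The ! prefix suggests a notebook; if so, restart the kernel after the reinstall so the freshly installed diffusers is actually the one that gets imported.)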
Same question.
!pip uninstall diffusers -y
!python -m pip cache purge
!pip install git+https://github.com/huggingface/diffusers.git
Solved the problem for me.
Tried this, no luck. Which diffusers version are you using now?
pip show diffusers
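Equivalently, a quick check from Python (nothing assumed beyond diffusers being importable):

import diffusers

# A git/source install reports a .dev version string.
print(diffusers.__version__)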
!pip uninstall diffusers -y
!python -m pip cache purge
!pip install git+https://github.com/huggingface/diffusers.git
Solved the problem for me.
Not working for me either, can you show your diffusers version?
0.33.0.dev0
Hi everyone, if you are still looking at this problem, the workaround is to disable 'sample' in the training config. It was first stated by the poster of this issue, but I overlooked it.
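For reference, a minimal sketch of what that looks like in an ai-toolkit job config. The keys shown (sample, sampler, sample_every, prompts) are assumed from common ai-toolkit example configs and may differ in your version; adjust to your own file:

config:
  process:
    - type: 'sd_trainer'
      # ... other settings unchanged ...
      # Commenting out the whole sample block keeps the trainer from ever
      # reaching BaseSDTrainProcess.sample(), which is where the crash occurs.
      # sample:
      #   sampler: "flowmatch"
      #   sample_every: 250
      #   prompts:
      #     - "a photo of a person"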
This is frankly bizarre to me. In transformer_lumina2.py, in the definition of Lumina2Transformer2DModel, the forward() function clearly has encoder_attention_mask in it:
def forward(
    self,
    hidden_states: torch.Tensor,
    timestep: torch.Tensor,
    encoder_hidden_states: torch.Tensor,
    encoder_attention_mask: torch.Tensor,
    attention_kwargs: Optional[Dict[str, Any]] = None,
    return_dict: bool = True,
) -> Union[torch.Tensor, Transformer2DModelOutput]:
Where could this error possibly be coming from?
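One way to narrow this down — a minimal diagnostic sketch, assuming only a diffusers build recent enough to include Lumina2 — is to check what Python actually imports, since the file you are reading on disk may not be the copy in use:

import inspect

from diffusers import Lumina2Transformer2DModel

# Which file does Python really load? A stale install or a second
# site-packages copy would show up here.
print(inspect.getfile(Lumina2Transformer2DModel))

# Does the loaded forward() accept the kwargs the pipeline passes?
params = inspect.signature(Lumina2Transformer2DModel.forward).parameters
print(sorted(params))
print("encoder_attention_mask" in params, "use_mask_in_transformer" in params)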
Solved it! That was a weird issue with diffusers: Lumina2TransformerBlock in transformer_lumina2.py has a different signature than the forward call that pipeline_lumina2.py makes on Lumina2Transformer2DModel. It's possible to get it to run by adjusting pipeline_lumina2.py (starting at line 703):
noise_pred_cond = self.transformer(
    hidden_states=latents,
    timestep=current_timestep,
    encoder_hidden_states=prompt_embeds,
    ### REPLACE encoder_attention_mask WITH attention_mask ###
    attention_mask=prompt_attention_mask,
    # encoder_attention_mask=prompt_attention_mask,
    return_dict=False,
    ### REMOVE attention_kwargs ###
    # attention_kwargs=self.attention_kwargs,
)[0]

# perform normalization-based guidance scale on a truncated timestep interval
if self.do_classifier_free_guidance and not do_classifier_free_truncation:
    noise_pred_uncond = self.transformer(
        hidden_states=latents,
        timestep=current_timestep,
        encoder_hidden_states=negative_prompt_embeds,
        ### REPLACE encoder_attention_mask WITH attention_mask ###
        attention_mask=negative_prompt_attention_mask,
        # encoder_attention_mask=negative_prompt_attention_mask,
        return_dict=False,
        ### REMOVE attention_kwargs ###
        # attention_kwargs=self.attention_kwargs,
    )[0]
After that was updated, the samples generated correctly (well, as correctly as can be expected as far as hands go...)
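Worth noting that edits to files under venv/.../site-packages are lost on the next reinstall or upgrade. An untested alternative sketch with the same effect, assuming pipe is your loaded Lumina2 pipeline (hypothetical variable name), is to shim the transformer call at runtime instead of patching pipeline_lumina2.py:

import inspect

_orig_forward = pipe.transformer.forward
_accepted = set(inspect.signature(_orig_forward).parameters)

def _compat_forward(*args, **kwargs):
    # Mirror the patch above: this model version wants `attention_mask`,
    # not `encoder_attention_mask`.
    if "encoder_attention_mask" in kwargs and "encoder_attention_mask" not in _accepted:
        kwargs["attention_mask"] = kwargs.pop("encoder_attention_mask")
    # Drop anything else forward() doesn't know about,
    # e.g. use_mask_in_transformer or attention_kwargs.
    kwargs = {k: v for k, v in kwargs.items() if k in _accepted}
    return _orig_forward(*args, **kwargs)

pipe.transformer.forward = _compat_forward

Assigning the wrapper as an instance attribute shadows the bound method, so nn.Module.__call__ picks it up; the real fix, of course, is an installed diffusers where the pipeline and model versions match.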