Stepfunction
This is frankly bizarre to me. In `transformer_lumina2.py`, in the definition of `Lumina2Transformer2DModel`, the `forward()` function clearly has `encoder_attention_mask` in it:

```
def forward(
    self,
    hidden_states: torch.Tensor,
    timestep: torch.Tensor,
    encoder_hidden_states: ...
```
Solved it! That was a weird issue with diffusers. The `Lumina2TransformerBlock` in `transformer_lumina2.py` has a different signature than the `forward()` call in `Lumina2Transformer2DModel` expects. It's possible to get...
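For anyone hitting the same thing, here is a minimal, self-contained sketch of how this kind of mismatch surfaces. This is not the actual diffusers code (`Block` and `Model` are made-up stand-ins): the outer `forward()` advertises a keyword argument, but the inner block's `forward()` was written without it, so forwarding the kwarg through raises an error at call time.

```
import torch
import torch.nn as nn


class Block(nn.Module):
    # Inner block: its forward() does NOT accept encoder_attention_mask.
    def forward(self, hidden_states, encoder_hidden_states):
        return hidden_states + encoder_hidden_states.mean()


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.block = Block()

    # Outer forward() clearly has encoder_attention_mask in its signature,
    # but passing it down to the block fails because the block's own
    # forward() has a different signature.
    def forward(self, hidden_states, encoder_hidden_states, encoder_attention_mask=None):
        return self.block(
            hidden_states,
            encoder_hidden_states,
            encoder_attention_mask=encoder_attention_mask,  # fails here
        )


model = Model()
x = torch.zeros(1, 4)
ctx = torch.ones(1, 4)
try:
    model(x, ctx, encoder_attention_mask=torch.ones(1, 4))
except TypeError as e:
    # forward() got an unexpected keyword argument 'encoder_attention_mask'
    print(e)
```

So the outer signature having the argument tells you nothing about whether the block it delegates to can actually receive it.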
I can confirm that I am experiencing the same thing on my end.
I can confirm that I am getting the same AttributeError as @jpXerxes after cloning the latest sd3 branch. I was able to bypass the issue and begin training by adding `--cache_text_encoder_outputs` to...
You can also remove the sd scripts directory and replace it with the latest version of the sd3 branch.
With a 24GB card, I run out of VRAM after about 30 or so training steps.
I understand the potential of the dataset configuration file, but it's a little redundant if you want the same dataset at three different resolutions. It could definitely be constructed automatically...
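To illustrate, here is a rough sketch of what "constructed automatically" could look like: a few lines of Python that emit a dataset config reusing one image directory at several resolutions. The `[[datasets]]`/`[[datasets.subsets]]` layout follows the sd-scripts dataset config format as I understand it; `IMAGE_DIR` and the batch/repeat values are placeholders.

```
# Hypothetical helper: generate a dataset_config.toml that points the
# same image directory at multiple training resolutions.
IMAGE_DIR = "/path/to/images"  # placeholder
RESOLUTIONS = [512, 768, 1024]


def make_config(image_dir: str, resolutions: list[int]) -> str:
    parts = ["[general]", 'caption_extension = ".txt"', ""]
    for res in resolutions:
        parts += [
            "[[datasets]]",
            f"resolution = {res}",
            "batch_size = 1",
            "",
            "  [[datasets.subsets]]",
            f'  image_dir = "{image_dir}"',
            "  num_repeats = 1",
            "",
        ]
    return "\n".join(parts)


with open("dataset_config.toml", "w") as f:
    f.write(make_config(IMAGE_DIR, RESOLUTIONS))
```

One directory, three resolutions, and the redundancy lives in a script instead of a hand-maintained file.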
What exactly are split mode and train blocks?
Very much appreciate the response. Thank you!
My initial attempt with an LR of 1e-5 overtrained rapidly. A second attempt with an LR of 2e-6 seems to be more stable so far.