Stepfunction


This is frankly bizarre to me. In `transformer_lumina2.py`, in the definition of `Lumina2Transformer2DModel`, the `forward()` method clearly has `encoder_attention_mask` in its signature:

```python
def forward(
    self,
    hidden_states: torch.Tensor,
    timestep: torch.Tensor,
    encoder_hidden_states: ...
```

Solved it! That was a weird issue with diffusers. The `Lumina2TransformerBlock` in `transformer_lumina2.py` has a different signature than the `forward()` call in `Lumina2Transformer2DModel` expects. It's possible to get...
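For anyone debugging a similar mismatch, here's a minimal sketch (toy stand-in modules, not the real diffusers classes) showing how `inspect.signature` can surface a kwarg that the outer `forward()` accepts but the inner block doesn't:

```python
import inspect

from torch import nn


# Hypothetical stand-ins for Lumina2TransformerBlock / Lumina2Transformer2DModel:
class Block(nn.Module):
    def forward(self, hidden_states, attention_mask, encoder_hidden_states):
        return hidden_states


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.block = Block()

    def forward(self, hidden_states, encoder_hidden_states, encoder_attention_mask):
        # The outer model accepts encoder_attention_mask, but the inner
        # block names its mask argument differently:
        return self.block(hidden_states, encoder_attention_mask, encoder_hidden_states)


# Compare the two signatures to surface the mismatch without running the model:
outer = set(inspect.signature(Model.forward).parameters)
inner = set(inspect.signature(Block.forward).parameters)
print("accepted by model but not block:", outer - inner)  # {'encoder_attention_mask'}
print("expected by block but not model:", inner - outer)  # {'attention_mask'}
```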

I can confirm that I am experiencing the same thing on my end.

I can confirm that I am getting the same AttributeError as @jpXerxes after cloning the latest sd3 branch. I was able to bypass the issue and begin training by adding `--cache_text_encoder_outputs` to...

You can also remove the `sd-scripts` directory and replace it with the latest version of the sd3 branch.

With a 24GB card, I run out of VRAM after about 30 training steps.

I understand the potential of the dataset configuration file, but it's a little redundant if you want the same dataset at 3 different resolutions. It could definitely be constructed automatically...
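Something like the following rough sketch could do that. It assumes the kohya-ss `sd-scripts` TOML dataset-config layout (`[[datasets]]` / `[[datasets.subsets]]`) and uses a hypothetical image directory; it emits one `[[datasets]]` entry per resolution so the same folder is reused at each size:

```python
# Sketch: generate a multi-resolution dataset config for the same image folder.
# The path and repeat count below are hypothetical placeholders.
resolutions = [512, 768, 1024]
image_dir = "/path/to/images"

lines = ["[general]", "enable_bucket = true", ""]
for res in resolutions:
    lines += [
        "[[datasets]]",
        f"resolution = {res}",
        "",
        "[[datasets.subsets]]",
        f"image_dir = '{image_dir}'",
        "num_repeats = 1",
        "",
    ]

with open("dataset_config.toml", "w") as f:
    f.write("\n".join(lines))
```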

What exactly are split mode and train blocks?

Very much appreciate the response. Thank you!

My initial attempt with an LR of 1e-5 overtrained rapidly. A second attempt with an LR of 2e-6 seems to be more stable so far.