AnimateDiff
About training config: unet_use_cross_frame_attention
I'm trying to train the motion module myself. After a lot of preparation and debugging, my training finally runs successfully, but I can't confirm the results. The output motion module doesn't work well: the resulting frames are almost identical, and it's hard to see anything moving.
Besides, I found that when I turn on unet_use_cross_frame_attention, the module 'SparseCausalAttention2D' cannot be found. Could you please help me with the training config and the missing 'SparseCausalAttention2D' module?
Hi, can you tell me your learning rate and training dataset? I also tried to train the motion module on a small dataset. I set batch size = 4 and lr = 3e-5, but the training results didn't look good and the training loss didn't decrease.
@tacit0428 Hi, I used the dataset from https://github.com/m-bain/webvid. I used just 30 videos from results_2M_train.csv that contain the keywords: smile, camera, look, (man | woman | girl | boy). I'm also confused about the configuration; the following is my training config:
- learning_rate: 8.0e-05
- num_examples: 30
- max_train_steps: 8000
I can't get a good motion module yet either. Besides, my trained model cannot be used with inference v2, so I think the training scripts released so far are suitable for mm-14 rather than mm-15 and mm-15_v2. I hope this helps.
It seems good. I can use inference v2 and mm-15_v2. But I can't turn on unet_use_cross_frame_attention either. I don't know if it's essential. Did you turn on this parameter in your training? Besides, I found that the training result is not good when I use Stable Diffusion without any LoRA.
My result GIF lacks animated motion; the movements are barely noticeable. I didn't turn on unet_use_cross_frame_attention. According to the paper, I guess it helps consistency across frames. But SparseCausalAttention2D cannot be found.
I don't know whether LoRA is necessary, but I think the motion module is supposed to work well without other control components.
It seems that I need to put more effort into the config or wait for more unreleased details.
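For readers wondering what the missing class might do: the name resembles the sparse-causal attention used in Tune-A-Video, where each frame's queries attend to keys and values taken from the first frame and the previous frame. Since SparseCausalAttention2D is not available in this repository, the following is only a minimal sketch of that frame-selection pattern under that assumption, not the AnimateDiff implementation. The function name `sparse_causal_kv_indices` is hypothetical.

```python
from typing import List, Tuple


def sparse_causal_kv_indices(video_length: int) -> List[Tuple[int, int]]:
    """For each frame i, return the (first_frame, previous_frame) indices
    whose keys/values frame i would attend to.

    ASSUMPTION: this mirrors the Tune-A-Video style pattern; the real
    SparseCausalAttention2D in AnimateDiff is unpublished and may differ.
    """
    if video_length < 1:
        raise ValueError("video_length must be at least 1")
    # Frame 0 has no predecessor, so its "previous" frame is clamped to 0.
    return [(0, max(i - 1, 0)) for i in range(video_length)]


if __name__ == "__main__":
    for frame, (first, prev) in enumerate(sparse_causal_kv_indices(4)):
        print(f"frame {frame} attends to frames {first} and {prev}")
```

If the unpublished class works along these lines, it would tie every frame to the first frame, which fits the guess that the option is about cross-frame consistency rather than about producing larger motion.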
Yes, I agree. I think the results from the provided motion module look the same as yours. I tried the author's model weights, and the generated motion is also slight. I think the motion module tries to achieve good consistency. I will try training it with different configs later.
hi, have you solved this problem yet? The missing "SparseCausalAttention2D" module?
I have just been reading the source code. While reading the attention.py file, which can be found at this link, I found that "SparseCausalAttention2D" is neither defined nor imported. It's really weird.
    if unet_use_cross_frame_attention:
        self.attn1 = SparseCausalAttention2D(
            query_dim=dim,
            heads=num_attention_heads,
            dim_head=attention_head_dim,
            dropout=dropout,
            bias=attention_bias,
            cross_attention_dim=cross_attention_dim if only_cross_attention else None,
            upcast_attention=upcast_attention,
        )
I'm not sure if it's because I'm so bad at reading code.
It looks like the class "SparseCausalAttention2D", which should be defined in attention.py, is missing.
@liuchangzong No, not yet. The implementation of this class is unpublished.
OK, thanks a lot.
Same problem, the motion results are poor.
Is SparseCausalAttention2D unpublished?
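One way to double-check a finding like this without launching training is to parse the file and see whether the name is ever bound. Below is a minimal sketch using Python's standard `ast` module. The helper name `is_name_bound` and the `sample` source are hypothetical stand-ins, not the actual contents of attention.py, and the check cannot see names brought in by `from x import *`.

```python
import ast


def is_name_bound(source: str, name: str) -> bool:
    """Return True if `name` is defined (class/def), assigned, or imported
    anywhere in `source`. Star imports are not resolved."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name == name:
                return True
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a as b` binds `b`.
                bound = alias.asname or alias.name.split(".")[0]
                if bound == name:
                    return True
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if node.id == name:
                return True
    return False


# Hypothetical stand-in for the relevant part of attention.py.
sample = '''
import torch

class BasicTransformerBlock:
    def __init__(self, unet_use_cross_frame_attention):
        if unet_use_cross_frame_attention:
            self.attn1 = SparseCausalAttention2D()
'''

if __name__ == "__main__":
    for candidate in ("BasicTransformerBlock", "SparseCausalAttention2D"):
        print(candidate, "bound:", is_name_bound(sample, candidate))
```

To run it against the real file, read attention.py into a string and pass it in place of `sample`. A result of False for SparseCausalAttention2D would match the NameError seen when the flag is enabled.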