Samit comments

Results: 7 comments by Samit

Problem of TimeUpsample2x: decoder output frames != encoder input frames

"we consider the first frame of a video to be an image..." I see, the first frame is always encoded from the repeated k-1 1st frames. But for upsampling, the...

Bug report: AttnBlock3D reshape disorder

> Sorry for that. We merge that to fix this bug. thanks. btw, since the computation logic is changed, the model may require re-training.

SVD-T2V weights

+1. Looking forward to the open-source release of the text2video model.

Is it really feasible to train a video DiT without inserting temporal transformers or attention modules?

I see. So attention map complexity will be (H*W*T)^2. Is it feasible for long video training? Are there any generation results using the train code? (Loss curve in diffusion model...
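To put rough numbers on the (H*W*T)^2 figure above, here is a minimal sketch that counts attention-map entries per head. The function names and the example latent size are hypothetical, and the factorized spatial-plus-temporal variant is included only as a common point of comparison, not as something the thread confirms.

```python
def full_attention_entries(h: int, w: int, t: int) -> int:
    """Attention-map entries when all H*W*T tokens attend to each other."""
    n = h * w * t
    return n * n


def factorized_attention_entries(h: int, w: int, t: int) -> int:
    """Entries for separate spatial and temporal attention (assumed alternative).

    Spatial attention: one (H*W) x (H*W) map per frame, T frames.
    Temporal attention: one T x T map per spatial location, H*W locations.
    """
    spatial = t * (h * w) ** 2
    temporal = (h * w) * t ** 2
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical latent grid: 32 x 32 spatial tokens, 16 frames.
    h, w, t = 32, 32, 16
    print("full:      ", full_attention_entries(h, w, t))
    print("factorized:", factorized_attention_entries(h, w, t))
```

Doubling T quadruples the full-attention count, which is why long videos are the concern raised in the comment.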

feat: Add NaViT

Please add accuracy and performance comparisons against ViT to the README.

Add RecResizeNormImg in Rec Transform to manage padding and norm in resize, add yaml of crnn for server version [WIP]

Please report the results for the CRNN server version and upload the checkpoint and MindIR files.

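Add three features: resuming training from a checkpoint, checkpoint saving, and training log saving; enrich the loss output; adapt validation-during-training to eval_start_epoch and eval_interval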

Thanks. Checkpoint saving: a ckpt is saved at the end of each epoch. This could optionally use a last_k or top_k retention strategy.
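The two retention strategies suggested above could look roughly like the sketch below. It is plain Python with no framework calls; `CheckpointKeeper`, its methods, and the `epoch_N.ckpt` naming are hypothetical, and it assumes a higher metric is better. It only decides which files to keep; actual saving and deletion would be done by the training code.

```python
class CheckpointKeeper:
    """Tracks saved checkpoints and reports which ones to delete (sketch)."""

    def __init__(self, k: int, policy: str = "last_k"):
        if k < 1:
            raise ValueError("k must be at least 1")
        if policy not in ("last_k", "top_k"):
            raise ValueError("policy must be 'last_k' or 'top_k'")
        self.k = k
        self.policy = policy
        self.kept = []  # list of (epoch, metric, filename)

    def update(self, epoch: int, metric: float) -> list:
        """Register the checkpoint for this epoch; return filenames to remove."""
        self.kept.append((epoch, metric, f"epoch_{epoch}.ckpt"))
        if self.policy == "last_k":
            # Oldest first, so the most recent k end up at the tail.
            self.kept.sort(key=lambda rec: rec[0])
        else:
            # Worst metric first, so the best k end up at the tail.
            self.kept.sort(key=lambda rec: rec[1])
        overflow = max(len(self.kept) - self.k, 0)
        removed = self.kept[:overflow]
        self.kept = self.kept[overflow:]
        return [rec[2] for rec in removed]

    def filenames(self) -> list:
        """Filenames currently retained, sorted by name."""
        return sorted(rec[2] for rec in self.kept)


if __name__ == "__main__":
    keeper = CheckpointKeeper(k=2, policy="top_k")
    for epoch, acc in [(1, 0.5), (2, 0.7), (3, 0.6)]:
        for name in keeper.update(epoch, acc):
            print("remove", name)
    print("kept:", keeper.filenames())
```

With `policy="last_k"` the same loop would keep the two most recent epochs regardless of the metric.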