suzhenghang

Results 24 issues of suzhenghang

Hi @unsky, the performance in your experiment is amazing. By the way, did you only replace the SoftmaxWithLoss with the focal loss layer in the RPN layer or in both...

Great work! Have there been any experimental results on integrating it into the Damo Text-to-Video system?

Do you have any good solutions for the flickering issue in generated videos?

I got the following logs after running `python test_flops.py --config-file configs/RegNetX-4.0GF.ini`. Are these warnings normal? [INFO] Register count_convNd() for . [INFO] Register count_bn() for . [INFO] Register zero_ops()...

Do you have any knowledge of [VideoLDM](https://research.nvidia.com/labs/toronto-ai/VideoLDM/), and is it possible to integrate its algorithms to further enhance the capabilities of current models, such as generating longer videos?

enhancement

At [this line](https://github.com/ExponentialML/Text-To-Video-Finetuning/blob/main/utils/dataset.py#L580), I suggest loading the cached latent with an explicit device: `device = torch.device("cuda" if torch.cuda.is_available() else "cpu")`, then `cached_latent = torch.load(self.cached_data_list[index], map_location=device)`. Otherwise, in multi-GPU distributed training, the first GPU may occupy excessive VRAM compared to the other GPUs.
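A minimal sketch of the idea, under some assumptions: by default `torch.load` restores a tensor to the device it was saved from, so latents cached on `cuda:0` land on the first GPU in every process unless `map_location` says otherwise. The sketch assumes one process per GPU launched with `torchrun`, which sets the `LOCAL_RANK` environment variable. The helper names `pick_map_location` and `load_cached_latent` are hypothetical and not part of the repository.

```python
import os


def pick_map_location(cuda_available: bool, local_rank: int = 0) -> str:
    """Return a device string suitable for torch.load's map_location.

    Hypothetical helper: falls back to CPU when CUDA is unavailable,
    otherwise targets the GPU that belongs to this process.
    """
    if not cuda_available:
        return "cpu"
    return f"cuda:{local_rank}"


def load_cached_latent(path: str):
    """Load one cached latent onto this process's own device."""
    # Imported lazily so the helper above has no dependencies.
    import torch

    # Assumption: torchrun-style launch; default to rank 0 otherwise.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    device = pick_map_location(torch.cuda.is_available(), local_rank)
    return torch.load(path, map_location=device)
```

Passing `map_location="cpu"` and moving the tensor to the GPU later in the training step would be another reasonable choice, since it keeps dataset workers from allocating GPU memory at all.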

bug

Do you have any good solutions for the flickering issue in generated videos?

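Have you tried training on multiple speakers after adding the prosody features? On my side, multi-speaker training does not work as well as single-speaker training; the single-speaker results are very realistic.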