Wensong Song
I want to use the Multi-Task Facial Landmark (MTFL) dataset to train a DDPM. I use the code below. ``` python from denoising_diffusion_pytorch import Unet, GaussianDiffusion, Trainer model = Unet( dim =...
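For context, a minimal sketch of this kind of setup with `denoising_diffusion_pytorch` is shown below. The dataset path, image size, and hyperparameters are placeholders (the issue's own values are truncated), so treat them as assumptions.

```python
# Minimal sketch, assuming MTFL images have been exported as individual
# files under ./mtfl/images; path and hyperparameters are placeholders.
from denoising_diffusion_pytorch import Unet, GaussianDiffusion, Trainer

model = Unet(dim=64, dim_mults=(1, 2, 4, 8))

diffusion = GaussianDiffusion(
    model,
    image_size=128,   # MTFL crops are resized to a fixed square size
    timesteps=1000,
)

trainer = Trainer(
    diffusion,
    './mtfl/images',             # folder of training images
    train_batch_size=32,
    train_lr=8e-5,
    train_num_steps=100000,
    gradient_accumulate_every=2,
    ema_decay=0.995,
)
trainer.train()
```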
What is the difference between finetuning the UNet's image layers and training the motion modules? Suppose I want to train AnimateDiff on a small new dataset (about 72 minutes of video...
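The two regimes amount to choosing which parameter groups receive gradients. Below is a hypothetical sketch, assuming the AnimateDiff convention that motion-module parameters carry "motion_modules" in their names (verify against your checkout); `unet` is any `torch.nn.Module`.

```python
# Hypothetical sketch: toggle between training the temporal motion modules
# and finetuning the pretrained image layers of an AnimateDiff-style UNet.
import torch.nn as nn

def select_trainable(unet: nn.Module, mode: str) -> None:
    for name, param in unet.named_parameters():
        is_motion = "motion_modules" in name
        if mode == "motion":    # train only the inserted temporal layers
            param.requires_grad = is_motion
        elif mode == "image":   # finetune only the original image layers
            param.requires_grad = not is_motion
        else:
            raise ValueError(f"unknown mode: {mode}")
```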
Thanks to the author for this work! When will the training code for SparseCtrl be released?
When I show Video-LLaVA a short video, given inp = 'Could you please provide a detailed description for this video? Your comprehensive video caption should allow listeners to visualize the...
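For reproducibility, a minimal inference sketch using the Hugging Face port of Video-LLaVA is shown below. The checkpoint name and "USER: ... ASSISTANT:" prompt template follow the HF model card; the frame-sampling helper (8 uniformly spaced frames) and the video path are illustrative assumptions, not the issue author's code.

```python
# Minimal sketch: captioning a short video with the HF port of Video-LLaVA.
import av
import numpy as np
from transformers import VideoLlavaProcessor, VideoLlavaForConditionalGeneration

def read_frames(path, num_frames=8):
    # Decode all frames, then keep num_frames uniformly spaced ones.
    container = av.open(path)
    frames = [f.to_ndarray(format="rgb24") for f in container.decode(video=0)]
    idx = np.linspace(0, len(frames) - 1, num_frames).astype(int)
    return np.stack([frames[i] for i in idx])

model = VideoLlavaForConditionalGeneration.from_pretrained("LanguageBind/Video-LLaVA-7B-hf")
processor = VideoLlavaProcessor.from_pretrained("LanguageBind/Video-LLaVA-7B-hf")

prompt = ("USER: <video>\nCould you please provide a detailed description "
          "for this video? ASSISTANT:")
inputs = processor(text=prompt, videos=read_frames("sample.mp4"), return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=200)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```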
Excellent job! I have three questions that are not clear to me. 1. I have some problems understanding the relation between Video-LLaVA and LanguageBind. Does Video-LLaVA use the video encoder of LanguageBind?...
```python
import torch
from languagebind import LanguageBindVideo, LanguageBindVideoTokenizer, LanguageBindVideoProcessor

pretrained_ckpt = 'LanguageBind/LanguageBind_Video_FT'  # also 'LanguageBind/LanguageBind_Video'
model = LanguageBindVideo.from_pretrained(pretrained_ckpt, cache_dir='./cache_dir')
tokenizer = LanguageBindVideoTokenizer.from_pretrained(pretrained_ckpt, cache_dir='./cache_dir')
video_process = LanguageBindVideoProcessor(model.config, tokenizer)

model.eval()
data = ...
```
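The snippet is cut off at the `data = ...` line. An assumed continuation, following the usage shown in the LanguageBind README (the video path and text are placeholders, not the issue author's actual inputs), would look like:

```python
# Assumed continuation per the LanguageBind README: the processor takes a list
# of video paths and a list of texts; the model returns CLIP-style embeddings.
data = video_process(['your/video.mp4'], ['a textual description'], return_tensors='pt')
with torch.no_grad():
    out = model(**data)
print(out.text_embeds @ out.image_embeds.T)  # video/text similarity matrix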
In the [Reason-Edit evaluation benchmark](https://drive.google.com/drive/folders/1QGmye23P3vzBBXjVj2BuE7K3n8gaWbyQ), why does each image come with a corresponding mask when the mask is not used in test/DS_SmartEdit_test.py?
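One common role for such masks in editing benchmarks is scoring background preservation outside the edited region. The sketch below is a hypothetical mask-restricted metric illustrating that idea, not SmartEdit's actual evaluation code.

```python
# Hypothetical sketch: PSNR computed only where mask == 0, i.e. over the
# region that an edit is supposed to leave unchanged. Not from the SmartEdit
# repository; mask is HxW, images are HxWx3 uint8 arrays.
import numpy as np

def masked_psnr(edited: np.ndarray, original: np.ndarray, mask: np.ndarray) -> float:
    keep = mask == 0
    diff = edited[keep].astype(np.float64) - original[keep].astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)
```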