Shiwei Zhang comments

Results: 73 comments by Shiwei Zhang

I want to train a text-to-video model. Where is 'stable_diffusion_image_key_temporal_attention_x1.json', which is referenced in 't2v_train.yaml'?

Sorry, we missed this file. It has been updated and is available [here](https://github.com/ali-vilab/i2vgen-xl/blob/main/data/stable_diffusion_image_key_temporal_attention_x1.json) for your reference.

I2V architecture

Thank you for your interest in our work. 1. Yes, we are using the LDM method for videos. 2. In the base stage, we input the image (extracting CLIP features...

Best text condition to generate a video containing a 360-degree image rotation.

Hi. We haven't actually tried generating this motion pattern from text before. Sorry about that.