generative-models

How to get the text-to-video model

Open · WuTao-CS opened this issue 1 year ago • 11 comments

Exciting work! May I ask where the text-to-video model mentioned and used in the paper can be obtained? I only saw the waitlist for access to a new upcoming web interface. Are there any plans to open-source it?

WuTao-CS avatar Nov 22 '23 03:11 WuTao-CS

mkdir checkpoints
cd checkpoints
wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors

crapthings avatar Nov 22 '23 03:11 crapthings

mkdir checkpoints
cd checkpoints
wget 'https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors?download=true'

Thank you, but this is the image-to-video model; I'm asking about the text-to-video model.

WuTao-CS avatar Nov 22 '23 04:11 WuTao-CS

Text-to-video isn't out yet.

Fearblade66 avatar Nov 22 '23 05:11 Fearblade66

mkdir checkpoints
cd checkpoints
wget 'https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors?download=true'

Thank you, but this is the image-to-video model; I'm asking about the text-to-video model.

I think it's easy to combine diffusers and image-to-video to do this.

crapthings avatar Nov 22 '23 08:11 crapthings

Yes, I think it's easy to create such a pipeline:

  1. Generate an image using a good fine-tuned SD 1.5 model.
  2. Use that image as the reference image for the image-to-video model.

Maybe that's how they do it in the demo video; see the sketch below.
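
A minimal sketch of that two-step pipeline with the diffusers library follows. This is not the repo's own pipeline: the model IDs, prompt, 1024x576 resize, and sampling parameters are assumptions based on the public diffusers API and SVD defaults.

import torch
from diffusers import StableDiffusionPipeline, StableVideoDiffusionPipeline
from diffusers.utils import export_to_video

# Step 1: text -> image with an off-the-shelf SD 1.5 checkpoint (assumed model ID).
t2i = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = t2i("a red panda walking through a bamboo forest").images[0]

# SVD expects 1024x576 conditioning frames, so resize before conditioning.
image = image.resize((1024, 576))

# Step 2: image -> video with the released SVD img2vid checkpoint.
i2v = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
frames = i2v(image, decode_chunk_size=8).frames[0]  # list of PIL frames
export_to_video(frames, "generated.mp4", fps=7)

This needs a diffusers release that already ships StableVideoDiffusionPipeline (0.24 or later). Note that the prompt only controls the first frame; the motion itself is up to the video model.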

CyberTimon avatar Nov 22 '23 13:11 CyberTimon

So has anyone managed to run it? Even image-to-video?

vicitooo avatar Nov 22 '23 17:11 vicitooo

The paper they released doesn't indicate that there will be a text-to-video model. It seems the intention is to combine image-to-video models with traditional text-to-image models to generate the initial frame.

From the paper:

Finally, many recent works tackle the task of image-to-video synthesis, where the start frame is already given and the model has to generate the consecutive frames [30, 93, 108]. Importantly, as shown in our work (see Figure 1) when combined with off-the-shelf text-to-image models, image-to-video models can be used to obtain a full text-(to-image)-to-video pipeline.

dgparker avatar Nov 25 '23 04:11 dgparker

So has anyone managed to run it? Even image-to-video?

Yup... it works. After you install the package and prepare the environment following the instructions, you need to download the model as mentioned by @crapthings:

mkdir checkpoints
cd checkpoints
wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors

Then run the Streamlit demo, changing <your_port> to whatever port you want:

streamlit run scripts/demo/sampling.py --server.port <your_port>

If you are running this on a remote machine, make sure to tunnel the port to your local machine first, e.g. with SSH port forwarding: ssh -L <your_port>:localhost:<your_port> <user>@<remote_host>

Then navigate your browser to: localhost:<your_port>/

Example:

streamlit run scripts/demo/sampling.py --server.port 8888

Then navigate to: localhost:8888/

gutzcha avatar Nov 29 '23 08:11 gutzcha

@CyberTimon but how would you control what happens in the video?

mayank64ce avatar Mar 05 '24 19:03 mayank64ce

Hey @mayank64ce, I'm sorry but I can't tell you. I'm not that experienced with Stable Video Diffusion etc.

CyberTimon avatar Mar 05 '24 20:03 CyberTimon
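
For what it's worth, the diffusers StableVideoDiffusionPipeline does expose a few coarse knobs, sketched below reusing i2v and image from the example above. The values shown are the documented defaults, and none of these gives prompt-level control over the content.

frames = i2v(
    image,
    fps=7,                    # frame rate the model is conditioned on
    motion_bucket_id=127,     # higher values = more motion in the output
    noise_aug_strength=0.02,  # more noise = output follows the input image less closely
    decode_chunk_size=8,      # decode a few frames at a time to save VRAM
).frames[0]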

In my view, the technical paper is quite misleading regarding the text-to-video part. By default, you would assume the released code is aligned with what is claimed, but unfortunately that's currently not the case.

Mercurise avatar Mar 19 '24 11:03 Mercurise