generative-models SVD vram requirement

It would be cool to document the SVD VRAM requirements.

I tried it on a 4090 and ended up with:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.94 GiB. GPU 0 has a total capacty of 23.99 GiB of which 0 bytes is free. Of the allocated memory 17.76 GiB is allocated by PyTorch, and 1.34 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
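
For anyone wondering about the max_split_size_mb hint at the end of that message: it is an allocator option passed through the PYTORCH_CUDA_ALLOC_CONF environment variable, and to be safe it should be set before PyTorch is imported. A minimal sketch, assuming Python; the helper name and the 128 MB value are my own placeholders, not something recommended by the repo. It only helps against fragmentation, not when the model simply needs more VRAM than the card has.

```python
import os


def configure_cuda_allocator(max_split_size_mb: int = 128) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF unless it is already set.

    Hypothetical helper for illustration; 128 is an arbitrary example value.
    Call this before the first `import torch`.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    # setdefault keeps any value that was already exported in the shell
    return os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", value)


if __name__ == "__main__":
    print(configure_cuda_allocator())
```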

Nov 21 '23 21:11 maciekpoplawski

OK... I was devastated :D but in the end I went 14, 13, 10, 5 ... 2 ... 1...

decoding_t: int = 1,  # Number of frames decoded at a time! This eats most VRAM. Reduce if necessary.
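
My rough understanding of why this helps: the decoder only has to hold decoding_t frames at once instead of all of them. A minimal sketch of that idea in plain Python; decode_in_chunks and decode_fn are hypothetical names for illustration, and this is not the repo's actual decoder code.

```python
from typing import Callable, List, Sequence, TypeVar

LatentT = TypeVar("LatentT")
FrameT = TypeVar("FrameT")


def decode_in_chunks(
    latents: Sequence[LatentT],
    decode_fn: Callable[[Sequence[LatentT]], List[FrameT]],
    decoding_t: int = 1,
) -> List[FrameT]:
    """Decode `latents` at most `decoding_t` items at a time.

    `decode_fn` stands in for the expensive decoder call (an assumption);
    a smaller `decoding_t` means more calls but a smaller batch per call.
    """
    if decoding_t < 1:
        raise ValueError("decoding_t must be at least 1")
    frames: List[FrameT] = []
    for start in range(0, len(latents), decoding_t):
        chunk = latents[start:start + decoding_t]
        frames.extend(decode_fn(chunk))
    return frames
```

With 14 latents and decoding_t=5 the decoder would be called three times, on 5, 5 and 4 items.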

https://github.com/Stability-AI/generative-models/assets/54249329/d48e8268-cd4a-4406-a377-7257d8f4cfe4

And I GOT IT :D

Nov 21 '23 22:11 maciekpoplawski

almost :D mp4 codec is broken :D

Nov 21 '23 22:11 maciekpoplawski

and this is my fav image :)

Nov 21 '23 22:11 maciekpoplawski

any ideas how to fix this? :( OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'

Nov 21 '23 22:11 maciekpoplawski

I've gotten "some" sort of movie by changing the resolution down to 256x256... but it's ugly... I'm not sure what I'm doing right now, but try that if you are VRAM limited. The movie looks like it hasn't completely diffused; it's very ugly, and the input image jumps around like it's some shaky-cam movie.

Not seeing how to specify an "action" in the streamlit demo, just an image...

Nov 21 '23 22:11 SpaceCowboy850

any ideas how to fix this? :( OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'

Install ffmpeg, e.g.

sudo apt update
sudo apt install -y ffmpeg

Edit: I also reinstalled opencv, though not sure if it was necessary

source .pt2/bin/activate
pip3 install -I opencv-python==4.6.0.66
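
A quick way to check whether an ffmpeg executable is actually on your PATH after installing (this should work on Windows as well). A small sketch using only the standard library; find_ffmpeg is just a name I picked, and whether the system ffmpeg affects the OpenCV warning at all is a separate question.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    print(path if path else "ffmpeg not found on PATH")
```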

Nov 21 '23 22:11 jchook

Nov 22 '23 05:11 crapthings

My graphics card is also 4090. Have you resolved it? Is it running successfully?

Nov 22 '23 06:11 loushengtao

any ideas how to fix this? :( OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'

Install ffmpeg, e.g.
sudo apt update
sudo apt install -y ffmpeg
Edit: I also reinstalled opencv, though not sure if it was necessary
source .pt2/bin/activate
pip3 install -I opencv-python==4.6.0.66

I'm on Windows :(

Nov 22 '23 07:11 maciekpoplawski

Thank you :) You made my day ❤️

Nov 22 '23 07:11 maciekpoplawski

Has anyone here been able to make this run on a "regular" Windows machine that is not some heavy-duty GPU server? Don't laugh, but I was trying to get this to work on my A4500 laptop and tried everything (including lowering decoding_t to 1), but it's taking forever (I started a run about 30 minutes ago and have gotten nothing so far).

Also, how long did it take to generate the images on a 3090 or 4090?

I've tried using the Hugging Face space (A100) and it was fast enough https://huggingface.co/spaces/multimodalart/stable-video-diffusion and I'm sure we'll get there someday with consumer laptops, but probably not today.

Personally interested in running this on "normies' computers" (think RTX A4500, etc or maybe even 3090 and 4090), and trying to gauge whether this is even an idea worth pursuing at this stage, or whether the right move is to go the quantized route and wait. Any kind of insight would be helpful, please share, thank you!

I was able to run it on a 4090 (the smaller version) using the decoding_t: int = 1 parameter in the script (the default is 14). But it maxes out the 24GB of VRAM. Generation time was not horrible. I don't know, under a minute?

Nov 22 '23 07:11 maciekpoplawski

WTF, 80GB VRAM, OMG

Nov 22 '23 08:11 D-Mad

apt update
apt install -y ffmpeg

git clone https://github.com/Stability-AI/generative-models
cd generative-models

mkdir checkpoints
cd checkpoints
wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors

cd ../

pip install -r requirements/pt13.txt
PYTHONPATH=. streamlit run scripts/demo/video_sampling.py --server.port 8005

wget https://bin.equinox.io/c/bNyj1mQVY4c/ngrok-v3-stable-linux-amd64.tgz
tar -xvzf ngrok-v3-stable-linux-amd64.tgz

./ngrok config add-authtoken "---your token here---"
./ngrok http 8005

If you are running on RunPod, these are the steps.

You should open port 8005 and pick the PyTorch 2.1 template.

Update: you don't have to open port 8005 if you use ngrok.

It looks like Streamlit uses WebSockets, so opening the port is not enough; you will get a WebSocket error (maybe there is no WebSocket reverse proxy), so I use ngrok here.

Nov 22 '23 08:11 crapthings

WTF, 80GB VRAM, OMG

48GB cards like the A40 or RTX 6000 are okay too if you lower decoding_t; I haven't tried it on 24GB.

If you set decoding_t too high, even an A100 still runs out of memory.

Nov 22 '23 08:11 crapthings

https://github.com/Stability-AI/generative-models/assets/1147704/5c90fcf0-9685-4293-9288-e6a8c1fbf2dc

https://github.com/Stability-AI/generative-models/assets/1147704/df0ab575-508d-49a4-95a4-33e62c249ece

https://github.com/Stability-AI/generative-models/assets/1147704/3e2e13db-97c7-4e9f-ab68-2ae8979073eb

https://github.com/Stability-AI/generative-models/assets/1147704/006523af-4574-49c8-824e-505d368e18c5

https://github.com/Stability-AI/generative-models/assets/1147704/ca535dfa-bed0-4b70-b3e4-81ccf25b1073

Nov 22 '23 08:11 crapthings

WTF, 80GB VRAM, OMG

48GB cards like the A40 or RTX 6000 are okay too if you lower decoding_t; I haven't tried it on 24GB.

If you set decoding_t too high, even an A100 still runs out of memory.

I can run it on an RTX 4090 with decoding_t=2 and can't edit any more :D

Nov 22 '23 08:11 D-Mad

[GIF: 000004_mbid_220_seed_28970] Aw, I forgot to loop the GIF. Sorry.

Nov 22 '23 10:11 maciekpoplawski

Does anyone know more about motion_bucket_id? What are these buckets? Can I select them instead of getting random ones? Do they mean something? :)

Nov 22 '23 11:11 maciekpoplawski

64b61aa7ed9a35631d170b29bff3c3fdee73ffd49140cb97e63be656.mp4 33f181eef446d5e08dc02f07f6a4eab4bed45aa9aed5ea67f6532371.mp4 be5e7c605fd4a1ade877355b676085828a47a16c39c1ea73d4ee0a6d.mp4 3e79f764296c8b627900e1836906b87b72f257c6cf039b292e72a3fd.mp4 9663d3f2b932b92ae49e47d68a1e7f4f442c49fa2aada05791d77bc6.mp4

What's your bucket_id? Some are more inclined towards camera motion, while others involve scene motion.

Nov 22 '23 11:11 loushengtao

https://github.com/Stability-AI/generative-models/assets/48057231/e64960fe-5f1e-4903-a02f-0c787ea66b4e

https://github.com/Stability-AI/generative-models/assets/48057231/6db693ff-49e0-431a-9b43-bcd0003d869a

the first one is 128 and the second one is 64

Nov 22 '23 11:11 loushengtao

A user named Kijai posted this on the SD Discord; I'll be testing it later!

Nov 22 '23 13:11 maciekpoplawski

This is for the UI (I didn't know there was one):

cd scripts/demo
streamlit run video_sampling.py

Nov 22 '23 14:11 maciekpoplawski

also 4090 failed...

Nov 23 '23 13:11 zcfrank1st

Check my previous post, set lowvram_mode to True, and launch the UI with:

cd scripts/demo
streamlit run video_sampling.py

Nov 23 '23 13:11 maciekpoplawski

OK, I will try it. Thank you!

Nov 23 '23 13:11 zcfrank1st

Also, decoding_t is the most important parameter for now. It defaults to 14 (your VRAM will be eaten like a pretzel), so change it to 1 and, if you want, go higher until you hit an OOM error.
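
The "start at 1 and go higher until it breaks" routine can be sketched as a small loop. This is only an illustration: largest_working_decoding_t and try_run are hypothetical names, and try_run stands in for whatever launches one generation with a given decoding_t.

```python
from typing import Callable, Iterable, Optional, Tuple, Type


def largest_working_decoding_t(
    try_run: Callable[[int], None],
    candidates: Iterable[int] = (1, 2, 4, 7, 14),
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
) -> Optional[int]:
    """Try increasing decoding_t values and return the last one that worked.

    `try_run(t)` is assumed to raise one of `oom_errors` when memory runs out.
    Returns None if even the first candidate fails.
    """
    best: Optional[int] = None
    for t in candidates:
        try:
            try_run(t)
        except oom_errors:
            break  # larger values will presumably fail too
        best = t
    return best
```

With PyTorch you would presumably pass oom_errors=(torch.cuda.OutOfMemoryError,), the exception name from the traceback at the top of this thread, and free cached memory between attempts (for example with torch.cuda.empty_cache()).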

Nov 23 '23 13:11 maciekpoplawski

any ideas how to fix this? :( OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'

Hey, bro. Has this problem been solved? I'm also on the Windows platform, rtx4090. ffmpeg version: 6.0-essentials opencv-python version: 4.6.0.66

Nov 25 '23 05:11 hx3333

Oh yeah, I solved this problem by changing cv2.VideoWriter_fourcc(*"MP4V") to cv2.VideoWriter_fourcc('m', 'p', '4', 'v').
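
For context, the two tags in the warning are just the upper-case and lower-case four-character codes packed into a 32-bit integer, first character in the lowest byte. A plain-Python sketch of that packing; fourcc here is my own helper that mirrors what I understand cv2.VideoWriter_fourcc to compute, so it runs without OpenCV.

```python
def fourcc(c1: str, c2: str, c3: str, c4: str) -> int:
    """Pack four characters into a 32-bit code, first character in the low byte."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)


if __name__ == "__main__":
    print(hex(fourcc(*"MP4V")))  # 0x5634504d, the tag reported as unsupported
    print(hex(fourcc(*"mp4v")))  # 0x7634706d, the tag OpenCV falls back to
```

So the change amounts to passing the lower-case code, which is the one the warning says OpenCV falls back to anyway.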

Nov 25 '23 05:11 hx3333

LOL and this solved the issue? Congrats!!!

Nov 25 '23 07:11 maciekpoplawski

To everyone who ends up in this issue: I highly recommend checking out the newest version of ComfyUI and their latest blog post, where you will find example workflows for SVD and SVD XT. I heard that somebody was able to run the XT version on 8GB of VRAM.

Nov 25 '23 07:11 maciekpoplawski