Open-Sora: can I run inference with 2 NVIDIA RTX 4090s?

Hi,

Thanks for Open-Sora! I am trying to run inference on my workstation (2 NVIDIA RTX 4090s) as follows.

My command:
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path ./checkpoint/OpenSora-v1-HQ-16x256x256.pth

What I tested:

If I keep the current config (samples = 16 in 16x256x256.py), I get a GPU out-of-memory error.
If I change the config to samples = 4, the output sample.mp4 is generated successfully, but the image quality is poor. I am using the same prompt, "A serene night scene in a forested area...", and I can see some buildings and cars in my output, but the video quality is much lower than in your sample.

My questions:

Can I run inference on an NVIDIA RTX 4090? If the 24 GB of memory on one 4090 is not enough, can I run inference across two 4090s?
How can I get output quality comparable to your samples? Can I get high-quality video from your released model weights (OpenSora-v1-HQ-16x256x256.pth), or do I need to continue training them?

Thanks!

My output sample with OpenSora-v1-HQ-16x256x256.pth and samples = 4:

https://github.com/hpcaitech/Open-Sora/assets/163908077/43b368b9-328f-4518-9b78-585ebdca16d5
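On the two-GPU question: raising --nproc_per_node makes torchrun start one process per GPU and export RANK, LOCAL_RANK and WORLD_SIZE to each of them. The sketch below shows one common way a script can use those variables, namely splitting the prompt list across processes (data parallelism). It is an illustration only: `shard_prompts` and the prompt list are hypothetical, and this thread does not confirm that scripts/inference.py works this way. Under this scheme every process still loads a full copy of the model, so a second 4090 would shorten a multi-prompt run but would not increase the memory available to a single sample.

```python
import os


def shard_prompts(prompts, rank, world_size):
    """Give each process every world_size-th prompt, starting at its own rank."""
    return prompts[rank::world_size]


def main():
    # torchrun exports these for every worker process.
    # The defaults let the file also run as a plain single process.
    rank = int(os.environ.get("RANK", "0"))
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))

    # Hypothetical prompt list, for illustration only.
    prompts = ["prompt 0", "prompt 1", "prompt 2", "prompt 3"]
    mine = shard_prompts(prompts, rank, world_size)

    # A real script would pin this process to GPU `local_rank` before sampling.
    print(f"rank {rank}/{world_size} on GPU {local_rank} handles {mine}")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as shard_demo.py, this can be launched with `torchrun --standalone --nproc_per_node 2 shard_demo.py` to see the two processes take alternating prompts.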

Mar 18 '24 19:03 minounou

Hey @minounou, this is unrelated to your question, but would you mind sharing your torch, torchvision, and NVIDIA driver versions? I also have a 4090 but am having trouble getting my install to work.

Thanks

Mar 18 '24 19:03 jstmn

@jstmn

OK, sure. Here is my setup (my CUDA is 12.2; I have 2 NVIDIA RTX 4090s, but it seems I can only use one of them for now):

pytorch...
conda create -n opensora20240317 python=3.10
conda activate opensora20240317
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Install flash attention:
pip install packaging ninja
pip install flash-attn --no-build-isolation

Install apex (I had to comment out line 39 of setup.py so that CUDA 12.2 and my PyTorch build could work together):
git clone https://github.com/NVIDIA/apex.git
cd apex
(comment out setup.py line 39)
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./

Install xformers:
pip install -U xformers --index-url https://download.pytorch.org/whl/cu121

Install this project:
git clone https://github.com/hpcaitech/Open-Sora
cd Open-Sora
pip install -v .

Model weights: download the T5 and Open-Sora model weights (linked from the GitHub page).

Inference (change sample from 16 to 4; otherwise I get a GPU out-of-memory error):
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path ./checkpoint/OpenSora-v1-HQ-16x256x256.pth
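For comparing installs like the one above, a small script can gather the versions that were asked about. This is a minimal sketch and not part of Open-Sora: `collect_report` and its helpers are hypothetical names, and the script only relies on standard PyTorch attributes (`torch.version.cuda`, `torch.cuda.get_device_properties`) and the `nvidia-smi --query-gpu=driver_version` query. It returns partial results instead of failing when torch or the driver tools are missing.

```python
import importlib
import platform
import shutil
import subprocess


def package_version(name):
    """Return the package's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception:
        return None
    return getattr(module, "__version__", "unknown")


def driver_version():
    """Ask nvidia-smi for the driver version; None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()


def collect_report():
    """Collect Python, torch, torchvision, driver and per-GPU memory info."""
    report = {
        "python": platform.python_version(),
        "torch": package_version("torch"),
        "torchvision": package_version("torchvision"),
        "driver": driver_version(),
        "gpus": [],
    }
    if report["torch"] is not None:
        import torch

        # CUDA version the torch build was compiled against (None on CPU builds).
        report["cuda_build"] = torch.version.cuda
        if torch.cuda.is_available():
            for index in range(torch.cuda.device_count()):
                props = torch.cuda.get_device_properties(index)
                total_gib = round(props.total_memory / 2**30, 1)
                report["gpus"].append((props.name, total_gib))
    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```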

Mar 18 '24 20:03 ghost

Same question for a single NVIDIA RTX 4090, since that is probably the setup most casual users and some universities have.

Mar 19 '24 02:03 c4fun

I'm using a single 3090 and can run both 16x256x256 and 16x512x512 with the batch size set to 1.

Mar 19 '24 07:03 tanghaom

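@tanghaom wrote: "I'm using a single 3090 and can run both 16x256x256 and 16x512x512 with the batch size set to 1."

Is that for inference only? For fine-tuning it probably isn't enough, right?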

Mar 20 '24 09:03 byhwhite

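@tanghaom wrote: "I'm using a single 3090 and can run both 16x256x256 and 16x512x512 with the batch size set to 1."

@byhwhite wrote: "Is that for inference only? For fine-tuning it probably isn't enough, right?"

Yes, that's right.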

Mar 20 '24 09:03 tanghaom

@tanghaom After setting the batch size to 1, I can now generate 16x256x256. Thanks!

https://github.com/hpcaitech/Open-Sora/assets/163908077/d64c1465-362d-4c9e-9c55-1bfb22dd39a4

Mar 23 '24 22:03 ghost

https://github.com/hpcaitech/Open-Sora/assets/13972782/9c48b0f0-1f25-4f89-afdb-5a28aefca982

Prompt: A soaring drone footage captures the majestic beauty of a coastal cliff, its red and yellow stratified rock faces rich in color and against the vibrant turquoise of the sea. Seabirds can be seen taking flight around the cliff's precipices. As the drone slowly moves from different angles, the changing sunlight casts shifting shadows that highlight the rugged textures of the cliff and the surrounding calm sea. The water gently laps at the rock base and the greenery that clings to the top of the cliff, and the scene gives a sense of peaceful isolation at the fringes of the ocean. The video captures the essence of pristine natural beauty untouched by human structures.

https://github.com/hpcaitech/Open-Sora/assets/13972782/a8ac6e18-301c-4e97-ac07-e1e3e53a7cae

Prompt: A basketball bouncing on a grey sidewalk

It seems that short prompts give much less accurate and realistic results.

This is with num_frames=8, the 16x256x256 config, and batch_size=1, which is the maximum I can run on my 4090.
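The two settings reported as fitting on a 24 GB card can be written down as a config fragment. The sketch below is a hedged illustration, not the actual contents of 16x256x256.py: only the names `num_frames` and `batch_size` come from this thread, and `iter_batches` is a hypothetical helper. It assumes batch_size is the number of prompts sampled together, in which case lowering it reduces peak memory because fewer videos are held on the GPU at once.

```python
# Values reported in this thread as fitting on a single RTX 4090.
num_frames = 8   # frames generated per video
batch_size = 1   # assumption: number of prompts sampled together


def iter_batches(prompts, batch_size):
    """Yield consecutive chunks of at most batch_size prompts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield prompts[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical prompts; with batch_size = 1 each one is sampled on its own.
    prompts = ["first prompt", "second prompt", "third prompt"]
    for batch in iter_batches(prompts, batch_size):
        print(f"sampling {len(batch)} video(s) of {num_frames} frames: {batch}")
```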

Mar 27 '24 23:03 jstmn
