LTX-Video

Distilled 9.6: How much VRAM is needed, and how long a video can be generated?

Open libai-lab opened this issue 8 months ago • 9 comments

Distilled 9.6: How much VRAM is needed, and how long a video can be generated?

libai-lab avatar Apr 19 '25 07:04 libai-lab

Not gonna work with 6gb of vram for sure

607mark avatar Apr 20 '25 19:04 607mark

Works with 8 GB. Just use a lower resolution: start with something like 320x240, a 15 fps target, and 33 frames, and use tiled VAE with a tile size lower than 512.

patientx avatar Apr 20 '25 22:04 patientx
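
A note on picking these numbers, as a rough sketch: to my understanding the LTX-Video README says the model works on resolutions divisible by 32 and frame counts of the form 8*n + 1, and pads other inputs before cropping back. Assuming that holds, the helpers below (hypothetical names, not part of the repo) snap a requested size up to the nearest such values so you can see what the model would effectively process.

```python
def snap_to_multiple(value: int, multiple: int = 32) -> int:
    """Round a width or height up to the nearest multiple (assumed: 32)."""
    return max(multiple, -(-value // multiple) * multiple)


def snap_frames(frames: int) -> int:
    """Round a frame count up to the nearest value of the form 8*n + 1."""
    frames = max(1, frames)
    return -(-(frames - 1) // 8) * 8 + 1


if __name__ == "__main__":
    # The low-VRAM starting point suggested above: 320x240, 33 frames.
    width, height, frames = 320, 240, 33
    print(snap_to_multiple(width), snap_to_multiple(height), snap_frames(frames))
```

Under that assumption, 33 frames already fits the pattern, while a height of 240 would be padded up, so the memory actually used may be slightly higher than the requested size suggests.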

> Not gonna work with 6gb of vram for sure

In my experience 6 GB is totally fine, though. I've been able to do more than 100 frames at 480p resolution. Perhaps ComfyUI's memory management is better than this repo's, but in a sense ComfyUI is also an official implementation (the plugin is fully maintained by Lightricks themselves).

able2608 avatar Apr 21 '25 02:04 able2608

> > Not gonna work with 6gb of vram for sure
>
> In my experience 6GB is totally fine though. Been able to do more than 100 frames with 480p resolution. Perhaps ComfyUI's memory management is better than this repo's, but in a sense ComfyUI is also an official implementation (the plugin is fully maintained by LightTricks themselves).

You're right, it actually runs quite well with ComfyUI on just 6 GB of VRAM, which honestly surprised me. I was expecting the opposite.

607mark avatar Apr 21 '25 14:04 607mark

Even 128x128 at 30 frames is causing OOM on a 24 GB A5000. I think it's impossible to run locally without ComfyUI. Maybe a card with more than 32 GB can work.

AvirupJU avatar Apr 22 '25 16:04 AvirupJU

I guess that most of the VRAM burden comes from text encoding with those heavy T5 models (on my Comfy setup, using the vanilla checkpoint without any memory optimization tricks, it pretty much never OOMs on my 6 GB card). I'm not sure whether the official repo automatically unloads them when they are not in use (which is essential for optimizing memory use). On the other hand, ComfyUI supports quantized clip models, which really makes a huge difference to the VRAM requirement.

able2608 avatar Apr 22 '25 17:04 able2608
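
To put rough numbers on the text encoder point, here is a back-of-the-envelope sketch of weight memory at different precisions. The parameter count is an assumption (the T5 v1.1 XXL encoder is commonly cited at roughly 4.7 billion parameters), and the figures cover weights only, not activations or the video model itself.

```python
def weight_gib(params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return params * bytes_per_param / 1024**3


# Assumption: approximate parameter count of the T5 v1.1 XXL encoder.
T5_XXL_ENCODER_PARAMS = 4.7e9

if __name__ == "__main__":
    for name, nbytes in [("fp16", 2.0), ("fp8", 1.0), ("4-bit", 0.5)]:
        gib = weight_gib(T5_XXL_ENCODER_PARAMS, nbytes)
        print(f"{name}: about {gib:.1f} GiB of weights")
```

If that estimate is in the right range, the fp16 encoder alone would not fit on a 6 GB card, so it either has to be offloaded to system RAM when not in use or loaded in a quantized form.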

GTX 1650 Mobile, 4 GB VRAM, 32 GB RAM: 768x512 resolution, 49 frames, 24 fps, 8 steps, with google_t5-v1_1-xxl_encoderonly-fp16. Generation takes exactly 1 minute. The time doubles when the dimensions or the frame count are doubled. It was possible to make a 1024x768, 129-frame video in roughly 10 minutes. Extremely impressive; generating the initial image takes longer than the video. Black magic.

EduardVoy avatar Apr 25 '25 03:04 EduardVoy
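
As a quick sanity check on those timings, the sketch below applies a purely linear model (time proportional to width x height x frames) to the two data points reported above. The linear model is an assumption for illustration, not a measured property of LTX-Video.

```python
def linear_estimate_minutes(base_minutes, base, target):
    """Scale a measured time by the ratio of width * height * frames."""
    bw, bh, bf = base
    tw, th, tf = target
    return base_minutes * (tw * th * tf) / (bw * bh * bf)


if __name__ == "__main__":
    # Reported: 768x512, 49 frames took about 1 minute.
    estimate = linear_estimate_minutes(1.0, (768, 512, 49), (1024, 768, 129))
    print(f"linear estimate for 1024x768, 129 frames: {estimate:.1f} min")
```

The linear estimate comes out at a bit over 5 minutes, about half the roughly 10 minutes reported, which suggests that cost grows faster than linearly at larger sizes on this setup (for example from attention cost or from extra offloading to system RAM).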

> Even 128x 128 x 30 is causing OOM on a 24GB A5000. I think without ComfyUI it's impossible to run locally. Maybe a card with >32GB can work

I haven't been able to get this to run using just Python inference at 512x512, 128 frames, 24 fps. I run out of memory on a 32 GB RTX 5090 even though 31 GB was free. I haven't used ComfyUI before; I was looking to build a workflow using just Python.

MrEdwards007 avatar May 11 '25 18:05 MrEdwards007

Using Distilled 9.6 on 8 GB of VRAM at high resolution with 249 frames, generation takes 4 minutes in total. It should work on 6 GB.

nitinmukesh avatar May 11 '25 19:05 nitinmukesh