youngjae-you
Results: 73 comments of youngjae-you
@Louym Does tinychat not support multi-GPU load inference?
@Louym When running the 8b-video model on a 4090 GPU, I'm getting an OOM (Out of Memory) error. Would it be okay to proceed with AWQ quantization after reducing the...
> Thank you for reaching out. In my experience, you can run the W4A16-quantized NVILA 8B model with no more than 128 frames on a single 4090 GPU. Are...