youngjae-you

73 comments by youngjae-you

@Louym Does tinychat not support loading a model across multiple GPUs for inference?

@Louym When running the 8b-video model on a 4090 GPU, I'm getting an OOM (Out of Memory) error. Would it be okay to proceed with AWQ quantization after reducing the...

> Thank you for reaching out. In my experience, you can run the W4A16-quantized NVILA 8B model with no more than 128 frames on a single 4090 GPU. Are...
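
For a rough sense of why 4-bit weight quantization helps here, the sketch below estimates weight storage only. It assumes "8B" means exactly 8e9 parameters and a 24 GiB RTX 4090. It ignores quantization scales and zero points, any components kept in FP16, activations, and the KV cache for the video frames, so real usage will be higher. The helper names `weight_bytes` and `to_gib` are made up for illustration and are not part of tinychat or AWQ.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits_per_weight // 8


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


N_PARAMS = 8_000_000_000  # assumption: "8B" taken as exactly 8e9 parameters
GPU_GIB = 24              # assumption: 24 GiB of memory on an RTX 4090

for label, bits in (("FP16 weights", 16), ("W4 (4-bit) weights", 4)):
    gib = to_gib(weight_bytes(N_PARAMS, bits))
    # Whatever is left must cover activations, KV cache, and everything else.
    print(f"{label}: ~{gib:.1f} GiB, leaving ~{GPU_GIB - gib:.1f} GiB of {GPU_GIB} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the FP16 footprint, and the remaining memory is what the frame count competes for.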