
Inference VILA 3b

anhnhust opened this issue 10 months ago · 0 comments

I am trying to run inference with the VILA 3B model, but I hit an error when launching the server with:

```
python3 scripts/launch_triton_server.py --world_size 1 --model_repo=multimodal_ifb/ --tensorrt_llm_model_name tensorrt_llm,multimodal_encoders --multimodal_gpu0_cuda_mem_pool_bytes 300000000
```

[error screenshot attached]

Environment:

- Container: nvcr.io/nvidia/tritonserver:24.11-trtllm-python-py3
- transformers 4.43.4
- tensorrt_llm 0.15.0

anhnhust · Dec 22, 2024