> Take a look at this page: https://github.com/LC1332/Chat-Haruhi-Suzumiya/tree/main/characters/novel_collecting

Thanks! I'll take another look.
> You misspelled the parameter name, it should be `--gpu-memory-utilization` with a "z" in it

Yeah, I misspelled it here, but it's spelled correctly in my script; otherwise an error would occur...
> Can you show the full command you used?

`vllm-serve.sh`:

```shell
vllm serve /home/nvidia/zsx/ckpt/Qwen2.5-VL-7B-Instruct \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.8 \
    --max-model-len 4096
```

PS: my machine is a Jetson AGX...
> If you run `nvidia-smi` it should show that only 20% of the memory is used

Unfortunately, the output of `nvidia-smi` is the same as the `jtop` one, meaning it always takes...
> Can you show how you're launching vLLM? Are you using the command line directly, or are you launching it from another script?

In my previous comment, I mentioned that I...
> I see you set both `--num-gpus` and `--tensor-parallel-size`. But `--num-gpus` isn't a parameter in vLLM, so an error should result from that.
>
> **Edit:** Just noticed that you're...
> Just to check whether `CUDA_VISIBLE_DEVICES` is working properly, can you try importing vanilla PyTorch and see if idle memory is allocated in the correct GPUs?

Of course; it allocates correctly...
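A minimal sketch of that check (the env-var parsing needs neither CUDA nor PyTorch; the allocation part only runs if `torch` is installed and a GPU is visible):

```python
import os

def visible_devices():
    """Return the GPU indices CUDA apps would see, per CUDA_VISIBLE_DEVICES."""
    raw = os.environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: all GPUs are visible
    return [int(i) for i in raw.split(",") if i.strip()]

# Restrict to GPU 0 only. This must be set BEFORE torch initializes CUDA,
# or it has no effect on which devices the process can use.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
print(visible_devices())  # [0]

try:
    import torch
    if torch.cuda.is_available():
        x = torch.zeros(1, device="cuda")       # lands on the first *visible* GPU
        print(torch.cuda.memory_allocated())    # nonzero -> allocation happened here
        # Now check `nvidia-smi` / `jtop`: only GPU 0 should show this process.
except ImportError:
    pass  # PyTorch not installed in this environment
```

If the allocation shows up on a GPU other than the one listed in `CUDA_VISIBLE_DEVICES`, the variable is being set too late (after CUDA initialization) or overwritten by a wrapper script.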
> cc [@youkaichao](https://github.com/youkaichao)

Finally, I figured out this issue by using `-tp` instead of `--tensor-parallel-size`, which does not work at all!
> > cc [@youkaichao](https://github.com/youkaichao)
> >
> > Finally, I figured out this issue by using `-tp` instead of `--tensor-parallel-size`, which does not work at all!

So it's the same problem as...
> I suggest updating vLLM to see if the problem goes away

But the vLLM version is already the latest one (0.8.5.post1).