slr1997
waiting for the solution too
waiting for the new prompts too
I just provided one page of a paper for translation, and it failed with this error: `CUDA_VISIBLE_DEVICES=2,3 python -m sglang.launch_server --model-path Qwen/Qwen2.5-VL-7B-Instruct --tp-size 2 --dp-size 1 --host 0.0.0.0 --port 4321 --mem-fraction-static...
@mickqian I just sent an image to it through Open WebUI. It sometimes fails with the above error and sometimes works fine. I will reproduce it later and provide the debug-level...
@zh-jp Did you compare its speed against llama.cpp? And what is the minimum amount of memory it needs?