jessie-zhao

Results 6 issues of jessie-zhao

Hi all. I am running DeepStream 6.1 on an A10 on Ubuntu 20.04. When running a yolov5s model with INT8 calibration, I got the issue below. Can someone help with this? #deepstream-app -c ./deepstream_app_config.txt ... Total...

Model list: • https://huggingface.co/Nanbeige/Nanbeige2-8B-Chat • https://huggingface.co/Nanbeige/Nanbeige2-16B-Chat • https://huggingface.co/codellama/CodeLlama-34b-hf Test criteria (SLO): run concurrent-request tests and find the maximum concurrency achievable within the TTFT and TPOT limits. Case 1: • Input 4096, output 1024 • TTFT: 3 s, TPOT: 100 ms Case 2: • Input...

user issue
multi-arc

Ran a vLLM serving test on Arc and hit the issue below: INFO 07-04 19:10:08 async_llm_engine.py:152] Aborted request cmpl-e5fb5cad96e9402dabbbece3611ae22f-0. INFO: 127.0.0.1:41772 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error ERROR: Exception in ASGI...

user issue

Hit OOM on Arc with 6k input/512 output with vLLM serving. Models: ChatGLM3-6B and Qwen1.5-32B on 4 Arc GPUs.

user issue

Got only 10 parallel requests on 2 Arc GPUs with a Qwen1.5 model (1024 input/512 output). Could you please improve the performance?

user issue

Verifying the output of the glm4-9b-chat model as follows produces an error on the serving side: curl --request POST \ --url http://127.0.0.1:8000/v1/chat/completions \ --header 'content-type: application/json' \ --data '{ "model": "glm-4-9b-chat", "temperature": 0.7, "top_p": 0.8, "messages": [ { "role": "system", "content": "Below is an instruction...

user issue
multi-arc