Kai Huang
We haven't officially supported or benchmarked oneAPI 2024.1 yet. We'll keep you updated when it's ready :)
Did you enable quantized KV cache when you ran this test?
Updated data from fred: with batch=8, the latency is unreasonably high compared with batch=4/16.
We have reproduced this issue. Something seems wrong with batch sizes 7/8; we are looking into it.
> chatglm3-6b with batch 8 has the same problem

The model you reported in this issue is already chatglm3-6b. We are fixing it; it should be resolved very soon.
> Thanks for your feedback, which deployment tool do you use locally?

Hi, I'm using vLLM, and I suppose the vLLM connection should be OK since I can get...
More updates here: following https://github.com/bytedance/UI-TARS?tab=readme-ov-file#start-an-openai-api-service, I can also successfully run the code via the OpenAI API against my vLLM deployment. The issue should be in the desktop app; I'm using 0.0.7, not...
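For reference, a quick way to rule out the vLLM side is to query the server's OpenAI-compatible model-list endpoint before pointing the desktop app at it. This is a minimal sketch; the base URL/port are assumptions about a default local vLLM deployment, so adjust them to match yours.

```python
# Hedged sketch: probe a local vLLM OpenAI-compatible server by listing
# its models. Base URL below assumes vLLM's default port (adjust as needed).
import json
import urllib.request


def build_models_request(base_url: str) -> urllib.request.Request:
    """Build a GET request for the server's /v1/models endpoint."""
    return urllib.request.Request(base_url.rstrip("/") + "/v1/models")


def list_model_ids(base_url: str = "http://localhost:8000") -> list[str]:
    """Return the model IDs the server reports (e.g. your UI-TARS checkpoint)."""
    with urllib.request.urlopen(build_models_request(base_url), timeout=5) as resp:
        return [m["id"] for m in json.load(resp)["data"]]


# Usage (with the server running):
# print(list_model_ids())
```

If this returns your model ID but the desktop app still fails, that points the debugging at the app's connection settings rather than the vLLM deployment.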
Sure, I will follow your suggestions. Thanks for your response!
Hi @daker11123, thanks for your reply! Could you provide more steps/details on how you did it? Are you following this guide: https://github.com/bytedance/UI-TARS-desktop/blob/main/CONTRIBUTING.md ? As I understand it, this only builds the GUI...
@rnwang04 Could you please take a look?