Kai Huang
We haven't officially supported or benchmarked oneAPI 2024.1 yet. We'll keep you updated when it's ready :)
Did you enable quantized KV cache when you ran this test?
Updated data from fred: with batch=8, the latency is unreasonably high compared with batch=4/16.
We have reproduced this issue. Something seems wrong with batch sizes 7/8; we are looking into it.
> chatglm3-6b with batch 8 has the same problem

The model you reported in this issue is already chatglm3-6b. We are fixing it; it should be resolved very soon.
> Thanks for your feedback, which deployment tool do you use locally?

Hi, I'm using vLLM, and I suppose the vLLM connection should be OK since I can get...
More updates here: following https://github.com/bytedance/UI-TARS?tab=readme-ov-file#start-an-openai-api-service, I can also successfully run the code via the OpenAI API against my vLLM deployment. The issue should be in the desktop app; I'm using 0.0.7, not...
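For reference, a quick way to rule out the vLLM side is to query the server's OpenAI-compatible model-list endpoint before pointing the desktop app at it. This is a minimal sketch; the base URL/port are assumptions about a default local vLLM deployment, so adjust them to match yours.

```python
# Hedged sketch: probe a local vLLM OpenAI-compatible server by listing
# its models. Base URL below assumes vLLM's default port (adjust as needed).
import json
import urllib.request


def build_models_request(base_url: str) -> urllib.request.Request:
    """Build a GET request for the server's /v1/models endpoint."""
    return urllib.request.Request(base_url.rstrip("/") + "/v1/models")


def list_model_ids(base_url: str = "http://localhost:8000") -> list[str]:
    """Return the model IDs the server reports (e.g. your UI-TARS checkpoint)."""
    with urllib.request.urlopen(build_models_request(base_url), timeout=5) as resp:
        return [m["id"] for m in json.load(resp)["data"]]


# Usage (with the server running):
# print(list_model_ids())
```

If this returns your model ID but the desktop app still fails, that points the debugging at the app's connection settings rather than the vLLM deployment.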
Sure, I will follow your suggestions. Thanks for your response!
Hi @daker11123, thanks for your reply! Could you provide more steps/details on how you did it? Are you following this guide: https://github.com/bytedance/UI-TARS-desktop/blob/main/CONTRIBUTING.md ? As I understand it, this only builds the GUI...
@rnwang04 Could you please take a look?