
Results 12 comments of harrywu

@hmellor Hello, it seems that the discussion has moved elsewhere. Can this issue be closed?

@robertgshaw2-neuralmagic I am just replicating the new metrics that were added [here](https://github.com/ronensc/vllm/pull/1). Should I add some automated tests, or just visually inspect the metrics? Thanks!

> Also, looks like you are missing the number of tokens in the batch for each iteration of the LLMEngine

Do you mean `num_generation_tokens_iter`? It already exists.

https://github.com/vllm-project/vllm/issues/3616#issuecomment-2030858781 @robertgshaw2-neuralmagic BTW, should some metrics be recorded by a periodic background process/thread, e.g. every 1 second?
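A minimal sketch of what that periodic recording could look like (not vLLM's actual implementation; `PeriodicMetricsLogger` and the `collect_fn` callback are hypothetical names for illustration): a daemon thread takes a snapshot of the current counters at a fixed interval instead of on every engine iteration.

```python
import threading
import time


class PeriodicMetricsLogger:
    """Takes a metrics snapshot from a daemon thread at a fixed interval."""

    def __init__(self, collect_fn, interval_s: float = 1.0):
        self._collect_fn = collect_fn      # callable returning a dict of metric values
        self._interval_s = interval_s
        self._stop = threading.Event()
        self.snapshots = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait doubles as the sleep and the shutdown signal:
        # it returns False on timeout (take a snapshot) and True when stopped.
        while not self._stop.wait(self._interval_s):
            self.snapshots.append(self._collect_fn())

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


# Usage: record a (hypothetical) counter roughly every 50 ms.
counters = {"num_generation_tokens_iter": 0}
logger = PeriodicMetricsLogger(lambda: dict(counters), interval_s=0.05)
logger.start()
counters["num_generation_tokens_iter"] = 42   # engine loop would update this
time.sleep(0.2)
logger.stop()
print(logger.snapshots[-1])
```

The main design question is the one raised above: per-iteration metrics give exact values but add overhead to the hot loop, while a sampling thread like this trades precision for a fixed, bounded recording cost.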

@robertgshaw2-neuralmagic Thanks for the attention! I made some changes to the dashboard and categorized the charts. But I'm not sure I'm showing some metrics the right way, such as: - request_params_n -...

@robertgshaw2-neuralmagic Hi, will it still be merged before Friday?

@robertgshaw2-neuralmagic ping~

@vinhtran2611 I have set `AutoModelForCausalLM.from_pretrained(torch_dtype=torch.bfloat16)` and `_load_model(precision=torch.bfloat16)`, but I get `hf_outputs.logits.dtype == torch.float32` and `output.dtype == torch.bfloat16`. Maybe it's a precision problem.
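A plausible explanation for the mismatch, sketched with plain PyTorch rather than the actual model code: even when the weights are loaded in bfloat16, some model heads explicitly upcast the logits to float32 for numerical stability, so the logits dtype need not match the weight dtype. The upcast call here is an assumption about what such a head does, not a quote from the model in question.

```python
import torch

# Weights and activations in bfloat16, as requested via torch_dtype.
lm_head = torch.nn.Linear(8, 16, bias=False, dtype=torch.bfloat16)
hidden = torch.randn(1, 8, dtype=torch.bfloat16)

logits_raw = lm_head(hidden)             # stays bfloat16
logits_upcast = lm_head(hidden).float()  # explicit upcast, as some heads do

print(logits_raw.dtype)     # torch.bfloat16
print(logits_upcast.dtype)  # torch.float32
```

If that is what is happening, comparing `hf_outputs.logits` against the other implementation's output would need both tensors cast to a common dtype first.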

Same problem here, but in my case it's over the web protocol.

@robertgshaw2-neuralmagic Hello~ I see it's been approved for a while, but it hasn't been merged yet. Is anything wrong?