Mark McLoughlin comments

Results: 50 comments by Mark McLoughlin

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

> The only request is to add 1 to the mean acceptance length, since one token will always be accepted. So mean acceptance length is essentially "average number of...

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

> Hi @markmc, as far as I know, all speculative decoding literature reporting acceptance length includes the bonus token, since this quantity aligns with "number of tokens generated per forward...
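The bonus-token convention raised in the two quoted comments above can be illustrated with a small sketch. The function and parameter names here (`mean_acceptance_length`, `num_drafts`, `num_accepted_tokens`) are hypothetical and are not taken from the PR; the only assumption is that each verification step always yields one token in addition to whatever draft tokens were accepted.

```python
def mean_acceptance_length(num_drafts: int, num_accepted_tokens: int) -> float:
    """Average tokens produced per verification step, bonus token included.

    num_drafts and num_accepted_tokens are hypothetical counter names used
    only for this illustration.
    """
    if num_drafts <= 0:
        raise ValueError("num_drafts must be positive")
    # Every verification step yields one token regardless of how many draft
    # tokens were accepted, hence the constant 1.0 added to the average.
    return 1.0 + num_accepted_tokens / num_drafts


# 100 verification steps that accepted 250 draft tokens in total.
print(mean_acceptance_length(num_drafts=100, num_accepted_tokens=250))  # 3.5
```

Without the added 1, the same counters would report 2.5, which is the average number of accepted draft tokens rather than the number of tokens generated per step.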

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

I've pushed an update that I'm not super happy with. To handle the case of DP, where we have multiple sets of metrics identified by `engine_idx`, I've had to do...
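To make the data-parallel situation concrete: when every engine reports the same metric names, the samples differ only in their `engine_idx` label, so a reader of the metrics has to either keep them apart or aggregate them. The sketch below is not the approach taken in the PR (the comment is truncated before describing it); `Sample`, `group_by_engine`, and `sum_across_engines` are hypothetical names, and the metric name is a placeholder.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    """Hypothetical stand-in for one labelled metric sample."""
    name: str
    labels: dict
    value: float


def group_by_engine(samples):
    """Return {engine_idx: {metric_name: value}} for the given samples."""
    grouped = defaultdict(dict)
    for s in samples:
        # Assumption: samples without the label belong to engine "0".
        engine = s.labels.get("engine_idx", "0")
        grouped[engine][s.name] = s.value
    return dict(grouped)


def sum_across_engines(samples, name):
    """Aggregate one metric over all engines (sensible for counters only)."""
    return sum(s.value for s in samples if s.name == name)


samples = [
    Sample("requests_total", {"engine_idx": "0"}, 12.0),
    Sample("requests_total", {"engine_idx": "1"}, 30.0),
]
print(group_by_engine(samples))
print(sum_across_engines(samples, "requests_total"))
```

Summing is only meaningful for counter-like metrics; gauges and histograms would need a different aggregation, which is part of why a per-engine view is awkward to expose through a single API.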

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

> @markmc Is this PR waiting for review? Or is it in progress?

It is waiting for review

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

> LGTM, @markmc could you just double check if the CI failure is related so that we can merge this PR?

Yes, AFAICT all of these failures are happening on...

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

> @markmc Can you please merge from main again?

Done. I don't think the rebase resolves any of the test failures, but I could be wrong

[V1][Metrics] Add API for accessing in-memory Prometheus metrics

Ok, the docs failure was a genuine - but hard-to-spot - issue with the PR:

```
vllm/docs/source/serving/engine_args.md:14: ERROR: Failed to import "_engine_args_parser" from "vllm.engine.arg_utils". No module named 'prometheus_client'
```

Fix podman+selinux compatibility

`start_container.sh` needs this too, perhaps with `:Z` for the model checkpoints dir since it would be shared between containers?

Fix podman+selinux compatibility

> `start_container.sh` needs this too, perhaps with `:Z` for the model checkpoints dir since it would be shared between containers?

Also, `build_container.sh` - and maybe for that, since it is...

[WIP] Add a metric to track request failures

Couple of points:

- For new metrics, the priority should be to add them in V1 since V0 will shortly be deprecated
- Are `FINISHED_ABORTED` requests already counted under `request_success_total[length]`...
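The second point is essentially a double-counting concern: if every finished request is already tallied under exactly one finish-reason label, a separate failure metric may overlap with it. A minimal sketch of that bookkeeping follows. The reason strings and the `FAILURE_REASONS` set are illustrative assumptions, not vLLM's actual label values.

```python
from collections import Counter

# Hypothetical set of finish reasons that would be treated as failures.
FAILURE_REASONS = {"abort", "error"}


def summarize(finish_reasons):
    """Count finished requests per reason and derive a failure total.

    Each request contributes to exactly one reason, so the failure total is
    a view over the same counts rather than an independent tally.
    """
    counts = Counter(finish_reasons)
    failures = sum(n for reason, n in counts.items() if reason in FAILURE_REASONS)
    return counts, failures


counts, failures = summarize(["stop", "length", "abort", "stop", "error"])
print(dict(counts))
print(failures)
```

If the per-reason counter already exists, deriving failures from it at query time avoids a second metric that could drift from the first.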