Simon Mo
Being worked on in #4558 and #4330
I think the reason for the pin is exactly the error here in test... ``` _________ ERROR collecting tests/entrypoints/test_guided_processors.py _________ -- | ImportError while importing test module '/vllm-workspace/tests/entrypoints/test_guided_processors.py'. | Hint:...
> I don't really see why vllm needs its own "copy" of the outlines code, rather than just importing it? This is recommended by the outlines maintainers for us to...
Hmmm. It's generating '3.5.0.10653515246264', maybe tune the temperature or prompting? https://buildkite.com/vllm/ci/builds/6481#018f3d50-d8eb-4258-8260-ef486c1466ca/51-578
Oh, some useful things to track in observability metrics: usage/performance of LoRA adapters, automatic prefix caching, chunked prefill, spec decode acceptance rate, etc.
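For the spec decode acceptance rate specifically, here is a rough sketch of the kind of counter a stats logger could keep. It assumes the usual definition (accepted draft tokens divided by proposed draft tokens). The class and field names are hypothetical and not taken from vLLM.

```python
from dataclasses import dataclass


@dataclass
class SpecDecodeCounters:
    """Hypothetical running counters; not an actual vLLM class."""

    draft_tokens: int = 0     # tokens proposed by the draft model
    accepted_tokens: int = 0  # proposed tokens the target model accepted

    def record(self, proposed: int, accepted: int) -> None:
        # Accumulate the counts from one speculative decoding step.
        if accepted > proposed:
            raise ValueError("accepted cannot exceed proposed")
        self.draft_tokens += proposed
        self.accepted_tokens += accepted

    @property
    def acceptance_rate(self) -> float:
        # Assumed definition: accepted / proposed, 0.0 before any data.
        if self.draft_tokens == 0:
            return 0.0
        return self.accepted_tokens / self.draft_tokens


counters = SpecDecodeCounters()
counters.record(proposed=4, accepted=3)
counters.record(proposed=4, accepted=1)
print(counters.acceptance_rate)
```

The same pattern (two monotonically increasing counters, ratio computed at read time) would probably work for prefix cache hit rate too.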
@nunjunj will work on this
Great list!
+1, if it works with our stats logger this is so good!
@JustinLin610 can you help take a look at this? Thank you! 🙏
This is quite interesting. Can you double check by setting `seed`?
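Something like this is what I mean by double checking: run the same prompt a few times with a fixed seed and see whether the outputs match. This is only a sketch. `toy_generate` is a stand-in sampler built on Python's `random` so the snippet runs on its own; in practice you would swap in your real generation call and pass the seed through however your setup exposes it.

```python
import random
from typing import Callable, Optional


def toy_generate(prompt: str, seed: Optional[int] = None) -> str:
    # Stand-in for a real model call (hypothetical, not vLLM code).
    # A seeded RNG yields the same "tokens" on every call.
    rng = random.Random(seed)
    return "".join(rng.choice("0123456789.") for _ in range(8))


def is_reproducible(
    generate: Callable[..., str], prompt: str, seed: int, runs: int = 3
) -> bool:
    # Call the generator several times with the same seed and
    # report whether every run produced the identical output.
    outputs = {generate(prompt, seed=seed) for _ in range(runs)}
    return len(outputs) == 1


print(is_reproducible(toy_generate, "What version is this?", seed=1234))
```

If the outputs still differ with the seed fixed, the nondeterminism is coming from somewhere other than sampling.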