Simon Mo

Results 313 comments of Simon Mo

I think this could be a good idea. Are you thinking about offline evaluation using the `LLM` interface, or via the server? Thoughts on this? @Yard1 @zhuohan123

@GeauxEric please feel free to open a PR so it's easier to give feedback.

This script can help verify that this works end to end: https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py

> My previous comment is about truncation side, as for various reasons/formats we'd either want to trim from the left or right as well and since it already a parameter...

We now support the full range of constrained/guided decoding, powered by Outlines. Closing this as completed.
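To illustrate the core idea behind constrained/guided decoding, here is a minimal pure-Python sketch. It is not vLLM's or Outlines' API (vLLM delegates the real implementation to Outlines); the `score_fn`, `choices`, and `vocab` names are illustrative. At each step, candidate tokens that cannot extend the output into a valid allowed string are masked out before picking the highest-scoring one:

```python
def constrained_decode(score_fn, choices, vocab):
    """Greedy decoding restricted to a set of allowed output strings.

    score_fn(prefix, token) -> float is a stand-in for model logits;
    choices is the set of strings the output is constrained to.
    """
    out = ""
    while out not in choices:
        # Mask: keep only tokens that keep the output a valid prefix.
        allowed = [t for t in vocab
                   if any(c.startswith(out + t) for c in choices)]
        if not allowed:
            break  # no valid continuation
        out += max(allowed, key=lambda t: score_fn(out, t))
    return out

# Toy "model" that prefers 'y', but the constraint still yields a
# string from `choices` because invalid continuations are masked.
scores = {"y": 1.0, "n": 2.0, "o": 1.0}
result = constrained_decode(
    lambda prefix, tok: scores.get(tok, 0.0),
    choices={"yes", "no"},
    vocab=list("yesno"),
)
```

Real guided decoding generalizes this prefix check from a finite set of strings to a regex or JSON-schema-derived finite-state machine evaluated against the tokenizer's vocabulary.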

If anyone has bandwidth to help us implement ChatGLM support, please leave a comment and coordinate here: https://github.com/vllm-project/vllm/issues/1552