Simon Mo issues

Results 57 issues of


                                            Simon Mo

[CI] Add Buildkite

[Feature]: Integrate with lm-format-enforcer

### 🚀 The feature, motivation and pitch While existing Outline state machine provide great state of the art performance, it is trading off a one-off compile time when working with...

feature request

[Feature]: A instruction/chat method for offline LLM class.

### 🚀 The feature, motivation and pitch We currently do not apply chat template for the offline `LLM` class. It might be useful to provide similar interface as Huggingface chat...

feature request

[Feature]: Distribute sets of default chat template for models do not provide one

### 🚀 The feature, motivation and pitch Thanks to our amazing community, we have gathered a set of good chat template for models. These template are useful when the original...

feature request

[Feature]: Update Outlines Integration from `FSM` to `Guide`

### 🚀 The feature, motivation and pitch Recently outlines updated their interface from FSM to Guide to support "acceleration"/"fast-forward" which will output next sets of tokens if they are directly...

feature request

[Feature]: Integrate with AICI

### 🚀 The feature, motivation and pitch #2888 added a prototype for AI Controller Interface, which is a WASM based runtime for guided generation. We would like to integrate this...

feature request

[CI] Use Skypilot to launch model test in runpod for A100 access

### Anything you want to discuss about vllm. Current we do not run model test on A100 machine because we can't get any capacity in GCP. https://skypilot.readthedocs.io/ supports runpod and...

misc

Add latency metrics

After #1662 (initial metrics support) and #1756 (refactoring chat endpoint), it will become practical to include latency metrics that's important to production (courtesy of @Yard1): * histogram of time to...

help wanted

good first issue

Support `tools` and `tool_choice` parameter in OpenAI compatible service

Also aliased as `functions` and `function_call` in deprecated parameters. https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools After #1756 is merged (thanks @Tostino!), it should be straightforward to add this as a core parameter to OpenAI compatible...

help wanted

good first issue

Enable mypy type checking

### Anything you want to discuss about vllm. Even though vLLM is type annotated but we did not enable type checking. It would be useful to add it, even incrementally.

misc