sglang
[FEAT] JSON constrained support
Motivation
Many LLM APIs (Together AI, Fireworks, Anyscale, ...) and other engines (vLLM, ...) support constrained generation with a JSON schema. Since outlines is already a dependency of sglang, it is straightforward to extend its usage to support JSON schemas directly in the API.
Modification
- Adding a `json_schema` parameter (in the sampling params for sglang, and as an extra parameter of CompletionRequest/ChatCompletionRequest for the OpenAI-compatible server).
- Adding a new `FSMJsonCache` for JSON. It inherits from `FSMCache`, so it works the same way, but it also stores the regex string that outlines produces from the schema. The Jump Forward Cache requires this regex string.
- Adding a unit test and updating the sampling params documentation.
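As a rough illustration of the first item, the sketch below builds a request body that passes a schema through the sampling params. The helper name `build_generate_payload`, the `text`/`sampling_params` payload layout, and the choice to send the schema as a JSON-encoded string are assumptions made for illustration; this description does not spell them out.

```python
import json

# Example schema; the field names are illustrative only.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def build_generate_payload(prompt: str, schema: dict, max_new_tokens: int = 128) -> dict:
    """Build a request body carrying a JSON schema in the sampling params.

    Assumption: the schema travels as a JSON-encoded string under the
    `json_schema` key introduced by this PR.
    """
    return {
        "text": prompt,
        "sampling_params": {
            "temperature": 0,
            "max_new_tokens": max_new_tokens,
            "json_schema": json.dumps(schema),
        },
    }


payload = build_generate_payload("Describe a person as JSON: ", PERSON_SCHEMA)
print(json.dumps(payload, indent=2))
```

The same dictionary could then be posted to a running server with any HTTP client; the endpoint and port depend on how the server was launched.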
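The cache design in the second item can be pictured with the simplified, self-contained sketch below. `SimpleFSMCache`, `SimpleFSMJsonCache`, and the injected `build_fsm` / `schema_to_regex` callables are hypothetical stand-ins, not the sglang classes or the outlines API. The only point is that the JSON variant keeps the converted regex string next to the compiled object, so a jump-forward cache can reuse it without converting the schema again.

```python
import re
from typing import Any, Callable, Dict, Tuple


class SimpleFSMCache:
    """Toy stand-in for a regex -> FSM cache (not the sglang FSMCache)."""

    def __init__(self, build_fsm: Callable[[str], Any]):
        self.build_fsm = build_fsm
        self.cache: Dict[str, Any] = {}

    def init_value(self, key: str) -> Any:
        # In the base cache the key is already a regex string.
        return self.build_fsm(key)

    def query(self, key: str) -> Any:
        # Build the value on first use, then serve it from the cache.
        if key not in self.cache:
            self.cache[key] = self.init_value(key)
        return self.cache[key]


class SimpleFSMJsonCache(SimpleFSMCache):
    """Same lookup logic, but keyed by a JSON schema string."""

    def __init__(self, build_fsm: Callable[[str], Any],
                 schema_to_regex: Callable[[str], str]):
        super().__init__(build_fsm)
        self.schema_to_regex = schema_to_regex

    def init_value(self, key: str) -> Tuple[str, Any]:
        # Convert the schema once and keep the regex alongside the FSM.
        regex = self.schema_to_regex(key)
        return regex, self.build_fsm(regex)


conversions = []


def fake_schema_to_regex(schema: str) -> str:
    # Placeholder: a real converter would derive the regex from the schema.
    conversions.append(schema)
    return r'\{"ok": (true|false)\}'


# re.compile stands in for FSM construction in this sketch.
json_cache = SimpleFSMJsonCache(re.compile, fake_schema_to_regex)
schema = '{"type": "object", "properties": {"ok": {"type": "boolean"}}}'
regex, fsm = json_cache.query(schema)
regex_again, _ = json_cache.query(schema)  # second lookup hits the cache
```

Overriding only the value-construction step keeps the lookup path shared between the two caches, which matches the "inherits, so it works the same way" description above.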
Checklist
- [x] Before submitting a PR for review, make sure it has passed verification in your local development environment. Tested in a limited environment with only 24 GB of VRAM: parts of the test suite ran out of memory, but there were no functional errors.
- [x] Ensure pre-commit `pre-commit run --all-files` or other linting tools are used to fix potential lint issues.
- [x] Confirm that modifications are covered by complete unit tests. If not, please add more unit tests for correctness.
- [x] Modify documentation as needed, such as docstrings or example tutorials.