
[FEAT] JSON constrained support

Open • havetc opened this issue 6 months ago • 1 comment

Motivation

Many LLM APIs (Together AI, Fireworks, Anyscale...) and other engines (vLLM...) support constrained generation with a JSON schema. Since outlines is already a dependency of sglang, it is straightforward to extend its usage to support JSON schemas directly in the API.

Modification

Adding a json_schema parameter (in the sampling params for sglang, and as an extra parameter on CompletionRequest/ChatCompletionRequest for the OpenAI-compatible server)
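A minimal sketch of how a client might pass the new parameter through the sampling params. The payload layout (a "text" field next to "sampling_params") and the choice to send the schema as a serialized JSON string are assumptions made for illustration, and build_generate_payload is a hypothetical helper, not part of this PR.

```python
import json

# Example schema the model output should conform to.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def build_generate_payload(prompt, schema, max_new_tokens=64):
    """Hypothetical helper: build a request body carrying json_schema.

    Assumption: the server accepts the schema as a JSON string inside
    sampling_params; adjust if it expects a dict instead.
    """
    return {
        "text": prompt,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "json_schema": json.dumps(schema),
        },
    }


payload = build_generate_payload("Describe a person as JSON: ", schema)
print(json.dumps(payload, indent=2))
```

The resulting dict could then be sent as the JSON body of a request to a running server; the endpoint and transport are left out here because they depend on the deployment.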

Adding a new FSMJsonCache for JSON. It inherits from FSMCache, so it works the same way, but it also stores the regex string that outlines generates from the schema. The Jump Forward Cache requires this regex string.
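The caching idea can be sketched roughly as follows. This is not the PR's code: SimpleFSMCache and SimpleFSMJsonCache are hypothetical stand-ins for FSMCache and FSMJsonCache, the schema-to-regex converter is a stub standing in for the conversion done by outlines, and re.compile stands in for building the FSM.

```python
import re
from typing import Callable, Dict, Tuple


class SimpleFSMCache:
    """Hypothetical stand-in for FSMCache: memoizes one value per key."""

    def __init__(self) -> None:
        self._store: Dict[str, object] = {}

    def init_value(self, key: str) -> object:
        raise NotImplementedError

    def query(self, key: str) -> object:
        # Build the value only on the first request for this key.
        if key not in self._store:
            self._store[key] = self.init_value(key)
        return self._store[key]


class SimpleFSMJsonCache(SimpleFSMCache):
    """Hypothetical JSON variant: caches (regex string, compiled matcher)."""

    def __init__(
        self,
        schema_to_regex: Callable[[str], str],
        regex_to_fsm: Callable[[str], object],
    ) -> None:
        super().__init__()
        self._schema_to_regex = schema_to_regex
        self._regex_to_fsm = regex_to_fsm

    def init_value(self, key: str) -> Tuple[str, object]:
        # Keep the regex string next to the matcher so that a consumer
        # such as a jump-forward cache can reuse it without reconverting.
        regex = self._schema_to_regex(key)
        return regex, self._regex_to_fsm(regex)


calls = []


def stub_schema_to_regex(schema_str: str) -> str:
    """Stub converter (assumption): ignores the schema, returns a fixed regex."""
    calls.append(schema_str)
    return r'\{"ok": (true|false)\}'


cache = SimpleFSMJsonCache(stub_schema_to_regex, re.compile)
regex, matcher = cache.query('{"type": "object"}')
regex_again, _ = cache.query('{"type": "object"}')  # served from the cache
```

Keying the cache on the schema string means the schema is converted once, and both the matcher and the regex string are available on every later lookup.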

Adding a unit test, and updating sampling params documentation

Checklist

  • [x] Before submitting a PR for review, make sure it has passed verification in your local development environment: tested in a limited environment with only 24GB VRAM; some tests in the suite ran out of memory, but there were no functional errors.
  • [x] Ensure pre-commit (pre-commit run --all-files) or other linting tools are used to fix potential lint issues.
  • [x] Confirm that modifications are covered by complete unit tests. If not, please add more unit tests for correctness.
  • [x] Modify documentation as needed, such as docstrings or example tutorials.

havetc, Aug 16 '24 13:08