[Feature] Use xgrammar as default grammar backend to avoid I/O errors while using Outlines in a multi-node setting

Open shuaills opened this issue 10 months ago • 1 comments

Checklist

  • [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
  • [x] 2. Please use English, otherwise it will be closed.

Motivation

Related issue: #3375
Related discussion: #vllm 4193
Related PR: https://github.com/sgl-project/sglang/pull/3379

Related resources

xGrammar stores its cache in RAM instead of disk, avoiding file system conflicts. Cache size is small (typically <0.5MB per schema), meaning it doesn't require persistent disk storage. xGrammar is thread-safe, ensuring it can run across multiple Slurm nodes without concurrency issues.

shuaills avatar Feb 07 '25 23:02 shuaills

Hey @shuaills, can I pick this up?

adi-kmt avatar Feb 17 '25 04:02 adi-kmt

I'd like to work on this, too~

liusy58 avatar Mar 05 '25 03:03 liusy58

You can deactivate Outlines' cache using:

from outlines import caching
caching.disable_cache()

And use functools.lru_cache for in-memory caching instead.
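
As a rough illustration of that suggestion, here is a minimal sketch of per-process, in-memory caching with functools.lru_cache. The function compile_schema is a hypothetical stand-in for an expensive grammar compilation step, not an Outlines or SGLang API; the schema is passed as a canonical JSON string only because lru_cache needs hashable arguments.

```python
import json
from functools import lru_cache

# Counts how often the (hypothetical) expensive step actually runs.
compile_calls = 0


@lru_cache(maxsize=128)
def compile_schema(schema_json: str) -> tuple:
    """Hypothetical placeholder for compiling a JSON schema into a grammar.

    The result lives only in this process's memory, so nothing is written
    to a shared file system.
    """
    global compile_calls
    compile_calls += 1
    schema = json.loads(schema_json)
    # Stand-in "compiled" artifact: the sorted property names.
    return tuple(sorted(schema.get("properties", {})))


def schema_key(schema: dict) -> str:
    """Serialize a schema deterministically so equal schemas share a cache entry."""
    return json.dumps(schema, sort_keys=True)


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
}

first = compile_schema(schema_key(schema))   # runs the compilation
second = compile_schema(schema_key(schema))  # served from the in-memory cache
```

Each process keeps its own cache, so nodes never contend for the same cache file; the trade-off is that every process compiles a given schema once for itself.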

rlouf avatar Mar 13 '25 15:03 rlouf

See #6601: the default has already been changed to xgrammar in the meantime, but one mention in the docs doesn't reflect it yet.

vincentzed avatar May 26 '25 00:05 vincentzed