fix xgrammar_backend crash with malformed inputs

Open gongwei-130 opened this issue 1 month ago • 2 comments

Motivation

Summary

Add two helper sanitizers to python/sglang/srt/constrained/xgrammar_backend.py that normalize both legacy structures arrays and modern format payloads by replacing missing or null schema / json_schema values with {}. Run the sanitizers before constructing StructuralTagItem or compiling the structural tag, so that malformed inputs no longer trigger JSON decode errors or xgrammar crashes.
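A minimal sketch of what such sanitizers might look like. The function names and the assumed shape of the modern format payload (nested nodes with "type": "json_schema") are illustrative assumptions, not the actual code in this PR.

```python
from typing import Any, Dict, List


def sanitize_legacy_structures(
    structures: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return a copy of a legacy `structures` array with null/missing schemas set to {}.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    cleaned = []
    for item in structures:
        item = dict(item)  # shallow copy so the caller's payload is not mutated
        if item.get("schema") is None:
            item["schema"] = {}
        cleaned.append(item)
    return cleaned


def sanitize_format_payload(node: Any) -> Any:
    """Recursively copy a modern `format` payload, defaulting null json_schema to {}.

    Assumes (not confirmed by this PR) that schema-carrying nodes look like
    {"type": "json_schema", "json_schema": ...}.
    """
    if isinstance(node, list):
        return [sanitize_format_payload(child) for child in node]
    if isinstance(node, dict):
        cleaned = {key: sanitize_format_payload(value) for key, value in node.items()}
        if cleaned.get("type") == "json_schema" and cleaned.get("json_schema") is None:
            cleaned["json_schema"] = {}
        return cleaned
    return node


if __name__ == "__main__":
    # The legacy payload from the repro request in this PR description.
    legacy = [{"begin": "<tool=example>", "schema": None, "end": ""}]
    print(sanitize_legacy_structures(legacy))
```

Both helpers return new objects rather than editing the request in place, so the original payload stays available for logging or error reporting.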

Repro steps:

Command to launch the server

python -m sglang.launch_server \
  --model-path moonshotai/Kimi-K2-Thinking \
  --revision 612681931a8c906ddb349f8ad0f582cb552189cd \
  --port 17345 \
  --host 0.0.0.0 \
  --tensor-parallel-size 8 \
  --trust-remote-code \
  --tool-call-parser kimi_k2 \
  --reasoning-parser kimi_k2 \
  --cuda-graph-max-bs 64 \
  --enable-metrics \
  --context-length 262144 \
  --model-loader-extra-config '{"enable_multithread_load": true}'

Query to trigger the crash

curl http://127.0.0.1:17345/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/Kimi-K2-Thinking",
    "stream": false,
    "messages": [
      {
        "role": "system",
        "content": "You are an expert content summarizer and must follow the exact Markdown format requested."
      },
      {
        "role": "user",
        "content": "Swipe One is a comprehensive AI-driven marketing automation platform ..."
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "copy",
          "description": "Copy text content to the clipboard",
          "parameters": {
            "type": "object",
            "properties": {
              "content": {
                "type": "string",
                "description": "Text to copy"
              }
            },
            "required": ["content"]
          }
        }
      }
    ],
    "response_format": {
      "type": "structural_tag",
      "structures": [
        {
          "begin": "<tool=example>",
          "schema": null,
          "end": ""
        }
      ],
      "triggers": ["<tool="]
    }
  }'

Call stack before the fix

[2025-11-19 22:30:29] INFO:     127.0.0.1:59892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[2025-11-19 22:30:29 TP4] Scheduler hit an exception: Traceback (most recent call last):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 2736, in run_scheduler_process
    scheduler.event_loop_overlap()
  File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1008, in event_loop_overlap
    batch = self.get_next_batch_to_run()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1709, in get_next_batch_to_run
    new_batch = self.get_new_batch_prefill()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1749, in get_new_batch_prefill
    self.move_ready_grammar_requests()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 2167, in move_ready_grammar_requests
    req.grammar = req.grammar.result(timeout=0.03)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sgl-workspace/sglang/python/sglang/srt/constrained/reasoner_grammar_backend.py", line 89, in _init_value_dispatch
    ret = self.grammar_backend._init_value_dispatch(key)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sgl-workspace/sglang/python/sglang/srt/constrained/base_grammar_backend.py", line 163, in _init_value_dispatch
    grammar = self.dispatch_structural_tag(key_string)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sgl-workspace/sglang/python/sglang/srt/constrained/xgrammar_backend.py", line 256, in dispatch_structural_tag
    ctx = self.grammar_compiler.compile_structural_tag(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/xgrammar/compiler.py", line 277, in compile_structural_tag
    structural_tag_str = _get_structural_tag_str_from_args(args, kwargs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/xgrammar/grammar.py", line 119, in _get_structural_tag_str_from_args
    return StructuralTag.from_legacy_structural_tag(args[0], args[1]).model_dump_json(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/xgrammar/structural_tag.py", line 297, in from_legacy_structural_tag
    content=JSONSchemaFormat(
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 2 validation errors for JSONSchemaFormat
json_schema.bool
  Input should be a valid boolean [type=bool_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.12/v/bool_type
json_schema.dict[str,any]
  Input should be a valid dictionary [type=dict_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.12/v/dict_type

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Nov 22 '25 01:11 gongwei-130

[!WARNING] You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Nov 22 '25 01:11 gemini-code-assist[bot]

@gongwei-130 Thanks for the contribution! Avoiding malformed structural tags is definitely useful.

By definition, a structural tag does not allow a null schema as input. Should we instead reject such requests as invalid structural tags?

Nov 22 '25 06:11 Ubospica
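For comparison, the rejection approach raised in the review could be sketched roughly as follows. This is only an illustration under stated assumptions: InvalidStructuralTagError and validate_legacy_structural_tag are hypothetical names, and the accepted types (boolean or dictionary) are taken from the pydantic validation error in the call stack above, not from the xgrammar source.

```python
from typing import Any, Dict


class InvalidStructuralTagError(ValueError):
    """Hypothetical error type for a structural tag that fails validation."""


def validate_legacy_structural_tag(response_format: Dict[str, Any]) -> None:
    """Raise InvalidStructuralTagError if any legacy structure has an unusable schema.

    Hypothetical helper: meant to run at request-parsing time, before the
    payload ever reaches the grammar compiler in the scheduler process.
    """
    structures = response_format.get("structures")
    if not isinstance(structures, list):
        raise InvalidStructuralTagError("`structures` must be a list")
    for index, item in enumerate(structures):
        schema = item.get("schema") if isinstance(item, dict) else None
        # The traceback shows json_schema accepting a bool or a dict; null is neither.
        if not isinstance(schema, (bool, dict)):
            raise InvalidStructuralTagError(
                f"structures[{index}].schema must be a JSON schema object or "
                f"boolean, got {type(schema).__name__}"
            )


if __name__ == "__main__":
    # The response_format from the repro request in this PR description.
    bad = {
        "type": "structural_tag",
        "structures": [{"begin": "<tool=example>", "schema": None, "end": ""}],
        "triggers": ["<tool="],
    }
    try:
        validate_legacy_structural_tag(bad)
    except InvalidStructuralTagError as exc:
        print(f"rejected: {exc}")
```

With a check like this, a malformed request would fail early with a client-facing error rather than surfacing as an exception inside the scheduler. Whether that is preferable to silently defaulting the schema to {} is the open question in this thread.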