sglang `RecursionError: maximum recursion depth exceeded while calling a Python object` when inferencing with long input

Hi, I ran across this issue during inference

Exception in ModelRpcClient:
Traceback (most recent call last):
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 168, in exposed_step
    self.forward_step()
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 195, in forward_step
    self.forward_decode_batch(self.running_batch)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 460, in forward_decode_batch
    self.handle_finished_requests(batch)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 528, in handle_finished_requests
    prefix_len = self.tree_cache.insert(
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 61, in insert
    return self._insert_helper(self.root_node, key, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 157, in _insert_helper
    return prefix_len + self._insert_helper(child, key, value)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 157, in _insert_helper
    return prefix_len + self._insert_helper(child, key, value)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 157, in _insert_helper
    return prefix_len + self._insert_helper(child, key, value)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 958 more times]
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 166, in _insert_helper
    new_node = TreeNode()
               ^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 12, in __init__
    self.children = defaultdict(TreeNode)
                    ^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded while calling a Python object

Would it be possible to implement this logic without recursion? @merrymercy

Feb 06 '24 18:02 Ja1Zhou

@Ja1Zhou Of course, this logic can be implemented without recursion.

I doubt that a single path in the radix tree would contain that many nodes; recursing nearly 1k times is very strange. Could you please check whether this is an infinite-recursion bug, or provide more information on how to reproduce it?
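
A minimal sketch of what a non-recursive insert could look like, assuming a simplified radix tree keyed by token IDs. The `Node` class and the `insert` and `common_prefix_len` functions are hypothetical stand-ins, not sglang's `TreeNode` or `_insert_helper`, and the cached values are left out:

```python
from typing import Dict, List, Sequence


class Node:
    """Simplified radix-tree node (hypothetical; not sglang's TreeNode)."""

    def __init__(self, key: Sequence[int] = ()):
        self.key: List[int] = list(key)
        self.children: Dict[int, "Node"] = {}


def common_prefix_len(node_key: Sequence[int], key: Sequence[int], start: int) -> int:
    """Length of the shared prefix of node_key and key[start:]."""
    limit = min(len(node_key), len(key) - start)
    n = 0
    while n < limit and node_key[n] == key[start + n]:
        n += 1
    return n


def insert(root: Node, key: Sequence[int]) -> int:
    """Insert key with a loop instead of recursion.

    Returns how many leading tokens were already present in the tree.
    """
    key = list(key)
    node, pos = root, 0
    while pos < len(key):
        child = node.children.get(key[pos])
        if child is None:
            # No edge starts with this token: attach the rest as a new leaf.
            node.children[key[pos]] = Node(key[pos:])
            break
        n = common_prefix_len(child.key, key, pos)
        if n < len(child.key):
            # Partial match: split the child so the shared part stays on top.
            tail = Node(child.key[n:])
            tail.children = child.children
            child.key = child.key[:n]
            child.children = {tail.key[0]: tail}
        node, pos = child, pos + n
    return pos
```

Because the walk keeps only the current node and an offset into the key, its depth is bounded by the key length rather than by Python's recursion limit.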

Feb 07 '24 16:02 hnyls2002

Hi. I am unable to reproduce the same error consistently 😭. In fact, I get one of three kinds of errors at random.

Exception in ModelRpcClient:
Traceback (most recent call last):
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 168, in exposed_step
    self.forward_step()
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 179, in forward_step
    new_batch = self.get_new_fill_batch()
                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 293, in get_new_fill_batch
    self.token_to_kv_pool.available_size() + self.tree_cache.evictable_size()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/memory_pool.py", line 92, in available_size
    return torch.sum(self.mem_state == 0).item()
                     ^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.


Exception in ModelRpcClient:
Traceback (most recent call last):
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 168, in exposed_step
    self.forward_step()
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 179, in forward_step
    new_batch = self.get_new_fill_batch()
                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 277, in get_new_fill_batch
    prefix_indices, last_node = self.tree_cache.match_prefix(req.input_ids)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 52, in match_prefix
    value = torch.concat(value)
            ^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.


Exception in ModelRpcClient:
Traceback (most recent call last):
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 168, in exposed_step
    self.forward_step()
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 195, in forward_step
    self.forward_decode_batch(self.running_batch)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/model_rpc.py", line 421, in forward_decode_batch
    if not batch.check_decode_mem():
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/infer_batch.py", line 284, in check_decode_mem
    self.tree_cache.evict(bs, self.token_to_kv_pool.free)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 74, in evict
    leaves = self._collect_leaves()
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 201, in _collect_leaves
    dfs_(self.root_node)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 199, in dfs_
    dfs_(x)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 199, in dfs_
    dfs_(x)
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 199, in dfs_
    dfs_(x)
  [Previous line repeated 959 more times]
  File "/User/jay/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/router/radix_cache.py", line 198, in dfs_
    for x in cur_node.children.values():
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded while calling a Python object

Also I am using a proprietary model and perhaps it would be hard to reproduce my error.

I would really appreciate it if there is any insight as to why these errors would appear!

Feb 09 '24 06:02 Ja1Zhou

@Ja1Zhou Any scripts to reproduce it would help us debug. Otherwise, it is very difficult to debug with only these error messages.

Some tips:

- Try disabling tensor parallelism.
- Use this function to print the tree: https://github.com/sgl-project/sglang/blob/c51020cf0c64498865538362aa34baaed13a3b50/python/sglang/srt/managers/router/radix_cache.py#L63 . Can you provide us with the status of the tree when you see the error message?
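
When inspecting the tree, a depth check may help tell a genuinely deep path apart from a cycle. This is a rough sketch that assumes only what the tracebacks show, namely that each node has a `children` mapping; the `TreeNode` below is a stand-in, and `collect_leaves` and `max_depth` are hypothetical helpers rather than sglang functions. Both use an explicit stack, so they do not hit the recursion limit themselves:

```python
from collections import defaultdict


class TreeNode:
    """Stand-in node with the `children` mapping seen in the traceback."""

    def __init__(self):
        self.children = defaultdict(TreeNode)


def collect_leaves(root):
    """Return all childless nodes using an explicit stack."""
    leaves, stack = [], [root]
    while stack:
        node = stack.pop()
        if not node.children:
            leaves.append(node)
        else:
            stack.extend(node.children.values())
    return leaves


def max_depth(root):
    """Return the longest root-to-leaf path length; raise if a node repeats."""
    deepest, seen, stack = 0, set(), [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if id(node) in seen:
            raise ValueError("node reached twice: the tree contains a cycle or shared node")
        seen.add(id(node))
        deepest = max(deepest, depth)
        stack.extend((child, depth + 1) for child in node.children.values())
    return deepest
```

A reported depth close to the number of tokens in the request would suggest a very fragmented path, while a `ValueError` would point to a structural bug.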

Feb 11 '24 14:02 merrymercy

I ran into the same error, but in a different place.

Exception in ModelRpcClient:
Traceback (most recent call last):
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/model_rpc.py", line 176, in exposed_step
    self.forward_step()
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/model_rpc.py", line 187, in forward_step
    new_batch = self.get_new_fill_batch()
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/model_rpc.py", line 285, in get_new_fill_batch
    prefix_indices, last_node = self.tree_cache.match_prefix(req.input_ids)
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/radix_cache.py", line 50, in match_prefix
    self._match_prefix_helper(self.root_node, key, value, last_node)
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/radix_cache.py", line 129, in _match_prefix_helper
    self._match_prefix_helper(child, key[prefix_len:], value, last_node)
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/radix_cache.py", line 129, in _match_prefix_helper
    self._match_prefix_helper(child, key[prefix_len:], value, last_node)
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/radix_cache.py", line 129, in _match_prefix_helper
    self._match_prefix_helper(child, key[prefix_len:], value, last_node)
  [Previous line repeated 979 more times]
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/radix_cache.py", line 120, in _match_prefix_helper
    prefix_len = match(c_key, key)
  File "/home/yangchunhao/miniconda3/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/router/radix_cache.py", line 24, in match
    for k, w in zip(key, seq):
RecursionError: maximum recursion depth exceeded while calling a Python object

My environment: 4090 * 2, SGLang 0.1.12, vLLM 0.3.1, Qwen1.5-14B

My script:

"""
Usage:
python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000
python json_decode.py
"""
from enum import Enum
from typing import List, Union

import sglang as sgl
from pydantic import BaseModel
from sglang.srt.constrained import build_regex_from_object

character_regex = r"""\[[\n ]*((\{[\n ]*"债券简称"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"债券代码"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"报价方向"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("bid"|"ofr"|"double"|"unknown")[\n ]*\}|null)[\n ]*,[\n ]*"bid价格"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"bid数量"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"bid价格类型"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("1\-净价"|"3\-收益率"|"4\-利差"|"5\-意向")[\n ]*\}|null)[\n ]*,[\n ]*"bid是否请示"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("是"|"否")[\n ]*\}|null)[\n ]*,[\n ]*"ofr价格"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"ofr数量"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"ofr价格类型"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("1\-净价"|"3\-收益率"|"4\-利差"|"5\-意向")[\n ]*\}|null)[\n ]*,[\n ]*"ofr是否请示"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("是"|"否")[\n ]*\}|null)[\n ]*,[\n ]*"交易偏好描述"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*\})(,[\n ]*(\{[\n ]*"债券简称"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"债券代码"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"报价方向"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("bid"|"ofr"|"double"|"unknown")[\n ]*\}|null)[\n ]*,[\n ]*"bid价格"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"bid数量"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"bid价格类型"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"choices"[\n 
]*:[\n ]*("1\-净价"|"3\-收益率"|"4\-利差"|"5\-意向")[\n ]*\}|null)[\n ]*,[\n ]*"bid是否请示"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("是"|"否")[\n ]*\}|null)[\n ]*,[\n ]*"ofr价格"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"ofr数量"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"ofr价格类型"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*"(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("1\-净价"|"3\-收益率"|"4\-利差"|"5\-意向")[\n ]*\}|null)[\n ]*,[\n ]*"ofr是否请示"[\n ]*:[\n ]*(\{[\n ]*"text"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*,[\n ]*"choices"[\n ]*:[\n ]*("是"|"否")[\n ]*\}|null)[\n ]*,[\n ]*"交易偏好描述"[\n ]*:[\n ]*("(?:[^"\\\x00-\x1f\x7f-\x9f]|\\.)*"|null)[\n ]*\})){0,})?[\n ]*\]"""

def driver_character_gen():
    state = character_gen.run(name="Hermione Granger")
    print(state.text())

class DirectionChoices(str, Enum):
    bid = "bid"
    ofr = "ofr"
    double = "double"
    unknown = "unknown"

class PriceChoices(str, Enum):
    one = "1-净价",
    three = "3-收益率",
    four = "4-利差",
    five = "5-意向",

class PriceType(BaseModel):
    text: str
    choices: PriceChoices

class TradeDirection(BaseModel):
    text: str
    choices: DirectionChoices

class RequestChoices(str, Enum):
    yes = "是",
    no = "否"

class RequestType(BaseModel):
    text: Union[str, None]
    choices: RequestChoices

class TradeFormat(BaseModel):
    债券简称: str
    债券代码: str
    报价方向: Union[TradeDirection, None]
    bid价格: Union[str, None]
    bid数量: Union[str, None]
    bid价格类型: Union[PriceType, None]
    bid是否请示: Union[RequestType, None]
    ofr价格: Union[str, None]
    ofr数量: Union[str, None]
    ofr价格类型: Union[PriceType, None]
    ofr是否请示: Union[RequestType, None]
    交易偏好描述: Union[str, None]

class TradeList(BaseModel):
    报价信息: List[TradeFormat]

@sgl.function
def pydantic_wizard_gen(s, question):
    ins = '你的任务是将一段有关债券交易的文本转换为特定的json格式。每行为一条债券交易记录，从中提取"债券简称", "债券代码", "报价方向", "bid价格", "bid数量", "bid价格类型", "bid是否请示", "ofr价格", "ofr数量", "ofr价格类型", "ofr是否请示", "交易偏好描述"并以json格式输出。下面是输入的文本：\n'
    s += ins + question
    s += sgl.gen(
        "json_output",
        max_tokens=4200,
        temperature=0,
        regex=character_regex,  # Requires pydantic >= 2.0
    )


def driver_pydantic_wizard_gen(question):
    state = pydantic_wizard_gen.run(question)
    print(state.text())


sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
input_str = '\n成都银行ofr：\n1Y+2Y    178760.SH    21悦来02    --/Ofr*    --/5000    AA+    估值:2.9058|3.6472    \n1.21Y+2Y    102101718    21新都香城MTN001    --/Ofr*    --/5000    AA/AA+    有担保    估值:3.2628|3.9477    \n2.6Y+2Y    114663.SH    23广控01    --/Ofr*    --/10000    AA+    估值:3.8265|4.5575    \n1.8Y(休1)    166500.SH    20九联01    --/Ofr*    --/20000    AA+    估值:3.639    \n2.17Y+2Y    102281770    22江津华信MTN001    --/Ofr*    --/3000    AA+    估值:3.5448|4.1318    \n2.21Y+2Y(休1)    182542.SH    22高新02    --/Ofr*    --/11000    AA+    估值:3.9224|4.7603    \n2.49Y+2Y    102282652    22九联投资MTN001    --/Ofr*    --/2000    AA+    估值:3.6805|4.2197    \n1.52Y+2Y+1Y    102282745    22乐山国资MTN001    --/Ofr*    --/4000    AA+    估值:3.1601|4.0744    \n1.8Y    032000313    20涪陵新城PPN001    --/Ofr*    --/20000    AA    估值:3.9839    \n2.22Y+2Y    182525.SH    22科建02    --/Ofr*    --/1000    AA+    有担保    估值:3.5236|4.3653    \n1.76Y+2Y    194068.SH    22三江01    --/Ofr*    --/5000    AA+    估值:3.4745|4.3257    \n2.39Y+2Y    114029.SH    22长经04    --/Ofr*    --/4000    AA+    估值:3.3985|3.9355    \n1.46Y+2Y    102103111    21重庆临空MTN002    --/Ofr*    --/3000    AA+    估值:3.1301|3.7854    \n1.51Y+3Y    2080402    20金牛环绿债01    --/Ofr*    --/3000    AA+    估值:2.8779|3.3195    \n2.07Y+2Y    194887.SH    22长经02    --/Ofr*    --/3000    AA+    估值:3.285|3.8521    \n1.59Y+2Y    102280064    22空港城发MTN001    --/Ofr*    --/10000    AA+    估值:3.1887|3.8335    \n2.59Y+2Y    102380056    23兴泸MTN001    --/Ofr*    --/1000    AA+    估值:3.231|3.5799    \n2.5Y+2Y(休1)    102282693    22金牛环境MTN002    --/Ofr*    --/10000    AA+    估值:3.2157|3.5678    \n1.6Y+1Y(休1)    102380111    23乐山国资MTN001    --/Ofr*    --/10000    AA+    估值:3.1961|3.5469    \n3.19Y+2Y(休2)    2180325    21空港债01    --/Ofr*    --/3000    AA+    估值:3.5979|3.9162    \n3Y    178754.SH    21悦来01    --/Ofr*    --/7000    AA+    估值:3.649    \n1.73Y+2Y    102280426    22香城投资MTN002    --/Ofr*    --/2000    AA+    估值:3.2022|3.8349    \n2.58Y+2Y(休1)    032380023    23湖北科投PPN001    --/Ofr*    --/12000    AAA    估值:3.4609|3.9769    \n1.76Y(休2)    194100.SH    22通经01    --/Ofr*    --/4000    AAA    估值:3.1615    \n2.03Y+2Y    032280566    22武侯资本PPN002    --/Ofr*    --/5000    AA+    估值:3.268|3.841    \n1.57Y+2Y    032191442    21西盛投资PPN001    --/Ofr*    --/5000    AA+    估值:3.8466|4.6768    \n2.08Y+2Y    032280626    22渝隆资产PPN001    --/Ofr*    --/6000    AA+    估值:3.5403|4.1044    \n1.91Y+2Y    102281079    22成华棚改MTN001    --/Ofr*    --/3000    AA+    估值:3.054|3.4763    \n1.43Y(休2)    032101056    21南京浦口PPN003    --/Ofr*    --/2000    AA+    估值:3.3397    \n\n'
driver_pydantic_wizard_gen(input_str)
# driver_pydantic_wizard_gen(input_str)

The error can be reproduced when driver_pydantic_wizard_gen() is run twice.

Mar 03 '24 09:03 DouHappy

@Ja1Zhou Replacing the code in python/sglang/launch_server.py with the following seems to work for me. My env: sglang 0.1.12, torch 2.1.2+cu121, Docker image nvcr.io/nvidia/pytorch/23.10-py3

import argparse
import sys

from sglang.srt.server import ServerArgs, launch_server

if __name__ == "__main__":
    sys.setrecursionlimit(8000)
    parser = argparse.ArgumentParser()
    ServerArgs.add_cli_args(parser)
    args = parser.parse_args()
    server_args = ServerArgs.from_cli_args(args)
    launch_server(server_args, None)
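
For context on this workaround: `sys.setrecursionlimit` only raises the interpreter's cap on nested calls, and setting it very high can trade a `RecursionError` for a hard crash if the C stack overflows. The small sketch below uses hypothetical helper names (`recursion_limit`, `depth`) to show the effect and to restore the previous limit afterward:

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit: int):
    """Temporarily set the recursion limit, restoring the old value on exit."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        sys.setrecursionlimit(old)


def depth(n: int) -> int:
    """Recurse n levels deep and return n (illustration only)."""
    return 0 if n == 0 else 1 + depth(n - 1)


if __name__ == "__main__":
    with recursion_limit(8000):
        print(depth(3000))  # succeeds because the cap is above 3000
```

This hides the symptom rather than removing the deep recursion, so a long enough input could still exceed whatever limit is chosen.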

Mar 22 '24 04:03 DouHappy

Hey, I am having the same error. How do I relaunch the local launch_server.py after changing it as @DouHappy mentioned?

Apr 10 '24 01:04 hayleyhu

This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.

Jul 25 '24 06:07 github-actions[bot]

I get this with the latest sglang:

object address  : 0x150df025af80
object refcount : 4
object type     : 0x151167c6a320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr
Error in sys.excepthook:
object address  : 0x15290c652da0
object refcount : 1
object type     : 0x152d74054320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr

Original exception was:
object address  : 0x152c28c30160
object refcount : 3
object type     : 0x152d74054320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr
Error in sys.excepthook:
object address  : 0x149b78692e60
object refcount : 1
object type     : 0x149fc5f6d320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr

Original exception was:
object address  : 0x149e7acac160
object refcount : 3
object type     : 0x149fc5f6d320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr
Error in sys.excepthook:
object address  : 0x15383ed22d40
object refcount : 1
object type     : 0x153cb2787320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr

Original exception was:
object address  : 0x153b67344160
object refcount : 3
object type     : 0x153cb2787320
object type name: RecursionError
object repr     : RecursionError('maximum recursion depth exceeded')
lost sys.stderr
2024-09-26 12:29:20 | ERROR | stderr | Traceback (most recent call last):
2024-09-26 12:29:20 | ERROR | stderr |   File "/p/project/ccstao/cstao05/FastChat/fastchat/serve/sglang_worker.py", line 290, in <module>
2024-09-26 12:29:20 | ERROR | stderr |     runtime = sgl.Runtime(
2024-09-26 12:29:20 | ERROR | stderr |               ^^^^^^^^^^^^
2024-09-26 12:29:20 | ERROR | stderr |   File "/p/project1/ccstao/cstao05/FastChat/sc_venv_jureca/venv/lib/python3.11/site-packages/sglang/api.py", line 40, in Runtime
2024-09-26 12:29:20 | ERROR | stderr |     return Runtime(*args, **kwargs)
2024-09-26 12:29:20 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-26 12:29:20 | ERROR | stderr |   File "/p/project1/ccstao/cstao05/FastChat/sc_venv_jureca/venv/lib/python3.11/site-packages/sglang/srt/server.py", line 553, in __init__
2024-09-26 12:29:20 | ERROR | stderr |     raise RuntimeError(
2024-09-26 12:29:20 | ERROR | stderr | RuntimeError: Initialization failed. Please see the error messages above.
[rank3]:[W926 12:29:21.712860341 CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[rank1]:[W926 12:29:21.712912191 CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[rank2]:[W926 12:29:21.965858408 CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
/p/software/jurecadc/stages/2024/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
srun: error: jrc0911: task 0: Exited with exit code 1

This happens with all models I tested: Mistral-8x22, Phi-3.5 and Mistral-Mamba (the last two do not work with FastChat's model_worker or vllm worker, so I tried sglang).

Sep 26 '24 10:09 surak

if __name__ == "__main__":
    sys.setrecursionlimit(8000)
    parser = argparse.ArgumentParser()
    ServerArgs.add_cli_args(parser)
    args = parser.parse_args()
    server_args = ServerArgs.from_cli_args(args)
    launch_server(server_args, None)

I tried this, and I still get the same recursion error, even if I set the limit to 100.
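
One possible reason the change has no effect, offered as an assumption rather than a diagnosis: the limit set under `if __name__ == "__main__":` applies only to the launching interpreter. A freshly started Python process begins with its own default limit, so any worker that runs as a new interpreter would not see the raised value. The sketch below demonstrates this with a plain subprocess; the helper `child_recursion_limit` is hypothetical and is not part of sglang:

```python
import subprocess
import sys


def child_recursion_limit() -> int:
    """Start a fresh interpreter and report the recursion limit it sees."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.getrecursionlimit())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    sys.setrecursionlimit(8000)
    print("parent:", sys.getrecursionlimit())
    print("child: ", child_recursion_limit())  # not affected by the parent's setting
```

If that is what is happening here, the limit would need to be set inside the worker process itself rather than in the launcher.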

Jan 10 '25 17:01 surak
