DeepEP

Question about the "pybind11::gil_scoped_release release" modification

Open leehongming opened this issue 8 months ago • 2 comments

Hi, a few days ago "pybind11::gil_scoped_release release;" was added to Buffer::internode_dispatch, and you commented on that change. As I understand it, the change is meant to improve the performance of the inference system: other threads, such as the KV cache thread, are then blocked only for a moment rather than forever. Is it possible that, without the GIL release, a dispatch timeout could occur because of a deadlock? Have you ever encountered this kind of problem?
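To make the scenario I am asking about concrete, here is a toy Python sketch. It is not DeepEP code: `toy_gil` is a hypothetical stand-in for the GIL (an ordinary lock), the `helper` thread stands in for something like a KV cache thread, and the timed wait stands in for the blocking dispatch call. It assumes the blocked call can only finish once the helper has run.

```python
import threading

# Hypothetical stand-in for the Python GIL: a plain lock, not the real thing.
toy_gil = threading.Lock()

def helper_progresses(release_gil: bool, timeout_s: float = 0.5) -> bool:
    """Return True if the helper thread ran while the 'dispatch' was blocked."""
    helper_done = threading.Event()

    def helper():
        # The helper needs the toy GIL before it can do any work.
        with toy_gil:
            helper_done.set()

    toy_gil.acquire()                 # caller holds the toy GIL on entry
    worker = threading.Thread(target=helper)
    worker.start()

    if release_gil:
        toy_gil.release()             # analogous to constructing gil_scoped_release

    # Blocking wait standing in for the dispatch call; it depends on the helper.
    progressed = helper_done.wait(timeout=timeout_s)

    if release_gil:
        toy_gil.acquire()             # analogous to re-acquiring on scope exit
    toy_gil.release()                 # caller returns and drops the toy GIL
    worker.join()
    return progressed

if __name__ == "__main__":
    print("with release:   ", helper_progresses(True))
    print("without release:", helper_progresses(False))
```

In this model the wait times out only when the lock is held across the blocking call, which is the kind of deadlock-induced timeout I have in mind.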

leehongming, May 13 '25 07:05

Is it possible that, without the GIL release, a dispatch timeout could occur because of a deadlock?

As far as the GIL is concerned, a timeout only occurs when some ranks have launched their communication kernels while other ranks have not. I guess it really depends on how your framework is implemented, so I cannot give a definite answer.
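A toy model of that condition, assuming a collective operation can be approximated by a `threading.Barrier` with a timeout. The `simulate` function and the rank sets are hypothetical, and threads stand in for ranks; this is not how DeepEP itself is implemented.

```python
import threading

def simulate(num_ranks: int, launched: set, timeout_s: float = 0.2) -> dict:
    """Model a collective: every rank must arrive, or the waiting ranks time out."""
    barrier = threading.Barrier(num_ranks)
    results = {rank: "not launched" for rank in range(num_ranks)}

    def rank_main(rank: int) -> None:
        try:
            # Stand-in for a communication kernel waiting on all peers.
            barrier.wait(timeout=timeout_s)
            results[rank] = "ok"
        except threading.BrokenBarrierError:
            # A peer never arrived, so the barrier broke after the timeout.
            results[rank] = "timeout"

    # Only the ranks in `launched` actually start their "kernel".
    threads = [threading.Thread(target=rank_main, args=(rank,))
               for rank in sorted(launched)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(simulate(4, {0, 1, 2, 3}))  # every rank launches
    print(simulate(4, {0, 1, 2}))     # rank 3 never launches
```

When every rank launches, all of them complete; when one rank is held back (for example because its launching thread is stuck waiting for a lock), the ranks that did launch are the ones that report a timeout.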

Have you ever encountered this kind of problem?

No, not internally at DeepSeek.

This PR was introduced by the SGLang team. @fzyzcjy, do you have anything to add?

LyricZhao, May 14 '25 09:05

I have nothing more to add.

fzyzcjy, May 14 '25 10:05