Question about the "pybind11::gil_scoped_release release;" change
Hi, a few days ago "pybind11::gil_scoped_release release;" was added to Buffer::internode_dispatch, and you left some comments on the change. As I understand it, the change is meant as a performance optimization for the inference system, since other threads (such as the KV cache thread) would otherwise be stalled only briefly, not forever. Is it possible that, without the GIL release, a dispatch timeout could occur because of a deadlock? Have you ever encountered this kind of problem?
Is it possible that, without the GIL release, a dispatch timeout could occur because of a deadlock?
As far as the GIL is concerned, a timeout can only occur when some ranks have launched their communication kernels while others have not. I guess it really depends on how your framework is implemented, so I can't give a definite answer.
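To make the scenario concrete, here is a toy model in plain Python. It is only a sketch under stated assumptions and is not DeepEP or SGLang code: a threading.Lock stands in for the GIL, a threading.Event stands in for "the peer rank has launched its comm kernel", and the names run_dispatch and helper are hypothetical. It assumes a framework in which some Python-side thread must run before the peer can launch. In that case, blocking in dispatch while holding the GIL times out, and releasing it first does not.

```python
import threading


def run_dispatch(release_gil: bool, timeout_s: float = 0.2) -> bool:
    """Return True if the simulated dispatch completes, False if it times out."""
    gil = threading.Lock()              # stand-in for the Python GIL (assumption)
    peer_launched = threading.Event()   # stand-in for "peer rank launched its kernel"

    def helper() -> None:
        # Hypothetical Python-side thread whose work lets the peer proceed.
        # It needs the "GIL" before it can do anything.
        with gil:
            peer_launched.set()

    gil.acquire()                       # main thread holds the "GIL" on entry
    worker = threading.Thread(target=helper)
    worker.start()

    if release_gil:
        # Analogous to constructing a gil_scoped_release before blocking.
        gil.release()
        completed = peer_launched.wait(timeout_s)
        gil.acquire()
    else:
        # Block while still holding the "GIL": the helper can never run.
        completed = peer_launched.wait(timeout_s)

    gil.release()
    worker.join()
    return completed


if __name__ == "__main__":
    print("with release:   ", run_dispatch(release_gil=True))
    print("without release:", run_dispatch(release_gil=False))
```

Whether a real deployment can hit the second case depends on whether any rank's launch actually waits on a Python thread in a process that is blocked inside dispatch, which is framework-specific.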
Have you ever encountered this kind of problem?
No, not internally at DeepSeek.
This PR was introduced by the SGLang team. @fzyzcjy, do you have anything to add?
I have nothing more to add.