DeepSpeed-MII
How to eliminate the deadlock problem?
In my case, a large number of concurrent requests is needed. I found a parameter in DSStateManagerConfig, max_context, that is currently unused. Can I modify this parameter to eliminate the deadlock?
Hi @BaiStone2017, thank you for your question.
Unfortunately, max_context won't help solve the issue. You can limit the number of sequences that the inference engine maintains by setting another parameter, max_ragged_sequence_count. This will reduce the risk of deadlock, but performance may decline significantly.
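For anyone looking for a starting point, here is a minimal sketch of how such an override might be built. The helper build_engine_config and the value 64 are hypothetical and only for illustration. The nested key names mirror the DSStateManagerConfig field mentioned in this reply, and passing the result through an inference_engine_config argument is an assumption about recent MII releases, so please check both against the version you have installed.

```python
# Hypothetical helper (not part of MII or DeepSpeed): builds a nested
# override dict that caps the number of sequences per ragged batch.
def build_engine_config(max_sequences: int) -> dict:
    if max_sequences < 1:
        raise ValueError("max_sequences must be at least 1")
    # Assumption: the state manager settings live under "state_manager",
    # and the field name matches DSStateManagerConfig.max_ragged_sequence_count.
    return {"state_manager": {"max_ragged_sequence_count": max_sequences}}


# 64 is an arbitrary illustrative value; tune it for your workload.
engine_config = build_engine_config(64)

# Assumption: the override is handed to MII roughly like this.
# import mii
# client = mii.serve("<model name>", inference_engine_config=engine_config)
```

A lower cap should make the deadlock less likely at the cost of throughput, so it is worth trying a few values under realistic load rather than picking the smallest one.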