Random Fly
Could you explain the benefit of doing so? It seems that with this change, the scheduler can no longer make decisions based on the number of sequences within a `SequenceGroup`.
This modification makes the "fork" mechanism of vLLM completely unused. Previously, for a request with n > 1, its prompt was prefilled only once, and then the sequence was "forked"...