sglang
Extract generation_manager from tokenizer_manager
Motivation
A large portion of TokenizerManager (or Orchestrator, as it is renamed in #3116) deals with generation, and that generation logic is fairly isolated from the rest of the class. This PR therefore extracts it into a separate generation_manager.
Moreover, this refactor should make adding SPMD logic a bit easier and cleaner.
Modifications
Checklist
- [ ] Format your code according to the Code Formatting with Pre-Commit.
- [ ] Add unit tests as outlined in the Running Unit Tests.
- [ ] Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling.
All these PRs are waiting for CI. (P.S. I have run test_fragment and the torchrun script outside CI, and they work.)
Thanks for the elaboration and efforts in addressing my questions. This PR LGTM now.