lightllm
feat: Support decode chunk PD serving mode
- Add a new arg `pd_chunk_size` to decide the chunk size. 0 means no chunking.
- Support decode chunk
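To make the `0 means no chunking` rule concrete, here is a minimal sketch, not the code from this PR. It assumes the arg is exposed as an integer CLI flag named `--pd_chunk_size` with default 0, and that a chunk size caps how many tokens are decoded per chunk. `build_parser` and `split_decode_budget` are hypothetical names used only for illustration.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the parser in lightllm/server/api_cli.py.
    # The flag spelling, type, and default are assumptions.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--pd_chunk_size",
        type=int,
        default=0,
        help="decode chunk size in tokens; 0 means no chunking",
    )
    return parser


def split_decode_budget(max_new_tokens: int, pd_chunk_size: int) -> List[int]:
    """Split a decode token budget into per-chunk sizes (illustrative only)."""
    if max_new_tokens < 0 or pd_chunk_size < 0:
        raise ValueError("max_new_tokens and pd_chunk_size must be non-negative")
    if max_new_tokens == 0:
        return []
    # 0 disables chunking: the whole budget is decoded as a single chunk.
    if pd_chunk_size == 0 or pd_chunk_size >= max_new_tokens:
        return [max_new_tokens]
    full_chunks, remainder = divmod(max_new_tokens, pd_chunk_size)
    sizes = [pd_chunk_size] * full_chunks
    if remainder:
        # The last chunk carries whatever is left over.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    args = build_parser().parse_args(["--pd_chunk_size", "4"])
    print(split_decode_budget(10, args.pd_chunk_size))
```

Treating 0 as the default keeps existing deployments on the unchunked path unless the flag is set explicitly.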
I will remove the unnecessary files and changes later, once the code review looks good.
The core changes are in the following two files:
- `lightllm/server/api_cli.py`
- `lightllm/server/httpserver_for_pd_master/manager.py`