# Launch multiple API servers for dp > 1
## Motivation
Support launching multiple API servers when dp > 1.
TODO:
- [x] Update after https://github.com/InternLM/lmdeploy/pull/3523
## Usage
Example for two nodes with `ep=16` and `dp=16` (16 ranks split across 2 nodes, i.e. 8 ranks per node).
### Step 1: Launch the proxy server on the master node

```shell
lmdeploy serve proxy --server-port 23333 --server-name 172.16.4.52
```
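Before starting the API servers, the proxy can be sanity-checked. A minimal probe, assuming the proxy exposes the standard OpenAI-compatible `/v1/models` route (the model list stays empty until the API servers register):

```shell
# Assumed OpenAI-compatible route on the proxy; an empty model list here
# simply means no api server has registered yet.
curl http://172.16.4.52:23333/v1/models
```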
### Step 2: Launch API servers on the master node

```shell
LMDEPLOY_DP_MASTER_ADDR=172.16.4.52 \
LMDEPLOY_DP_MASTER_PORT=29555 \
lmdeploy serve api_server \
    deepseek-ai/DeepSeek-V3 \
    --backend pytorch \
    --ep 16 \
    --dp 16 \
    --proxy-url http://172.16.4.52:23333 \
    --nnodes 2 \
    --node-rank 0
```
### Step 3: Launch API servers on the slave node

```shell
LMDEPLOY_DP_MASTER_ADDR=172.16.4.52 \
LMDEPLOY_DP_MASTER_PORT=29555 \
lmdeploy serve api_server \
    deepseek-ai/DeepSeek-V3 \
    --backend pytorch \
    --ep 16 \
    --dp 16 \
    --proxy-url http://172.16.4.52:23333 \
    --nnodes 2 \
    --node-rank 1
```
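Each API server registers itself with the proxy via `--proxy-url`. To confirm that servers from both nodes have come up, the proxy's node-status route can be queried; `/nodes/status` is an assumption about the proxy's management API and may differ across versions:

```shell
# Assumed management endpoint; expect one entry per registered api server.
curl http://172.16.4.52:23333/nodes/status
```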
### Step 4: Query the proxy server

```shell
curl http://172.16.4.52:23333/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "deepseek-ai/DeepSeek-V3",
        "messages": [{"role": "user", "content": "Hello! How are you?"}]
    }'
```
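Since the endpoint is OpenAI-compatible, streaming works the same way by adding the standard `stream` field; a minimal sketch:

```shell
# Same request, but tokens are returned incrementally as server-sent events.
curl http://172.16.4.52:23333/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "deepseek-ai/DeepSeek-V3",
        "messages": [{"role": "user", "content": "Hello! How are you?"}],
        "stream": true
    }'
```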
## Modification
Please briefly describe what modification is made in this PR.
## BC-breaking (Optional)
None
## Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
## Checklist
- [ ] Pre-commit or other linting tools are used to fix the potential lint issues.
- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
- [ ] If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
- [ ] The documentation has been modified accordingly, like docstring or example tutorials.