SwanLab
[ADVICE] Does SwanLab support visualizing benchmark results from the vLLM and SGLang frameworks?
We are currently testing the performance of large language models, running benchmarks at different concurrency levels and QPS values. A typical result looks like this:
============ Serving Benchmark Result ============
Backend: sglang
Traffic request rate: inf
Max reqeuest concurrency: 96
Successful requests: 512
Benchmark duration (s): 97.04
Total input tokens: 262397
Total generated tokens: 269938
Total generated tokens (retokenized): 269931
Request throughput (req/s): 5.28
Input token throughput (tok/s): 2703.94
Output token throughput (tok/s): 2781.65
Total token throughput (tok/s): 5485.58
Concurrency: 86.09
----------------End-to-End Latency----------------
Mean E2E Latency (ms): 16318.12
Median E2E Latency (ms): 16774.36
---------------Time to First Token----------------
Mean TTFT (ms): 283.11
Median TTFT (ms): 88.81
P99 TTFT (ms): 1838.68
---------------Inter-Token Latency----------------
Mean ITL (ms): 30.47
Median ITL (ms): 26.36
P95 ITL (ms): 67.61
P99 ITL (ms): 103.11
Max ITL (ms): 1660.56
==================================================
The benchmark was launched with the following command:
python3 -m sglang.bench_serving \
--backend sglang \
--num-prompts 512 \
--random-input-len 1024 \
--random-output-len 1024 \
--dataset-name random \
--max-concurrency 32 \
--dataset-path /work/data/ShareGPT_V3_unfiltered_cleaned_split.json \
--seed 42 \
--host 127.0.0.1 \
--port 30033
How can these results be visualized on a SwanLab page?
Hey, we will look into how SGLang benchmark results can be visualized, and we will reply in this issue if there are any updates.
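In the meantime, one possible workaround is to parse the summary text yourself and log the numbers as ordinary scalar metrics. The following is a minimal sketch, not an official integration. It assumes the summary has been captured as a string (for example from redirected stdout), that each metric line has the form `name: number` as in the output above, and that the concurrency level is used as the `step` so that charts show each metric against concurrency. The function names, the project name `sglang-bench`, and the experiment name are hypothetical.

```python
import re

# A summary line looks like "Mean TTFT (ms): 283.11".
# Lines with non-numeric values ("Backend: sglang", "Traffic request rate: inf")
# and separator lines do not match and are skipped.
METRIC_LINE = re.compile(r"^([^:]+):\s+(-?\d+(?:\.\d+)?)$")


def sanitize_key(name: str) -> str:
    """Turn 'Mean TTFT (ms)' into 'mean_ttft_ms' (a conservative key format)."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_").lower()


def parse_bench_summary(text: str) -> dict:
    """Extract numeric 'name: value' pairs from a bench_serving summary."""
    metrics = {}
    for line in text.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match:
            metrics[sanitize_key(match.group(1))] = float(match.group(2))
    return metrics


def log_runs(summaries: dict, project: str = "sglang-bench") -> None:
    """Log one summary per concurrency level; `summaries` maps int -> text."""
    import swanlab  # imported lazily so the parser works without SwanLab

    swanlab.init(project=project, experiment_name="bench_serving-sweep")
    for concurrency in sorted(summaries):
        # Assumption: the concurrency level doubles as the chart step.
        swanlab.log(parse_bench_summary(summaries[concurrency]), step=concurrency)
    swanlab.finish()
```

Calling `log_runs({32: text_a, 96: text_b})` with two captured summaries would then produce one chart per metric, such as `mean_ttft_ms` or `output_token_throughput_tok_s`, with a point for each concurrency level. Logging each benchmark as a separate experiment instead would also work if you prefer to compare runs side by side.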