Jiaxin Shan
### 🚀 Feature Description and Motivation Currently, many controller configuration values (e.g., resources, concurrency limits, resync periods, queue lengths) are set based on heuristics or "magic numbers" without empirical justification....
### 🐛 Describe the bug ### Steps to Reproduce 8 vLLM instances, one 960 GB Infinistore instance ### Expected behavior It should work as expected ### Environment -...
### 🐛 Describe the bug  ``` .PHONY: docker-build-all docker-build-all: make -j $(nproc) docker-build-controller-manager docker-build-gateway-plugins docker-build-runtime docker-build-metadata-service docker-build-kvcache-watcher ## Build all docker images ``` I cleaned up everything and rebuilt...
### 🚀 Feature Description and Motivation We used a two-step deployment in the past, and `apply` cannot be used because the CRD contents are too long. ``` # Install nightly...
### 🐛 Describe the bug  ``` ERROR: failed to solve: failed commit on ref "layer-sha256:209998ede32af8f5bcd2d0b9d1d2ca17a41dcce7a0a9e2437aec0e8f557b323d": "layer-sha256:209998ede32af8f5bcd2d0b9d1d2ca17a41dcce7a0a9e2437aec0e8f557b323d" failed size validation: 1014 != 251: failed precondition ``` ### Steps to Reproduce...
### 🚀 Feature Description and Motivation This test should focus more on using the OpenAI Python SDK to test different model endpoints and check for compatible responses. If any gateway changes break...
### 🚀 Feature Description and Motivation We've seen many issues related to the benchmark: - https://github.com/vllm-project/aibrix/issues/1040 - https://github.com/vllm-project/aibrix/issues/1029 - https://github.com/vllm-project/aibrix/issues/1028 Can we integrate the code into the CI system to make sure...
### 🐛 Describe the bug We have an Infinistore patch version that accepts gid; it was just released on TestPyPI. I ran into an issue installing it ...
### 🐛 Describe the bug ``` ./benchmark.sh all + CONFIG_FILE=config/base.sh + [[ -f config/base.sh ]] + echo '[INFO] Loading configuration from config/base.sh' [INFO] Loading configuration from config/base.sh + source config/base.sh...
### Proposal to improve performance I tried to reproduce the P&D 1P1D benchmark to compare its performance with chunked prefill: https://github.com/vllm-project/vllm/blob/main/benchmarks/disagg_benchmarks/disagg_performance_benchmark.sh. TTFL is higher than I expected. Because the overhead...