Simon Mo
Any error message or repro?
Hi @cmaureir, I would like to ask about the current total usage of the vLLM packages and whether we can increase the project limit of 10GB. We have made quite a bit of progress...
Simon - [ ] Simon enable docker network in a fork using docker `campnet_clipper`
I made a pass. I think once this PR adds unit tests for both the Triton and PagedAttention kernels, it should be good to go. You might also need to...
I have tested the PR locally as well.
Hi @dtrifiro, here's the problem I ran into when releasing v0.6.2 yesterday:
* The commit was https://github.com/vllm-project/vllm/commit/7193774b1ff8603ad5bf4598e5efba0d9a39b436, which is tagged with v0.6.2.
* However, the buildkite job that is supposed...
Hmmm this build still produced a dev version. https://buildkite.com/vllm/release/builds/1373#01928cb1-0918-4cf1-862e-f708c516b203
and my local build using `DOCKER_BUILDKIT=1 docker build . --network=host --target vllm-openai --tag vllm/vllm-openai --build-arg max_jobs=32 --build-arg RUN_WHEEL_CHECK=false --build-arg USE_SCCACHE=1 --build-arg SCCACHE_S3_NO_CREDENTIALS=1` produced `v0.6.4.dev....`
Indeed the wheel is published manually by me.
I believe we should download the model each time. @robertgshaw2-neuralmagic mentioned that putting them on NFS is a bit tricky because it might hit the rate limit.