Sam Stoelinga

223 comments by Sam Stoelinga

Storing models on a PVC is now supported with vLLM. Please update your Helm chart to v0.10.0 or later to try it out. Support for other engines may follow later. Keeping this...

I updated the existing test and renamed it so that it now runs correctly.

It's still relevant. @muyangyuapple encountered the same issue on pathways. But feel free to discard this and make a separate PR. tl;dr we should always true or false instead of...

I am getting the following error:
```
ERROR 09-28 19:27:59 async_llm_engine.py:61] RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16BF, lda, b, CUDA_R_16BF,...
```

Here is an example pod metrics scraping config that you can use with Google Managed Prometheus:
```
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: vllm-pods
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: vllm
  endpoints:
  -...
```

We should include the following Grafana dashboard: https://github.com/vllm-project/vllm/tree/main/examples/production_monitoring

@kelvin-zou @hanzhi713 would appreciate your review to make sure this PR roughly matches 405B. Thank you!

Getting this error:
```
NotFoundError: The specified path gs://axlearn-public/tensorflow_datasets/tokenizers/sentencepiece/bpe_128k_c4.model was not found.
```
Am I doing something wrong, or is there a missing tokenizer?
```
gsutil ls -r -l...
```

Fixed the issue after the vocab model was uploaded. Now I'm hitting OOM issues. Here is the model config:
```
max_step: 3932160
mesh_axis_names[0]: 'pipeline'
mesh_axis_names[1]: 'data'
mesh_axis_names[2]: 'expert'
mesh_axis_names[3]: 'fsdp'
mesh_axis_names[4]:...
```