Cyrus Leung

137 comments by Cyrus Leung

Please also check whether this is consistent with the behaviour of the official OpenAI API.

> Hey, could you point me to the relevant tests?

You can check the logs of the failing tests in CI.

> Which behavior would you like to check?

Whether...

> > Whether `openai.types.CompletionUsage.completion_tokens == openai.types.CompletionUsage.prompt_tokens` in general, which would be the case if BOS token is not added by default.
>
> Not sure I fully understand. Why should...

Have you tried using HTTPS instead of HTTP according to the error message?

```diff
- openai_api_base = "http://192.168.2.6:8000/v1"
+ openai_api_base = "https://192.168.2.6:8000/v1"
```
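If the base URL comes from configuration rather than a literal, the same scheme change can be sketched in Python with the standard library. The helper name `with_https` is hypothetical, and this only helps if the server at that address is actually serving TLS, which is an assumption here.

```python
from urllib.parse import urlsplit, urlunsplit


def with_https(url: str) -> str:
    """Return `url` with an `http` scheme replaced by `https` (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # SplitResult is a namedtuple, so _replace returns a modified copy.
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


openai_api_base = with_https("http://192.168.2.6:8000/v1")
print(openai_api_base)
```

URLs that already use `https` (or any other scheme) pass through unchanged.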

Assuming that you're accessing a private/gated model via your HuggingFace account, you can set `HF_TOKEN` or `HF_TOKEN_PATH` (using the token associated with your account) as described [here](https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables).
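As a rough illustration of how those two variables interact, here is a minimal sketch of the lookup order described in the linked page: `HF_TOKEN` wins if set, otherwise the token file at `HF_TOKEN_PATH` is read. The function `resolve_hf_token` is hypothetical and is not part of `huggingface_hub`; the fallback path is simplified to `~/.cache/huggingface/token` (or `$HF_HOME/token`) and ignores other cache-location settings.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_hf_token() -> Optional[str]:
    """Hypothetical helper mirroring the documented precedence of HF_TOKEN over the token file."""
    # 1. An explicit HF_TOKEN takes priority over any token stored on disk.
    token = os.environ.get("HF_TOKEN")
    if token:
        return token.strip()

    # 2. Otherwise read the token file. Assumption: the default location is
    #    $HF_HOME/token, with HF_HOME falling back to ~/.cache/huggingface.
    hf_home = os.environ.get(
        "HF_HOME", os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    )
    token_path = Path(os.environ.get("HF_TOKEN_PATH", os.path.join(hf_home, "token")))
    if token_path.is_file():
        return token_path.read_text().strip() or None
    return None
```

In practice you would normally just export `HF_TOKEN` in the shell before launching the process and let the library pick it up; the sketch is only meant to make the precedence explicit.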

I was just triaging the issues. I'm not that involved with the use of Ray in vLLM so I won't be of much assistance here.

We have added [documentation](https://docs.vllm.ai/en/stable/getting_started/debugging.html) for this situation in #5430. Please take a look.

> Thank you for the great work and I left a few comments! I've also reviewed your other PR #4910, so let's get that merged first then come back to...

Before merging this PR, imo we should complete #4328 first as it simplifies the API of passing multi-modal data. This would give me an opportunity to streamline the example in...

I have merged #4328 into this PR in advance, so you might see additional diffs. These diffs should disappear once #4328 is merged into `main`.