[bug]: Unable to Connect to vLLM Model from RAGFlow Interface Despite LAN Accessibility
Self Checks
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (Language Policy).
- [x] Non-English title submissions will be closed directly ( 非英文标题的提交将会被直接关闭 ) (Language Policy).
- [x] Please do not modify this template :) and fill in all the required fields.
Describe your problem
I can access the vLLM model deployed on another local server (within the same LAN) from inside the RAGFlow container. However, I am unable to add it successfully via the interface: "Fail to access model (Qwen2.5-32B-Instruct-AWQ). ERROR: Connection error."
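For reference, a check like the following succeeds from inside the RAGFlow container (the IP and port are placeholders for the vLLM server on the LAN):

```python
import requests

# Reachability check run from inside the RAGFlow container.
# The IP and port below are placeholders for the vLLM server on the LAN.
base_url = "http://192.168.1.145:8000/v1"

resp = requests.get(f"{base_url}/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])  # lists the served model IDs
```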
Weird! It doesn't make sense!
I have the same problem. In the WSL instance where I launch RAGFlow, I can reach the vLLM model via http://localhost:8100 (for example). However, when I try to add a vLLM model in the RAGFlow UI, I get the same error message. Do you have a solution yet?
I haven't solved it yet.
Same problem here. It worked in v0.17.0 but stopped working after upgrading.
I found the root of the problem:
When using vLLM, it calls the chat method under OpenAI_APIChat, which inherits from the Base class. The chat method uses the OpenAI client to call the vLLM server's interface. However, the final request that is sent is missing the /v1 path. Whether or not I append /v1 to the base_url, it still doesn't work, and I couldn't figure out why. The relevant initialization is:

```python
# Excerpt from the Base class __init__ that OpenAI_APIChat inherits:
# the base_url is handed to the OpenAI client as-is.
timeout = int(os.environ.get('LM_TIMEOUT_SECONDS', 600))
self.client = OpenAI(api_key=key, base_url=base_url, timeout=timeout)
self.model_name = model_name
```
As a workaround, I wrote my own method to construct a URL request that includes /v1, and it worked. However, I don’t think this is an ideal solution.
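Roughly, the workaround looked like the sketch below. The helper name and details are illustrative rather than the exact code I used; it simply posts straight to vLLM's OpenAI-compatible endpoint and forces the /v1 prefix:

```python
import requests

def chat_direct(base_url, key, model_name, messages, **gen_conf):
    """Hypothetical workaround: call the vLLM server's OpenAI-compatible
    /chat/completions endpoint directly, making sure /v1 is present."""
    url = base_url.rstrip("/")
    if not url.endswith("/v1"):
        url += "/v1"
    headers = {"Content-Type": "application/json"}
    if key:  # only send Authorization when a key is actually configured
        headers["Authorization"] = f"Bearer {key}"
    payload = {"model": model_name, "messages": messages, **gen_conf}
    resp = requests.post(f"{url}/chat/completions",
                         json=payload, headers=headers, timeout=600)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```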
Same problem when I use 0.17.1.
Not only over the LAN, but also on the same machine: since 0.17.0 I can't add any vLLM model. The error is the same: Failed to access <model> model (<model>). @ccivm please change this issue to a bug.
Why would the /v1 be dropped anyway? I don't understand.
I apologize for the confusion earlier. I hadn't set the log level to DEBUG, and upon a cursory inspection, I thought the issue was with the path. Actually, if /v1 is appended when adding the base_url, there would be no problem with the request path. The real issue lies with the api_key. The log message indicates the following:
2025-03-13 14:31:26,992 DEBUG 22 0 retries left
2025-03-13 14:31:26,996 INFO 22 Retrying request to /chat/completions in 1.873265 seconds
2025-03-13 14:31:28,872 DEBUG 22 Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'user', 'content': 'Hello! How are you doing!'}], 'model': 'Qwen/Qwen2.5-VL-7B-Instruct', 'temperature': 0.9}}
2025-03-13 14:31:28,875 DEBUG 22 Sending HTTP Request: POST http://192.168.1.145:8000/v1/chat/completions
2025-03-13 14:31:28,876 DEBUG 22 connect_tcp.started host='192.168.1.145' port=8000 local_address=None timeout=600 socket_options=None
2025-03-13 14:31:28,881 DEBUG 22 connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f7451c06fe0>
2025-03-13 14:31:28,882 DEBUG 22 send_request_headers.started request=<Request [b'POST']>
2025-03-13 14:31:28,883 DEBUG 22 send_request_headers.failed exception=LocalProtocolError(LocalProtocolError("Illegal header value b'Bearer '"))
Therefore, the failure was due to an empty api_key being passed, which leads to the "Illegal header value b'Bearer '" error. I deleted all my modified code and went back to the original source. After entering 'no-key-need' as the api_key, I was able to add the model successfully. So, in the current version, an api_key is still required even for local models, and the blank placeholder in the UI might mislead users into leaving it empty.
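For illustration, this standalone snippet has the shape of the request that now succeeds for me (the base_url and model name match my logs above; the key can be any non-empty placeholder as long as the vLLM server was not started with a key requirement):

```python
from openai import OpenAI

# An empty api_key makes the client send "Authorization: Bearer " (empty value),
# which the HTTP layer rejects with "Illegal header value b'Bearer '".
# Any non-empty placeholder avoids that, and vLLM ignores it unless the
# server was configured to require a key.
client = OpenAI(api_key="no-key-need",
                base_url="http://192.168.1.145:8000/v1",
                timeout=600)

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[{"role": "user", "content": "Hello! How are you doing!"}],
    temperature=0.9,
)
print(resp.choices[0].message.content)
```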
Thank you so much!
Hi @sylaryrf
I am using the latest full version, v0.17.2. The model and the vLLM API have been tested with the Open WebUI portal and work well for me. As you suggested, I entered an API key like "no-key-need", but I can still reproduce the issue. Any idea?
There has been so much discussion about vLLM connections in RAGFlow. Is there any solution for this issue?
I encounter the same issue in v0.21.1.
Try entering a placeholder API key of any value, even an invalid one.
Which field on the settings page do you mean, the API-Key field circled in red below? I have entered "no-api-need" in the API-Key field, but it still fails to connect to the LLM served by vLLM.
@Bob123Yang I currently don’t have the hardware required to run vLLM. Regarding the API-key field: if you are deploying a local model, you should leave it blank.
I found a strange thing: if you deploy RAGFlow and vLLM on the same machine, you can't use localhost or 127.0.0.1 in the base URL, presumably because RAGFlow runs inside a Docker container and localhost there refers to the container itself. Use the server's IP directly instead, and everything works.
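A quick way to see the difference from inside the RAGFlow container (the LAN IP and port below are placeholders for your own setup):

```python
import requests

# 127.0.0.1 inside the RAGFlow container points at the container itself,
# so it usually fails; the host's LAN IP reaches the vLLM server.
candidates = [
    "http://127.0.0.1:8100/v1",      # container loopback, not the host
    "http://192.168.1.145:8100/v1",  # placeholder: the host's LAN IP
]

for base_url in candidates:
    try:
        resp = requests.get(f"{base_url}/models", timeout=5)
        print(base_url, "->", resp.status_code)
    except requests.RequestException as exc:
        print(base_url, "->", type(exc).__name__)
```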
Since there has been no further activity for over three weeks, we will proceed to close this issue. If the problem persists or you have additional questions, please feel free to reopen the issue or create a new one. We’re happy to assist anytime.