[Question]: Trying to run a local reranker using xinference
Describe your problem
When trying to configure a local rerank model via Xinference, I get the following error.
Testing connectivity and the model's response with curl from inside the ragflow-server container works with the same config:
(Replace minicpm-reranker with the actual model_uid if it differs.)
# curl -X POST http://host.docker.internal:9997/v1/rerank \
-H "Content-Type: application/json" \
-d '{
"model": "minicpm-reranker",
"query": "A man is eating pasta.",
"documents": [
"A man is eating food.",
"A man is eating a piece of bread.",
"The girl is carrying a baby.",
"A woman is playing violin."
]
}'
{"id":"2b0ee576-871a-11ef-82ed-66ecc185683a","results":[{"index":3,"relevance_score":1.0,"document":null},{"index":0,"relevance_score":1.0,"document":null},{"index":2,"relevance_score":0.9999998807907104,"document":null},{"index":1,"relevance_score":0.03186270222067833,"document":null}],"meta":{"api_version":null,"billed_units":null,"tokens":null,"warnings":null}}#
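For reference, the response above lists results by rank, with each entry's index pointing back to the position of the document in the request. A minimal Python sketch of reading such a response follows; the helper name scores_in_input_order is hypothetical, and the sample body is a trimmed copy of the response above (the id and meta fields are omitted).

```python
import json

# Trimmed copy of the rerank response shown above (id and meta omitted).
RESPONSE_TEXT = (
    '{"results":['
    '{"index":3,"relevance_score":1.0,"document":null},'
    '{"index":0,"relevance_score":1.0,"document":null},'
    '{"index":2,"relevance_score":0.9999998807907104,"document":null},'
    '{"index":1,"relevance_score":0.03186270222067833,"document":null}]}'
)


def scores_in_input_order(response_text, n_docs):
    """Return one relevance score per input document, in request order."""
    payload = json.loads(response_text)
    scores = [0.0] * n_docs
    # Each result carries the index of the document it refers to.
    for item in payload["results"]:
        scores[item["index"]] = item["relevance_score"]
    return scores


if __name__ == "__main__":
    for position, score in enumerate(scores_in_input_order(RESPONSE_TEXT, 4)):
        print(position, score)
```

Note that the body does contain a top-level results key, which is the key named in the error message.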
How should I configure the local rerank model via Xinference? What are the correct parameters? I am using the latest release branch (not dev).
I tried several base URLs:
http://host.docker.internal:9997/v1/rerank
http://host.docker.internal:9997/v1/
http://host.docker.internal:9997/
I get the same "failed to access .results" error for all three, as shown in the screenshot.
I also tried other rerank models, such as bge-reranker-base.
What about this: http://127.17.0.1:9997/v1/rerank
What is 127.17.0.1?
I get a connection error with that suggested setting:
hint : 102
Fail to access model(bge-reranker-base).HTTPConnectionPool(host='127.17.0.1', port=9997): Max retries exceeded with url: /v1/rerank/v1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fff7b3b1250>: Failed to establish a new connection: [Errno 111] Connection refused'))
Can be verified from within the ragflow server container with curl:
----> with host.docker.internal:9997 ----> OK
curl -X POST http://host.docker.internal:9997/v1/rerank -H "Content-Type: application/json" -d '{"model": "bge-reranker-base", "query": "A man is eating pasta.", "documents": ["A man is eating food.","A man is eating a piece of bread.","The girl is carrying a baby.","A woman is playing violin."]}'
{"id":"008c2136-877b-11ef-9bde-66ecc185683a","results":[{"index":0,"relevance_score":0.9999247789382935,"document":null},{"index":1,"relevance_score":0.256493479013443,"document":null},{"index":2,"relevance_score":0.00003742107219295576,"document":null},{"index":3,"relevance_score":0.000037397916457848623,"document":null}],"meta":{"api_version":null,"billed_units":null,"tokens":null,"warnings":null}}#
#
---> with 127.17.0.1:9997 ----> Couldn't connect to server
# curl -X POST http://127.17.0.1:9997/v1/rerank -H "Content-Type: application/json" -d '{"model": "bge-reranker-base", "query": "A man is eating pasta.", "documents": ["A man is eating food.","A man is eating a piece of bread.","The girl is carrying a baby.","A woman is playing violin."]}'
curl: (7) Failed to connect to 127.17.0.1 port 9997 after 5 ms: Couldn't connect to server
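The failing request in the error above went to /v1/rerank/v1, which suggests that a path suffix was appended to the base URL exactly as entered. One way a client could accept all of the base URL forms tried earlier is to normalize them to a single endpoint before sending the request. The following is a minimal Python sketch under that assumption; normalize_rerank_url is a hypothetical helper and is not taken from the RAGFlow code.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_rerank_url(base_url):
    """Map a user-entered base URL to a single .../v1/rerank endpoint.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(base_url.strip())
    path = parts.path.rstrip("/")
    if path.endswith("/v1/rerank"):
        pass  # already the full endpoint
    elif path.endswith("/v1"):
        path += "/rerank"
    else:
        path += "/v1/rerank"
    # Drop any query string or fragment; keep scheme and host:port.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    for url in (
        "http://host.docker.internal:9997/v1/rerank",
        "http://host.docker.internal:9997/v1/",
        "http://host.docker.internal:9997/",
    ):
        print(url, "->", normalize_rerank_url(url))
```

With this kind of normalization, a base URL that already ends in /v1/rerank is left unchanged rather than extended a second time.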
This issue has been resolved in #2758 and the fix will be released soon. You can use either http://<IP address>:9997/v1/rerank or http://<IP address>:9997 as the base URL.
If the rerank model is deployed on WSL, you can get the WSL IP address with the hostname -I command. If the rerank model is deployed in Docker, use the internal Docker IP address.
Thanks for your response, but it doesn't work with the IP address either. I get the same error:
hint : 102
Fail to access model(bge-reranker-base).'results'
although the same request works from within the container:
root@843dde0b1a68:/ragflow# curl -X POST http://192.168.65.254:9997/v1/rerank -H "Content-Type: application/json" -d '{"model": "bge-reranker-base", "query": "A man is eating pasta.", "documents": ["A man is eating food.","A man is eating a piece of bread.","The girl is carrying a baby.","A woman is playing violin."]}'
{"id":"b1a8049a-878e-11ef-8832-66ecc185683a","results":[{"index":0,"relevance_score":0.9999247789382935,"document":null},{"index":1,"relevance_score":0.256493479013443,"document":null},{"index":2,"relevance_score":0.00003742107219295576,"document":null},{"index":3,"relevance_score":0.000037397916457848623,"document":null}],"meta":{"api_version":null,"billed_units":null,"tokens":null,"warnings":null}}
root@843dde0b1a68:/ragflow#
I am running Xinference on the macOS host and RAGFlow with Docker Compose.
I don't believe the issue is the IP address, as both the IP address and host.docker.internal are giving the same error.
It is also not a connection error: if you enter a random, unreachable IP address, you get a connection error, not Fail to access model(bge-reranker-base).'results'
Just FYI, I also tried all combinations with and without rerank, with and without v1, with and without / at the end
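The bare 'results' in the hint is what Python produces when a KeyError for the key results is converted to a string, which fits the observation that the server was reached but the reply was not handled as expected. The sketch below reproduces that message shape. It assumes the hint is built by appending the exception text to a fixed prefix and that the client received a JSON body without a results key. Both are assumptions for illustration, and the names rerank_scores and hint_for and the sample error body are hypothetical, not RAGFlow's code.

```python
def rerank_scores(payload):
    """Read scores with direct indexing, which raises KeyError if a key is absent."""
    return [item["relevance_score"] for item in payload["results"]]


def hint_for(model, payload):
    """Build a hint string by appending the exception text (assumed format)."""
    try:
        rerank_scores(payload)
        return "ok"
    except Exception as exc:
        # str(KeyError("results")) is "'results'", quotes included.
        return f"Fail to access model({model}).{exc}"


if __name__ == "__main__":
    good = {"results": [{"index": 0, "relevance_score": 0.99, "document": None}]}
    bad = {"detail": "Not Found"}  # hypothetical error body without "results"
    print(hint_for("bge-reranker-base", good))
    print(hint_for("bge-reranker-base", bad))
```

A connection failure would instead surface the connection exception's own text, as in the HTTPConnectionPool message seen earlier with 127.17.0.1.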
You can check PR #2758 to see the updates made to the code file rag/lm/rank_model.py.
I applied the changes in that PR and get the exact same error.
I believe the issue is with the handling of results, not the HTTP address.
I will try the latest dev image to check whether the error is there as well...
I got it. I deployed the Ragflow development environment locally and tested the bge-reranker-base model successfully. If you still encounter problems, please feel free to ask at any time.
Thank you for your effort...
Same error with the dev docker image...
hint : 102
Fail to access model(bge-reranker-base).'results'
Actually, I just realised that the rag/lm/rank_model.py file in the ragflow-server container does not contain the PR changes. I am surprised, as I pulled the latest dev image. Wasn't it released? How can I apply that change to my setup?
What about pulling the images again?
It is working now. I didn't realise that the image was not being pulled properly because of a host disk size issue.
I ran into the same problem. I use Xinference to deploy the bce-rerank-v1 model, and checking the service with the following command works:
curl -X 'POST' \
'http://192.168.32.243:9997/v1/rerank' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "bce-reranker-base_v1",
"query": "A man is eating pasta.",
"documents": [
"A man is eating food.",
"A man is eating a piece of bread.",
"The girl is carrying a baby.",
"A man is riding a horse.",
"A woman is playing violin."
]
}'
However, the rerank setting in RAGFlow does not work, as shown below:
What is the response of the curl?
{"id":"096a45aa-d18e-11ef-9601-3cecefaed1da","results":[{"index":0,"relevance_score":0.5649465918540955,"document":null},{"index":1,"relevance_score":0.5300139784812927,"document":null},{"index":3,"relevance_score":0.4056761860847473,"document":null},{"index":4,"relevance_score":0.3438847064971924,"document":null},{"index":2,"relevance_score":0.3433420658111572,"document":null}],"meta":{"api_version":null,"billed_units":null,"tokens":null,"warnings":null}}%
So, there is a key named results.
Is this 192.168.32.243 accessible in the ragflow container?
Yes, the chat model and embedding model are available.
Following this issue: I see the same error with the latest version.