[Bug]: castorini/monot5-large-msmarco - Fail to access model
Is there an existing issue for the same bug?
- [x] I have checked the existing issues.
RAGFlow workspace code commit ID
8bb65d1ab5422ac76c539540f0e1118e8b1860a0
RAGFlow image version
v0.16.0 slim
Other environment information
Microsoft WSL 2
Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble
Actual behavior
I run inference on the castorini/monot5-large-msmarco model with TGI (TEI does not work with T5, though it works well for BAAI/bge-reranker-v2-m3), and it works:
(HF_TGI) dromeuf@MAIA:~$ sudo docker run --gpus all -p 33437:80 ghcr.io/huggingface/text-generation-inference:latest --model-id castorini/monot5-large-msmarco
dromeuf@MAIA:~$ curl -X POST http://localhost:33437/v1/completions -H "Content-Type: application/json" -d '{"prompt": "texte à reranker", "max_tokens": 50}'
{"object":"text_completion","id":"","created":1740666060,"model":"castorini/monot5-large-msmarco","system_fingerprint":"3.1.1-dev0-sha-5eec3a8","choices":[{"index":0,"text":"I read something false a hundred times.","logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":8,"completion_tokens":10,"total_tokens":18}}
But when I try to add a reranker in the Web UI -> OpenAI-API-Compatible with the settings
rerank, castorini/monot5-large-msmarco, http://host.docker.internal:33437, "", 512
I get an ERROR:
hint : 102
Fail to access model(castorini/monot5-large-msmarco___OpenAI-API).Expecting value: line 1 column 1 (char 0)
2025-02-27 18:33:32,724 ERROR 588
Fail to access model(castorini/monot5-large-msmarco___OpenAI-API).Expecting value: line 1 column 1 (char 0)
NoneType: None
2025-02-27 18:33:32,725 INFO 588 172.18.0.3 - - [27/Feb/2025 18:33:32] "POST /v1/llm/add_llm HTTP/1.1" 200 -
2025-02-27 18:34:00,793 INFO 36 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-02-27T18:34:00.793+01:00", "boot_at": "2025-02-27T18:10:34.425+01:00", "pending": 0, "lag": 0, "done": 0, "failed": 0, "current": null}
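For what it's worth, "Expecting value: line 1 column 1 (char 0)" is the message of Python's `json.JSONDecodeError`: RAGFlow expects a JSON rerank response, but the TGI endpoint replies with a body that isn't JSON (e.g. empty, or an error page from a route TGI doesn't serve). A minimal sketch of how that exact message arises:

```python
import json

# RAGFlow parses the rerank response as JSON; an empty or non-JSON
# body (e.g. from a missing /rerank route) raises this exact error.
try:
    json.loads("")
except json.JSONDecodeError as e:
    print(e)  # Expecting value: line 1 column 1 (char 0)
```

So the hint 102 error here most likely means the endpoint answered, but not with the JSON shape RAGFlow's rerank client expects.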
Expected behavior
No response
Steps to reproduce
all
Additional information
No response
castorini/monot5-large-msmarco is not a re-rank model.
Unless I'm mistaken, Kevin, https://huggingface.co/castorini/monot5-large-msmarco says:
This model is a T5-large reranker fine-tuned on the MS MARCO passage dataset for 100k steps (or 10 epochs).
Do you mean it's not a good reranking model, or that it's not a reranking model at all?
(HF_TGI) dromeuf@MAIA:~$ sudo docker run --gpus all -p 33437:80 ghcr.io/huggingface/text-generation-inference:latest --model-id castorini/monot5-large-msmarco
dromeuf@MAIA:~$ curl -X POST http://localhost:33437/v1/completions -H "Content-Type: application/json" -d '{"prompt": "texte à reranker", "max_tokens": 50}'
{"object":"text_completion","id":"","created":1740666060,"model":"castorini/monot5-large-msmarco","system_fingerprint":"3.1.1-dev0-sha-5eec3a8","choices":[{"index":0,"text":"I read something false a hundred times.","logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":8,"completion_tokens":10,"total_tokens":18}}
From these, it's not a rerank model at all.
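For context (my understanding, not confirmed in this thread): monoT5 is a seq2seq reranker, not a cross-encoder. It is prompted with "Query: … Document: … Relevant:" and the relevance score is the softmax probability of the "true" token versus the "false" token at the first decoded position, which is why neither TEI's /rerank endpoint (built for sequence-classification cross-encoders) nor a plain TGI /v1/completions call yields usable scores. A sketch of just that scoring arithmetic, with hypothetical logit values:

```python
import math

def monot5_relevance(true_logit: float, false_logit: float) -> float:
    """monoT5-style score: softmax over the logits of the 'true' and
    'false' tokens at the first decoder position. The logit values
    here are hypothetical; a real run needs the T5 model itself."""
    m = max(true_logit, false_logit)  # subtract max for numerical stability
    exp_true = math.exp(true_logit - m)
    exp_false = math.exp(false_logit - m)
    return exp_true / (exp_true + exp_false)

# Hypothetical logits for two candidate passages:
print(monot5_relevance(4.2, -1.3))  # high relevance, close to 1
print(monot5_relevance(-2.0, 3.0))  # low relevance, close to 0
```

So the model card is right that it is a reranker, but it needs this prompt-and-score wrapper; it cannot be dropped into an endpoint that expects a score directly.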
ok so don't use it. Thanks.
Kevin, can you recommend the best LOCAL (not cloud) reranker to run with RAGFlow?
Kind regards,
Oh, excuse me Kevin, I was concentrating on my docker command line and not on the curl test, which is obviously not suitable for testing a reranker. I used the wrong curl test line.
With this command, TEI serves the bge reranker well, and a test curl returns proper scores:
(HF_TGI) dromeuf@MAIA:~$ sudo docker run --gpus all -p 33435:80 -v /home/dromeuf/.cache/huggingface:/data ghcr.io/huggingface/text-embeddings-inference:latest --model-id BAAI/bge-reranker-v2-m3
dromeuf@MAIA:~$ curl http://localhost:33435/rerank \
-X POST \
-d '{"query": "Effet Doppler et exoplanètes ?", "texts": ["Doppler et étoiles à neutrons", "Méthode Doppler pour les exoplanètes", "Histoire de l'\''effet Doppler", "Transit et Doppler en astronomie", "Applications médicales de l'\''effet Doppler"]}' \
-H 'Content-Type: application/json'
[{"index":1,"score":0.9920312},{"index":2,"score":0.22388572},{"index":3,"score":0.21900326},{"index":0,"score":0.091058284},{"index":4,"score":0.012970387}]
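TEI's /rerank response is a list of (index, score) pairs sorted by descending score; to recover the ranked passages, a client maps the indexes back to the original texts. A small sketch using the scores above:

```python
# TEI /rerank returns (index, score) pairs; map them back to the texts.
response = [
    {"index": 1, "score": 0.9920312},
    {"index": 2, "score": 0.22388572},
    {"index": 3, "score": 0.21900326},
    {"index": 0, "score": 0.091058284},
    {"index": 4, "score": 0.012970387},
]
texts = [
    "Doppler et étoiles à neutrons",
    "Méthode Doppler pour les exoplanètes",
    "Histoire de l'effet Doppler",
    "Transit et Doppler en astronomie",
    "Applications médicales de l'effet Doppler",
]
# Sort defensively in case the server ever returns pairs unordered.
ranked = [texts[r["index"]] for r in sorted(response, key=lambda r: -r["score"])]
print(ranked[0])  # → Méthode Doppler pour les exoplanètes
```

This is the JSON shape RAGFlow's rerank client can consume, which monot5 served via TGI's completion endpoint never produces.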
And TEI returns an error for castorini/monot5-large-msmarco. When I try it in OpenWebUI, there is no error when I configure it as the document reranker. If I use TGI (not TEI), castorini/monot5-large-msmarco downloads, but rerank requests return empty. Here is the TEI error:
(HF_TGI) dromeuf@MAIA:~$ sudo docker run --gpus all -p 33435:80 ghcr.io/huggingface/text-embeddings-inference:latest --model-id castorini/monot5-large-msmarco
2025-03-01T16:41:10.297134Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "cas******/******-*****-****rco", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "bf4d103a90d8", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2025-03-01T16:41:10.297254Z INFO hf_hub: /root/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2025-03-01T16:41:10.367215Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-03-01T16:41:10.367240Z INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
2025-03-01T16:41:10.591346Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/castorini/monot5-large-msmarco/resolve/main/1_Pooling/config.json)
2025-03-01T16:41:11.809644Z INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-03-01T16:41:12.847881Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:36: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/castorini/monot5-large-msmarco/resolve/main/config_sentence_transformers.json)
2025-03-01T16:41:12.847907Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-03-01T16:41:13.300426Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
Error: Could not download model artifacts
Caused by:
0: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/castorini/monot5-large-msmarco/resolve/main/tokenizer.json)
1: HTTP status client error (404 Not Found) for url (https://huggingface.co/castorini/monot5-large-msmarco/resolve/main/tokenizer.json)
I'll tell castorini!
Kind regards, David.
About re-rank models in general, I'm not optimistic about their performance gains, while they also slow down searching.