[Question]: FileNotFoundError: [Errno 2] No such file or directory: '/ragflow/rag/res/deepdoc/ocr.res' and '/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res'

Open hanxl1 opened this issue 1 year ago • 11 comments

Describe your problem

It keeps reporting that these two files do not exist. What are these files?

hanxl1 avatar Apr 16 '24 06:04 hanxl1

I keep getting this error too. I am on an M1 Mac.

[WARNING] Load term.freq FAIL!
Traceback (most recent call last):
  File "/ragflow/deepdoc/vision/ocr.py", line 486, in __init__
    self.text_detector = TextDetector(model_dir)
  File "/ragflow/deepdoc/vision/ocr.py", line 381, in __init__
    self.predictor, self.input_tensor = load_model(model_dir, 'det')
  File "/ragflow/deepdoc/vision/ocr.py", line 65, in load_model
    raise ValueError("not find model file path {}".format(
ValueError: not find model file path /ragflow/rag/res/deepdoc/det.onnx

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 44, in <module>
    from rag.app import laws, paper, presentation, manual, qa, table, book, resume, picture, naive, one
  File "/ragflow/rag/app/picture.py", line 23, in <module>
    ocr = OCR()
  File "/ragflow/deepdoc/vision/ocr.py", line 491, in __init__
    self.text_recognizer = TextRecognizer(model_dir)
  File "/ragflow/deepdoc/vision/ocr.py", line 95, in __init__
    self.postprocess_op = build_post_process(postprocess_params)
  File "/ragflow/deepdoc/vision/postprocess.py", line 20, in build_post_process
    module_class = eval(module_name)(**config)
  File "/ragflow/deepdoc/vision/postprocess.py", line 335, in __init__
    super(CTCLabelDecode, self).__init__(character_dict_path,
  File "/ragflow/deepdoc/vision/postprocess.py", line 258, in __init__
    with open(character_dict_path, "rb") as fin:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res'

I don't know how to fix this.

FelixLeeeeee avatar Apr 16 '24 06:04 FelixLeeeeee

Manually copy the file into the container: docker cp ocr.res ragflow-server:/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res

limate avatar Apr 16 '24 08:04 limate

ValueError: not find model file path /root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/layout.onnx

Now it reports this error instead. Where can I find this file?

hanxl1 avatar Apr 16 '24 08:04 hanxl1

Now it is reporting this error:

[INFO] [2024-04-16 16:59:18,207] [_internal._log] [line:96]: 172.19.0.6 - - [16/Apr/2024 16:59:18] "GET /v1/document/list?kb_id=13894c00fb0d11ee8d070242ac190006&page=1&page_size=10 HTTP/1.1" 200 -
Traceback (most recent call last):
  File "/ragflow/deepdoc/vision/layout_recognizer.py", line 44, in __init__
    super().__init__(self.labels, domain, model_dir)
  File "/ragflow/deepdoc/vision/recognizer.py", line 50, in __init__
    raise ValueError("not find model file path {}".format(
ValueError: not find model file path /ragflow/rag/res/deepdoc/layout.onnx

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 130, in build
    cks = chunker.chunk(row["name"], binary=binary, from_page=row["from_page"],
  File "/ragflow/rag/app/naive.py", line 128, in chunk
    pdf_parser = Pdf(
  File "/ragflow/deepdoc/parser/pdf_parser.py", line 32, in __init__
    self.layouter = LayoutRecognizer("layout")
  File "/ragflow/deepdoc/vision/layout_recognizer.py", line 47, in __init__
    super().__init__(self.labels, domain, model_dir)
  File "/ragflow/deepdoc/vision/recognizer.py", line 50, in __init__
    raise ValueError("not find model file path {}".format(
ValueError: not find model file path /root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/layout.onnx

Where can I find this file?

FelixLeeeeee avatar Apr 16 '24 09:04 FelixLeeeeee

Maybe this will help: https://github.com/infiniflow/ragflow/issues/323#issuecomment-2050800223

ooooo-create avatar Apr 16 '24 09:04 ooooo-create

I am also using a Mac.

This happens because the base Docker image "ragflow-base:v1.0" ships an old Hugging Face cache of the deepdoc model (snapshot be0c1e50eef6047b412d1800aa89aba4d275f997). It is also missing the other model, 'text_concat_xgb_v1.0'.

Until the base image is updated, you can work around this in docker/entrypoint.sh (make sure the file is executable) by adding the following commands at the top:

export PATH=/root/miniconda3/envs/py11/bin/:/root/miniconda3/bin:/root/miniconda3/condabin:$PATH
huggingface-cli download InfiniFlow/deepdoc
huggingface-cli download InfiniFlow/text_concat_xgb_v1.0
rm -rf /root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/

Then update docker-compose.yml so that the modified entrypoint.sh is mounted into the container:

--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
     ports:
       - ${SVR_HTTP_PORT}:9380
       - 80:80
       - 443:443
     volumes:
+      - ./entrypoint.sh:/ragflow/entrypoint.sh
       - ./service_conf.yaml:/ragflow/conf/service_conf.yaml
       - ./ragflow-logs:/ragflow/logs
       - ./nginx/ragflow.conf:/etc/nginx/conf.d/ragflow.conf

This should help clear the old cache and fetch the two models.

The same logic can also be used to update the Dockerfile to build your own docker image.
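
To check whether the stale snapshot is still present after a restart, something like the following sketch may help. It assumes the usual Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<revision>`), which is the layout the paths in the tracebacks above follow. The function names `hf_repo_cache_dir` and `list_snapshots` and the `CACHE_ROOT` variable are made up for illustration; they are not part of ragflow or huggingface-cli.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not part of ragflow or huggingface-cli).

# Cache root; the default matches the path shown in the error messages.
CACHE_ROOT="${CACHE_ROOT:-/root/.cache/huggingface/hub}"

# Map a repo id such as "InfiniFlow/deepdoc" to its cache directory,
# i.e. <CACHE_ROOT>/models--InfiniFlow--deepdoc
hf_repo_cache_dir() {
    repo_id="$1"
    printf '%s/models--%s\n' "$CACHE_ROOT" "$(printf '%s' "$repo_id" | sed 's|/|--|g')"
}

# Print the snapshot revisions cached for a repo, one per line.
# Returns non-zero if the repo has no snapshots directory at all.
list_snapshots() {
    snap_dir="$(hf_repo_cache_dir "$1")/snapshots"
    [ -d "$snap_dir" ] || return 1
    for d in "$snap_dir"/*/; do
        [ -d "$d" ] && basename "$d"
    done
    return 0
}
```

Run inside the container, `list_snapshots InfiniFlow/deepdoc` should show whether be0c1e50eef6047b412d1800aa89aba4d275f997 is still listed alongside a newer revision.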

oreh avatar Apr 17 '24 01:04 oreh

docker cp ocr.res ragflow-server:/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res

Where should I run this command?

Gutilence14 avatar Apr 18 '24 02:04 Gutilence14

Oh my god, thank you so much. It works! I edited the file you referenced and also set the HF_ENDPOINT environment variable so that the OCR models could be downloaded. Then I restarted Docker, and the service came up successfully.

FelixLeeeeee avatar Apr 18 '24 02:04 FelixLeeeeee

A general answer for everyone: go to Hugging Face, download the files under InfiniFlow to a directory on your server host, then mount that directory into the container at /root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997
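
Before mounting the directory, it can help to confirm that the downloaded files are really there. The sketch below is only illustrative: the function name `check_model_files` is made up, and the file list covers just the three files named in the errors in this thread (ocr.res, det.onnx, layout.onnx). The repository may contain other files that are also required.

```shell
#!/bin/sh
# Hypothetical helper: report which expected model files are missing from a directory.
# Assumption: this list only covers files named in the errors in this thread;
# it is not a complete manifest of the model repository.
EXPECTED_FILES="ocr.res det.onnx layout.onnx"

check_model_files() {
    dir="$1"
    missing=0
    for f in $EXPECTED_FILES; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    # Exit status 0 means every listed file was found.
    return "$missing"
}
```

For example, `check_model_files ./deepdoc && docker compose up -d` would start the containers only if none of the listed files is missing from ./deepdoc.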

limate avatar Apr 18 '24 02:04 limate

The simplest fix that worked in my testing: 1. Open entrypoint.sh. 2. Replace the line PY=/root/miniconda3/envs/py11/bin/python with the path to your own environment, PY=/home/YOURNAME/miniconda3/envs/py11/bin/python
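
The same edit can be scripted. This is a sketch under assumptions: the function name `set_py_path` is made up, and it assumes entrypoint.sh has a line starting with `PY=` as quoted above. It goes through a temporary file instead of `sed -i`, because `sed -i` takes different arguments on macOS and Linux, and it writes the result back with `cat` so the script keeps its executable bit.

```shell
#!/bin/sh
# Hypothetical helper: point the PY= line of an entrypoint script at another interpreter.
# Assumes the new path contains no '|' or '&' characters (both are special to sed here).
set_py_path() {
    script="$1"
    new_py="$2"
    [ -f "$script" ] || { echo "no such file: $script" >&2; return 1; }
    tmp="$(mktemp)" || return 1
    # Replace the whole PY=... line; '|' is the delimiter because paths contain '/'.
    sed "s|^PY=.*|PY=$new_py|" "$script" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite in place (rather than mv) so the file mode is preserved.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

A call would look like `set_py_path docker/entrypoint.sh /home/YOURNAME/miniconda3/envs/py11/bin/python`, with YOURNAME replaced by your own user name.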

Gutilence14 avatar Apr 18 '24 02:04 Gutilence14

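  1. If parsing progress stays below 1% and Hugging Face cannot be reached, download deepdoc and text_concat_xgb_v1.0, put them in the docker folder, and add the following to the docker-compose.yaml configuration file:
    - ./deepdoc:/ragflow/rag/res/deepdoc
    - ./text_concat_xgb_v1.0:/ragflow/rag/res/deepdoc

    Restart Docker for the change to take effect:

    $ docker compose up -d

  2. If parsing progress is still below 1%, some deepdoc files may be missing. In that case, manually copy the local deepdoc files into the Docker container:
    find deepdoc -maxdepth 1 -type f -exec docker cp {} ragflow-server:/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997 \;

    Files should now parse normally, and you are done. Note: users in mainland China should prefer the Chinese installation, i.e. docker compose -f docker-compose-CN.yml up -d
    The English installation is prone to errors such as failing to connect to Hugging Face. The steps above are a workaround for the English installation.
    Tested and confirmed to work.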

Miki-lin avatar Apr 18 '24 02:04 Miki-lin

Fixed

JinHai-CN avatar May 19 '24 02:05 JinHai-CN