QAnything

[BUG] Running a local model with the Python version on Linux fails with “The client socket has failed to connect to [::ffff:0.0.188.125]:59771 (errno: 22 - Invalid argument).”

Open hingkan opened this issue 8 months ago • 0 comments

Is there an existing issue / discussion for this?

  • [X] I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • [X] I have searched FAQ

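Current Behavior

After creating a clean Python virtual environment and installing the required packages, I ran: bash scripts/run_for_7B_in_Linux_or_WSL.sh

The server fails to start with the c10d socket error shown in the logs below.

Expected Behavior

The server starts successfully, and the front-end page can be opened in a browser or the API can be called.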

Environment

- OS: CentOS Linux 8
- NVIDIA Driver: 550.54.14
- CUDA: 12.1
- docker: 25.0.4
- docker-compose: 2.24.7
- NVIDIA GPU: NVIDIA GeForce RTX 3090
- NVIDIA GPU Memory: 24GB

QAnything logs

The backend service is about to start. Once it has started successfully, copy [http://0.0.0.0:8777/qanything/] into your browser to test it.
The command used to run qanything-server is: CUDA_VISIBLE_DEVICES=0 python3 -m qanything_kernel.qanything_server.sanic_api --host 0.0.0.0 --port 8777 --model_size 7B
LOCAL DATA PATH: /home/**/mnt/workspace/QAnything/QANY_DB/content
LOCAL_RERANK_REPO: netease-youdao/bce-reranker-base_v1
LOCAL_EMBED_REPO: netease-youdao/bce-embedding-base_v1
<Logger debug_logger (INFO)>
<Logger qa_logger (INFO)>
2024-06-04 10:13:07,353 - modelscope - INFO - PyTorch version 2.1.2 Found.
2024-06-04 10:13:07,354 - modelscope - INFO - Loading ast index from /home/**/.cache/modelscope/ast_indexer
2024-06-04 10:13:07,389 - modelscope - INFO - Loading done! Current index file version is 1.13.0, with md5 6bd8910d8bf93d5739004f525912b700 and a total number of 972 components indexed
use_cpu: False
use_openai_api: False
onnxruntime-gpu 1.17.1 is already installed.
vllm 0.2.7 is already installed.
functions:[{'name': 'duckduckgo_search', 'description': 'duckduckgo_search(query: str) - Search infomation on internet. Useful for when the context can not answer the question. Input should be a search query.', 'parameters': {'type': 'object', 'properties': {'query': {'description': 'search query', 'type': 'string'}}, 'required': ['query']}}]
[2024-06-04 10:13:19 +0800] [730396] [WARNING] Sanic is running in PRODUCTION mode. Consider using '--debug' or '--dev' while actively developing your application.
[2024-06-04 10:13:19 +0800] [730396] [INFO] Sanic Extensions:
[2024-06-04 10:13:19 +0800] [730396] [INFO] > injection [0 dependencies; 0 constants]
[2024-06-04 10:13:19 +0800] [730396] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-06-04 10:13:19 +0800] [730396] [INFO] > http
[2024-06-04 10:13:19 +0800] [730396] [INFO] > templating [jinja2==3.1.4]
INFO 06-04 10:13:19 llm_engine.py:70] Initializing an LLM engine with config: model='/home/**/mnt/workspace/QAnything/assets/custom_models/netease-youdao/Qwen-7B-QAnything', tokenizer='/home/**/mnt/workspace/QAnything/assets/custom_models/netease-youdao/Qwen-7B-QAnything', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=8192, download_dir=None, load_format=auto, tensor_parallel_size=1, quantization=None, enforce_eager=False, seed=0)
[W socket.cpp:663] [c10d] The client socket has failed to connect to [::ffff:0.0.188.125]:59771 (errno: 22 - Invalid argument).
[W socket.cpp:663] [c10d] The client socket has failed to connect to 0.0.188.125:59771 (errno: 22 - Invalid argument).
[E socket.cpp:719] [c10d] The client socket has failed to connect to any network address of (0.0.188.125, 59771).
[2024-06-04 10:13:20 +0800] [730396] [ERROR] Experienced exception while trying to serve
Traceback (most recent call last):
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/mixins/startup.py", line 958, in serve_single
    worker_serve(monitor_publisher=None, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 143, in worker_serve
    raise e
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 117, in worker_serve
    return _serve_http_1(
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/server/runners.py", line 223, in _serve_http_1
    loop.run_until_complete(app._server_event("init", "before"))
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1764, in _server_event
    await self.dispatch(
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 208, in dispatch
    return await dispatch
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 183, in _dispatch
    raise e
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 167, in _dispatch
    retval = await maybe_coroutine
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1315, in _listener
    await maybe_coro
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/qanything_server/sanic_api.py", line 199, in init_local_doc_qa
    local_doc_qa.init_cfg(args=args)
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/core/local_doc_qa.py", line 61, in init_cfg
    self.llm: OpenAICustomLLM = OpenAICustomLLM(args)
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/connector/llm/llm_for_fastchat.py", line 40, in __init__
    self.engine = AsyncLLMEngine.from_engine_args(engine_args)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 500, in from_engine_args
    engine = cls(parallel_config.worker_use_ray,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 273, in __init__
    self.engine = self._init_engine(*args, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 318, in _init_engine
    return engine_class(*args, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 111, in __init__
    self._init_workers()
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 145, in _init_workers
    self._run_workers("init_model")
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 795, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 74, in init_model
    _init_distributed_environment(self.parallel_config, self.rank,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 212, in _init_distributed_environment
    torch.distributed.init_process_group(
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 74, in wrapper
    func_return = func(*args, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1141, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 196, in _tcp_rendezvous_handler
    store = _create_c10d_store(result.hostname, result.port, rank, world_size, timeout)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 172, in _create_c10d_store
    return TCPStore(
RuntimeError: The client socket has failed to connect to any network address of (0.0.188.125, 59771). The client socket has failed to connect to 0.0.188.125:59771 (errno: 22 - Invalid argument).
[2024-06-04 10:13:20 +0800] [730396] [INFO] Server Stopped
Traceback (most recent call last):
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/qanything_server/sanic_api.py", line 240, in <module>
    app.run(host=args.host, port=args.port, single_process=True, access_log=False)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/mixins/startup.py", line 215, in run
    serve(primary=self)  # type: ignore
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/mixins/startup.py", line 958, in serve_single
    worker_serve(monitor_publisher=None, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 143, in worker_serve
    raise e
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 117, in worker_serve
    return _serve_http_1(
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/server/runners.py", line 223, in _serve_http_1
    loop.run_until_complete(app._server_event("init", "before"))
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1764, in _server_event
    await self.dispatch(
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 208, in dispatch
    return await dispatch
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 183, in _dispatch
    raise e
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 167, in _dispatch
    retval = await maybe_coroutine
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1315, in _listener
    await maybe_coro
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/qanything_server/sanic_api.py", line 199, in init_local_doc_qa
    local_doc_qa.init_cfg(args=args)
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/core/local_doc_qa.py", line 61, in init_cfg
    self.llm: OpenAICustomLLM = OpenAICustomLLM(args)
  File "/home/**/mnt/workspace/QAnything/qanything_kernel/connector/llm/llm_for_fastchat.py", line 40, in __init__
    self.engine = AsyncLLMEngine.from_engine_args(engine_args)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 500, in from_engine_args
    engine = cls(parallel_config.worker_use_ray,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 273, in __init__
    self.engine = self._init_engine(*args, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 318, in _init_engine
    return engine_class(*args, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 111, in __init__
    self._init_workers()
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 145, in _init_workers
    self._run_workers("init_model")
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 795, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 74, in init_model
    _init_distributed_environment(self.parallel_config, self.rank,
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 212, in _init_distributed_environment
    torch.distributed.init_process_group(
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 74, in wrapper
    func_return = func(*args, **kwargs)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1141, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 196, in _tcp_rendezvous_handler
    store = _create_c10d_store(result.hostname, result.port, rank, world_size, timeout)
  File "/home/**/anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 172, in _create_c10d_store
    return TCPStore(
RuntimeError: The client socket has failed to connect to any network address of (0.0.188.125, 59771). The client socket has failed to connect to 0.0.188.125:59771 (errno: 22 - Invalid argument).
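
A note on the address in the error above. The sketch below uses only the Python standard library and is not part of QAnything, vLLM, or PyTorch. It decodes 0.0.188.125 and shows that the address is the 32-bit integer 48253 written in dotted form, that it lies in the 0.0.0.0/8 block, and that [::ffff:0.0.188.125] is the IPv4-mapped IPv6 form of the same address. The name describe_ipv4 is made up for illustration.

```python
import ipaddress

# 0.0.0.0/8 is the "this network" block; it is not a normal connection target.
THIS_NETWORK = ipaddress.ip_network("0.0.0.0/8")


def describe_ipv4(text: str) -> dict:
    """Summarise a dotted-quad IPv4 address (illustrative helper)."""
    addr = ipaddress.IPv4Address(text)
    # Build the IPv4-mapped IPv6 form, as it appears in the first warning.
    mapped = ipaddress.IPv6Address(f"::ffff:{text}")
    return {
        "as_integer": int(addr),
        "in_this_network_block": addr in THIS_NETWORK,
        "mapped_matches": mapped.ipv4_mapped == addr,
    }


info = describe_ipv4("0.0.188.125")
print(info)
```

An address in 0.0.0.0/8 is not a usual destination for a connection, which is consistent with errno 22. The host part of the rendezvous address therefore looks as if it came from a bare number rather than from a real host name or IP. That is only an inference from the log; nothing in this issue confirms the cause.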

Steps To Reproduce

  1. Set up the environment:
     conda create -n qanything-python python=3.10
     conda activate qanything-python
     git clone -b qanything-python https://github.com/netease-youdao/QAnything.git
     cd QAnything
     pip install -r requirements.txt

  2. Download the QAnything PDF parsing models from “ https://www.modelscope.cn/models/netease-youdao/QAnything-pdf-parser/summary ” and place them in the designated location.

  3. Run either of the following commands (both produce the same error):
     bash scripts/run_for_3B_in_Linux_or_WSL.sh
     bash scripts/run_for_7B_in_Linux_or_WSL.sh
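
Before rerunning the scripts in step 3, it may be worth checking whether the host that ends up in the torch.distributed TCP rendezvous address is a bare number. This is a hypothesis, not something the logs confirm: inet_aton-style parsing treats a string such as "48253" as a single 32-bit IPv4 value, which would yield exactly 0.0.188.125. The sketch below assumes a hypothetical init-method URL of the form tcp://HOST:PORT, since the issue does not show the real one. numeric_host_to_ipv4 and check_init_method are illustrative names, not vLLM or PyTorch APIs.

```python
import ipaddress
import socket
from typing import Optional
from urllib.parse import urlparse

THIS_NETWORK = ipaddress.ip_network("0.0.0.0/8")


def numeric_host_to_ipv4(host: str) -> Optional[ipaddress.IPv4Address]:
    """Return the IPv4 address inet_aton derives from host, or None for a name."""
    try:
        packed = socket.inet_aton(host)  # accepts "a", "a.b", "a.b.c", "a.b.c.d"
    except OSError:
        return None  # not a numeric form, so it would go through name resolution
    return ipaddress.IPv4Address(packed)


def check_init_method(url: str) -> str:
    """Describe how the host part of a tcp://HOST:PORT URL would be read."""
    host = urlparse(url).hostname or ""
    addr = numeric_host_to_ipv4(host)
    if addr is None:
        return f"{host!r} is a name; check what it resolves to"
    if addr in THIS_NETWORK:
        return f"{host!r} parses as {addr}, inside 0.0.0.0/8"
    return f"{host!r} parses as {addr}"


# Hypothetical URL: the host 48253 is an assumption, the port is from the log.
print(check_init_method("tcp://48253:59771"))
print(check_init_method("tcp://127.0.0.1:59771"))
```

Passing the value of socket.gethostname() as HOST shows how this machine's own hostname would be interpreted by the same parsing.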

Anything else?

No response

hingkan • Jun 04 '24 08:06