[Issue]: urllib3.exceptions.ConnectTimeoutError
Is there an existing issue for this?
- [ ] I have searched the existing issues
- [ ] I have checked #657 to validate if my issue is covered by community support
Describe the issue
I am using a different LLM and embedding service behind an OpenAI-compatible API, and I set the base URLs in settings.yaml. When I start indexing, I get a ConnectTimeoutError. I do not know where the URL openaipublic.blob.core.windows.net is being used, since none of my configured endpoints point there.
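For context on where that host comes from: GraphRAG counts and chunks tokens with tiktoken, which downloads its encoding file (e.g. cl100k_base.tiktoken) from openaipublic.blob.core.windows.net on first use, regardless of the `api_base` values in settings.yaml. The sketch below shows one way to pre-seed tiktoken's cache on a machine without internet access. It assumes tiktoken names cached files after the SHA-1 hex digest of the download URL and honors the `TIKTOKEN_CACHE_DIR` environment variable; both hold for recent tiktoken releases, but verify against the installed version.

```python
# Sketch: pre-seed tiktoken's local cache so indexing never needs to reach
# openaipublic.blob.core.windows.net. Assumptions (verify for your tiktoken
# version): the cache file name is the SHA-1 hex digest of the blob URL,
# and TIKTOKEN_CACHE_DIR overrides the default cache location.
import hashlib
import os
import shutil

BLOB_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def cache_file_name(url: str) -> str:
    """tiktoken names cached encoding files after the SHA-1 of their URL."""
    return hashlib.sha1(url.encode()).hexdigest()


def prepare_offline_cache(cache_dir: str, downloaded_file: str) -> str:
    """Copy a manually downloaded cl100k_base.tiktoken into the cache under
    the name tiktoken expects, and point tiktoken at that directory."""
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, cache_file_name(BLOB_URL))
    shutil.copyfile(downloaded_file, target)
    # Must be set before tiktoken is imported/used by the indexing run.
    os.environ["TIKTOKEN_CACHE_DIR"] = cache_dir
    return target
```

With the encoding file fetched once on a connected machine and placed via `prepare_offline_cache`, the chunk verb should load it locally instead of timing out.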
Steps to reproduce
No response
### GraphRAG Config Used

```yaml
encoding_model: bge-m3-local
skip_workflows: []
llm:
  api_key: xinference
  type: openai_chat # or azure_openai_chat
  model: glm4-9b
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 4000
  request_timeout: 180.0
  api_base: http://192.168.24.137:30123/open-api/glm4/v1
  api_version: 2024-02-15-preview
  organization: <organization_id>
  deployment_name: <azure_model_deployment_name>
  tokens_per_minute: 150_000 # set a leaky bucket throttle
  requests_per_minute: 10_000 # set a leaky bucket throttle
  max_retries: 10
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 25 # the number of parallel inflight requests that may be made
  temperature: 0 # temperature for sampling
  top_p: 1 # top-p sampling
  n: 1 # number of completions to generate
parallelization:
  stagger: 0.3
  num_threads: 50 # the number of threads to use for parallel processing
async_mode: threaded # or asyncio
embeddings:
  # parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: xinference
    type: openai_embedding # or azure_openai_embedding
    model: bge-m3-local
    api_base: http://192.168.24.137:9998/v1
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional
```
Logs and screenshots
17:27:40,699 graphrag.config.read_dotenv INFO Loading pipeline .env file 17:27:40,702 graphrag.index.cli INFO using default configuration: { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "root_dir": "./ragtest", "reporting": { "type": "file", "base_dir": "output/${timestamp}/reports", "storage_account_blob_url": null }, "storage": { "type": "file", "base_dir": "output/${timestamp}/artifacts", "storage_account_blob_url": null }, "cache": { "type": "file", "base_dir": "cache", "storage_account_blob_url": null }, "input": { "type": "file", "file_type": "text", "base_dir": "input", "storage_account_blob_url": null, "encoding": "utf-8", "file_pattern": ".\.txt$", "file_filter": null, "source_column": null, "timestamp_column": null, "timestamp_format": null, "text_column": "text", "title_column": null, "document_attribute_columns": [] }, "embed_graph": { "enabled": false, "num_walks": 10, "walk_length": 40, "window_size": 2, "iterations": 3, "random_seed": 597832, "strategy": null }, "embeddings": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_embedding", "model": "bge-m3-local", "max_tokens": 4000, "temperature": 0, "top_p": 1, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:9998/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": null, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, 
"max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "batch_size": 16, "batch_max_tokens": 8191, "target": "required", "skip": [], "vector_store": null, "strategy": null }, "chunks": { "size": 1200, "overlap": 100, "group_by_columns": [ "id" ], "strategy": null }, "snapshots": { "graphml": false, "raw_entities": false, "top_level_nodes": false }, "entity_extraction": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "prompt": "prompts/entity_extraction.txt", "entity_types": [ "organization", "person", "geo", "event" ], "max_gleanings": 1, "strategy": null }, "summarize_descriptions": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "prompt": "prompts/summarize_descriptions.txt", "max_length": 500, "strategy": null }, 
"community_reports": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "prompt": "prompts/community_report.txt", "max_length": 2000, "max_input_length": 8000, "strategy": null }, "claim_extraction": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "enabled": false, "prompt": "prompts/claim_extraction.txt", "description": "Any claims or facts that could be relevant to information discovery.", "max_gleanings": 1, "strategy": null }, "cluster_graph": { "max_cluster_size": 10, "strategy": null }, "umap": { "enabled": false }, "local_search": { "text_unit_prop": 0.5, "community_prop": 0.1, "conversation_history_max_turns": 5, "top_k_entities": 10, "top_k_relationships": 10, "temperature": 0.0, "top_p": 1.0, "n": 1, "max_tokens": 12000, "llm_max_tokens": 2000 }, "global_search": { "temperature": 0.0, "top_p": 1.0, "n": 1, "max_tokens": 12000, "data_max_tokens": 
12000, "map_max_tokens": 1000, "reduce_max_tokens": 2000, "concurrency": 32 }, "encoding_model": "bge-m3-local", "skip_workflows": [] } 17:27:40,704 graphrag.index.create_pipeline_config INFO skipping workflows 17:27:40,704 graphrag.index.run INFO Running pipeline 17:27:40,705 graphrag.index.storage.file_pipeline_storage INFO Creating file storage at ragtest\output\20240801-172740\artifacts 17:27:40,705 graphrag.index.input.load_input INFO loading input from root_dir=input 17:27:40,705 graphrag.index.input.load_input INFO using file storage for input 17:27:40,706 graphrag.index.storage.file_pipeline_storage INFO search ragtest\input for files matching ..txt$ 17:27:40,707 graphrag.index.input.text INFO found text files from input, found [('老庄墨韩1.txt', {})] 17:27:40,738 graphrag.index.input.text INFO Found 1 files, loading 1 17:27:40,740 graphrag.index.workflows.load INFO Workflow Run Order: ['create_base_text_units', 'create_base_extracted_entities', 'create_summarized_entities', 'create_base_entity_graph', 'create_final_entities', 'create_final_nodes', 'create_final_communities', 'join_text_units_to_entity_ids', 'create_final_relationships', 'join_text_units_to_relationship_ids', 'create_final_community_reports', 'create_final_text_units', 'create_base_documents', 'create_final_documents'] 17:27:40,740 graphrag.index.run INFO Final # of rows loaded: 1 17:27:40,836 graphrag.index.run INFO Running workflow: create_base_text_units... 
17:27:40,836 graphrag.index.run INFO dependencies for create_base_text_units: [] 17:27:40,840 datashaper.workflow.workflow INFO executing verb orderby 17:27:40,843 datashaper.workflow.workflow INFO executing verb zip 17:27:40,845 datashaper.workflow.workflow INFO executing verb aggregate_override 17:27:40,849 datashaper.workflow.workflow INFO executing verb chunk 17:28:01,895 datashaper.workflow.workflow ERROR Error executing verb "chunk" in create_base_text_units: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')) Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 196, in _new_conn sock = connection.create_connection( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection raise err File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\connection.py", line 73, in create_connection sock.connect(sa) TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen response = self._make_request( ^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 490, in _make_request raise new_e File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 466, in _make_request self._validate_conn(conn) File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 1095, in _validate_conn conn.connect() File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 615, in connect self.sock = sock = self._new_conn() ^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 205, in _new_conn raise ConnectTimeoutError( urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\adapters.py", line 667, in send resp = conn.urlopen( ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen retries = retries.increment( ^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\retry.py", line 519, in increment raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\datashaper\workflow\workflow.py", line 410, in _execute_verb
result = node.verb.func(**verb_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 98, in chunk
output[to] = output.apply(
^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\frame.py", line 10374, in apply
return op.apply().finalize(self, method="apply")
^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 916, in apply
return self.apply_standard()
^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard
results, res_index = self.apply_series_generator()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 101, in
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen response = self._make_request( ^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 490, in _make_request raise new_e File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 466, in _make_request self._validate_conn(conn) File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 1095, in _validate_conn conn.connect() File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 615, in connect self.sock = sock = self._new_conn() ^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 205, in _new_conn raise ConnectTimeoutError( urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\adapters.py", line 667, in send resp = conn.urlopen( ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen retries = retries.increment( ^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\retry.py", line 519, in increment raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\run.py", line 323, in run_pipeline
result = await workflow.run(context, callbacks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\datashaper\workflow\workflow.py", line 369, in run
timing = await self._execute_verb(node, context, callbacks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\datashaper\workflow\workflow.py", line 410, in _execute_verb
result = node.verb.func(**verb_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 98, in chunk
output[to] = output.apply(
^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\frame.py", line 10374, in apply
return op.apply().finalize(self, method="apply")
^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 916, in apply
return self.apply_standard()
^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard
results, res_index = self.apply_series_generator()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 101, in
Additional Information
- GraphRAG Version: 0.2.0
- Operating System: Windows 11
- Python Version: 3.11.9
- Related Issues: