
[Issue]: urllib3.exceptions.ConnectTimeoutError

tyzhong opened this issue 1 year ago · 0 comments

Is there an existing issue for this?

  • [ ] I have searched the existing issues
  • [ ] I have checked #657 to validate if my issue is covered by community support

Describe the issue

I am using a different LLM and embedding service behind an OpenAI-compatible API, with the base URLs set in settings.yaml. When I start indexing, I get a ConnectTimeoutError. I do not know where the URL openaipublic.blob.core.windows.net is used.
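For context, the stack trace below shows the request does not come from the configured LLM endpoints at all: it originates in tiktoken (`tiktoken/load.py`), which downloads the `cl100k_base` BPE file from openaipublic.blob.core.windows.net the first time an encoding is used. A possible offline workaround (my own sketch, not an official GraphRAG fix): tiktoken caches downloaded files under a filename derived from the SHA-1 of the URL and honors the `TIKTOKEN_CACHE_DIR` environment variable, so the file can be fetched on a connected machine and pre-seeded into the cache.

```python
# Sketch of pre-seeding the tiktoken cache for an offline machine.
# Assumes tiktoken's caching behavior: cached files are named after the
# SHA-1 hex digest of the source URL, and the cache directory can be
# overridden with TIKTOKEN_CACHE_DIR.
import hashlib

BPE_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"

def cache_filename(url: str) -> str:
    """tiktoken names cached files after the SHA-1 hex digest of the URL."""
    return hashlib.sha1(url.encode()).hexdigest()

# Steps (manual):
#   1. On a machine with internet access, download BPE_URL.
#   2. Copy it to <cache_dir>/<cache_filename(BPE_URL)>  (no extension).
#   3. On the offline machine, set TIKTOKEN_CACHE_DIR=<cache_dir>
#      before running `graphrag index`.
print(cache_filename(BPE_URL))
```

The filename has no extension; it is just the 40-character hex digest.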

Steps to reproduce

No response

GraphRAG Config Used

encoding_model: bge-m3-local
skip_workflows: []
llm:
  api_key: xinference
  type: openai_chat # or azure_openai_chat
  model: glm4-9b
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://192.168.24.137:30123/open-api/glm4/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made
  # temperature: 0 # temperature for sampling
  # top_p: 1 # top-p sampling
  # n: 1 # Number of completions to generate

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: xinference
    type: openai_embedding # or azure_openai_embedding
    model: bge-m3-local
    api_base: http://192.168.24.137:9998/v1
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional
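One thing worth noting about the config above (my reading of the config schema, so treat it as an assumption): `encoding_model` names a tiktoken tokenizer encoding, not an embedding model, so a value like `bge-m3-local` is not a tiktoken encoding name; chunking falls back to the default `cl100k_base`, which is exactly the file the failing download is fetching. A minimal check, with the set of shipped tiktoken encoding names hard-coded as of tiktoken ~0.7:

```python
# Sketch: `encoding_model` must be one of tiktoken's encoding names.
# KNOWN_TIKTOKEN_ENCODINGS is hard-coded here to stay self-contained;
# the authoritative list is tiktoken.list_encoding_names().
KNOWN_TIKTOKEN_ENCODINGS = {
    "gpt2", "r50k_base", "p50k_base", "p50k_edit", "cl100k_base", "o200k_base",
}

def is_valid_encoding(name: str) -> bool:
    """Return True if `name` is a tiktoken encoding this sketch knows about."""
    return name in KNOWN_TIKTOKEN_ENCODINGS

print(is_valid_encoding("bge-m3-local"))  # an embedding model name, not an encoding
print(is_valid_encoding("cl100k_base"))   # the tiktoken encoding GraphRAG defaults to
```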

Logs and screenshots

17:27:40,699 graphrag.config.read_dotenv INFO Loading pipeline .env file 17:27:40,702 graphrag.index.cli INFO using default configuration: { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "root_dir": "./ragtest", "reporting": { "type": "file", "base_dir": "output/${timestamp}/reports", "storage_account_blob_url": null }, "storage": { "type": "file", "base_dir": "output/${timestamp}/artifacts", "storage_account_blob_url": null }, "cache": { "type": "file", "base_dir": "cache", "storage_account_blob_url": null }, "input": { "type": "file", "file_type": "text", "base_dir": "input", "storage_account_blob_url": null, "encoding": "utf-8", "file_pattern": ".\.txt$", "file_filter": null, "source_column": null, "timestamp_column": null, "timestamp_format": null, "text_column": "text", "title_column": null, "document_attribute_columns": [] }, "embed_graph": { "enabled": false, "num_walks": 10, "walk_length": 40, "window_size": 2, "iterations": 3, "random_seed": 597832, "strategy": null }, "embeddings": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_embedding", "model": "bge-m3-local", "max_tokens": 4000, "temperature": 0, "top_p": 1, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:9998/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": null, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, 
"max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "batch_size": 16, "batch_max_tokens": 8191, "target": "required", "skip": [], "vector_store": null, "strategy": null }, "chunks": { "size": 1200, "overlap": 100, "group_by_columns": [ "id" ], "strategy": null }, "snapshots": { "graphml": false, "raw_entities": false, "top_level_nodes": false }, "entity_extraction": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "prompt": "prompts/entity_extraction.txt", "entity_types": [ "organization", "person", "geo", "event" ], "max_gleanings": 1, "strategy": null }, "summarize_descriptions": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "prompt": "prompts/summarize_descriptions.txt", "max_length": 500, "strategy": null }, 
"community_reports": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "prompt": "prompts/community_report.txt", "max_length": 2000, "max_input_length": 8000, "strategy": null }, "claim_extraction": { "llm": { "api_key": "REDACTED, length 10", "type": "openai_chat", "model": "glm4-9b", "max_tokens": 4000, "temperature": 0.0, "top_p": 1.0, "n": 1, "request_timeout": 180.0, "api_base": "http://192.168.24.137:30123/open-api/glm4/v1", "api_version": null, "proxy": null, "cognitive_services_endpoint": null, "deployment_name": null, "model_supports_json": true, "tokens_per_minute": 0, "requests_per_minute": 0, "max_retries": 10, "max_retry_wait": 10.0, "sleep_on_rate_limit_recommendation": true, "concurrent_requests": 25 }, "parallelization": { "stagger": 0.3, "num_threads": 50 }, "async_mode": "threaded", "enabled": false, "prompt": "prompts/claim_extraction.txt", "description": "Any claims or facts that could be relevant to information discovery.", "max_gleanings": 1, "strategy": null }, "cluster_graph": { "max_cluster_size": 10, "strategy": null }, "umap": { "enabled": false }, "local_search": { "text_unit_prop": 0.5, "community_prop": 0.1, "conversation_history_max_turns": 5, "top_k_entities": 10, "top_k_relationships": 10, "temperature": 0.0, "top_p": 1.0, "n": 1, "max_tokens": 12000, "llm_max_tokens": 2000 }, "global_search": { "temperature": 0.0, "top_p": 1.0, "n": 1, "max_tokens": 12000, "data_max_tokens": 
12000, "map_max_tokens": 1000, "reduce_max_tokens": 2000, "concurrency": 32 }, "encoding_model": "bge-m3-local", "skip_workflows": [] } 17:27:40,704 graphrag.index.create_pipeline_config INFO skipping workflows 17:27:40,704 graphrag.index.run INFO Running pipeline 17:27:40,705 graphrag.index.storage.file_pipeline_storage INFO Creating file storage at ragtest\output\20240801-172740\artifacts 17:27:40,705 graphrag.index.input.load_input INFO loading input from root_dir=input 17:27:40,705 graphrag.index.input.load_input INFO using file storage for input 17:27:40,706 graphrag.index.storage.file_pipeline_storage INFO search ragtest\input for files matching ..txt$ 17:27:40,707 graphrag.index.input.text INFO found text files from input, found [('老庄墨韩1.txt', {})] 17:27:40,738 graphrag.index.input.text INFO Found 1 files, loading 1 17:27:40,740 graphrag.index.workflows.load INFO Workflow Run Order: ['create_base_text_units', 'create_base_extracted_entities', 'create_summarized_entities', 'create_base_entity_graph', 'create_final_entities', 'create_final_nodes', 'create_final_communities', 'join_text_units_to_entity_ids', 'create_final_relationships', 'join_text_units_to_relationship_ids', 'create_final_community_reports', 'create_final_text_units', 'create_base_documents', 'create_final_documents'] 17:27:40,740 graphrag.index.run INFO Final # of rows loaded: 1 17:27:40,836 graphrag.index.run INFO Running workflow: create_base_text_units... 
17:27:40,836 graphrag.index.run INFO dependencies for create_base_text_units: [] 17:27:40,840 datashaper.workflow.workflow INFO executing verb orderby 17:27:40,843 datashaper.workflow.workflow INFO executing verb zip 17:27:40,845 datashaper.workflow.workflow INFO executing verb aggregate_override 17:27:40,849 datashaper.workflow.workflow INFO executing verb chunk 17:28:01,895 datashaper.workflow.workflow ERROR Error executing verb "chunk" in create_base_text_units: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')) Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 196, in _new_conn sock = connection.create_connection( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection raise err File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\connection.py", line 73, in create_connection sock.connect(sa) TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or the established connection failed because the connected host has failed to respond.

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen response = self._make_request( ^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 490, in _make_request raise new_e File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 466, in _make_request self._validate_conn(conn) File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 1095, in _validate_conn conn.connect() File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 615, in connect self.sock = sock = self._new_conn() ^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 205, in _new_conn raise ConnectTimeoutError( urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\adapters.py", line 667, in send resp = conn.urlopen( ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen retries = retries.increment( ^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\retry.py", line 519, in increment raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\datashaper\workflow\workflow.py", line 410, in _execute_verb result = node.verb.func(**verb_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 98, in chunk output[to] = output.apply( ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\frame.py", line 10374, in apply return op.apply().finalize(self, method="apply") ^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 916, in apply return self.apply_standard() ^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard results, res_index = self.apply_series_generator() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator results[i] = self.func(v, *self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 101, in lambda x: run_strategy(strategy_exec, x[column], strategy_config, tick), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 128, in run_strategy strategy_results = strategy(texts, {**strategy_args}, tick) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\strategies\tokens.py", line 24, in run enc = tiktoken.get_encoding(encoding_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\registry.py", line 73, in get_encoding enc = Encoding(**constructor()) ^^^^^^^^^^^^^ File 
"D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken_ext\openai_public.py", line 72, in cl100k_base mergeable_ranks = load_tiktoken_bpe( ^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\load.py", line 147, in load_tiktoken_bpe contents = read_file_cached(tiktoken_bpe_file, expected_hash) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\load.py", line 64, in read_file_cached contents = read_file(blobpath) ^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\load.py", line 25, in read_file resp = requests.get(blobpath) ^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\api.py", line 73, in get return request("get", url, params=params, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\api.py", line 59, in request return session.request(method=method, url=url, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\sessions.py", line 589, in request resp = self.send(prep, **send_kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\sessions.py", line 703, in send r = adapter.send(request, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\adapters.py", line 688, in send raise ConnectTimeout(e, request=request) requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. 
(connect timeout=None)')) 17:28:01,903 graphrag.index.reporting.file_workflow_callbacks INFO Error executing verb "chunk" in create_base_text_units: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')) details=None 17:28:01,906 graphrag.index.run ERROR error running workflow create_base_text_units Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 196, in _new_conn sock = connection.create_connection( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection raise err File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\connection.py", line 73, in create_connection sock.connect(sa) TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or the established connection failed because the connected host has failed to respond.

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen response = self._make_request( ^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 490, in _make_request raise new_e File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 466, in _make_request self._validate_conn(conn) File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 1095, in _validate_conn conn.connect() File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 615, in connect self.sock = sock = self._new_conn() ^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connection.py", line 205, in _new_conn raise ConnectTimeoutError( urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\adapters.py", line 667, in send resp = conn.urlopen( ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen retries = retries.increment( ^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\urllib3\util\retry.py", line 519, in increment raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\run.py", line 323, in run_pipeline result = await workflow.run(context, callbacks) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\datashaper\workflow\workflow.py", line 369, in run timing = await self._execute_verb(node, context, callbacks) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\datashaper\workflow\workflow.py", line 410, in _execute_verb result = node.verb.func(**verb_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 98, in chunk output[to] = output.apply( ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\frame.py", line 10374, in apply return op.apply().finalize(self, method="apply") ^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 916, in apply return self.apply_standard() ^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard results, res_index = self.apply_series_generator() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator results[i] = self.func(v, *self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 101, in lambda x: run_strategy(strategy_exec, x[column], strategy_config, tick), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\text_chunk.py", line 128, in run_strategy strategy_results = strategy(texts, {**strategy_args}, tick) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\graphrag\index\verbs\text\chunk\strategies\tokens.py", line 24, in run enc = tiktoken.get_encoding(encoding_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\registry.py", line 73, in get_encoding enc = Encoding(**constructor()) ^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken_ext\openai_public.py", line 72, in cl100k_base mergeable_ranks = load_tiktoken_bpe( ^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\load.py", line 147, in load_tiktoken_bpe contents = read_file_cached(tiktoken_bpe_file, expected_hash) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\load.py", line 64, in read_file_cached contents = read_file(blobpath) ^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\tiktoken\load.py", line 25, in read_file resp = requests.get(blobpath) ^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\api.py", line 73, in get return request("get", url, params=params, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\api.py", line 59, in request return session.request(method=method, url=url, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\sessions.py", line 589, in request resp = self.send(prep, **send_kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\sessions.py", line 703, in send r = adapter.send(request, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\software\anaconda\envs\GraphRAG\Lib\site-packages\requests\adapters.py", line 688, in send raise ConnectTimeout(e, request=request) 
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000022E9B501750>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')) 17:28:01,911 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None

Additional Information

  • GraphRAG Version: 0.2.0
  • Operating System: Windows 11
  • Python Version: 3.11.9
  • Related Issues:

tyzhong · Aug 02 '24