
[Bug]: Query fails with JSONDecodeError after importing more text; no error with less text

Open goodmaney opened this issue 1 year ago • 2 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues
  • [X] I have checked #657 to validate if my issue is covered by community support

Describe the bug

Importing one book, GraphRAG initialization and querying both succeed. Importing bigger txt files, initialization succeeds but querying reports json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Error parsing search response json
Traceback (most recent call last):
  File "/home/xx/anaconda3/envs/graph2/lib/python3.12/site-packages/graphrag/query/structured_search/global_search/search.py", line 194, in _map_response_single_batch
    processed_response = self.parse_search_response(search_response)
  File "/home/xx/anaconda3/envs/graph2/lib/python3.12/site-packages/graphrag/query/structured_search/global_search/search.py", line 232, in parse_search_response
    parsed_elements = json.loads(search_response)["points"]
  File "/home/xx/anaconda3/envs/graph2/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/home/xx/anaconda3/envs/graph2/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/xx/anaconda3/envs/graph2/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
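The traceback shows the global-search map step calling json.loads directly on the raw model reply; if the model prefixes its answer with prose (common with local models when model_supports_json is false), the very first character is not valid JSON, which is exactly "Expecting value: line 1 column 1 (char 0)". A minimal sketch of the failure mode, with a hypothetical lenient fallback (parse_search_response below is a simplified stand-in for graphrag's method; parse_leniently is not part of graphrag):

```python
import json
import re

def parse_search_response(search_response: str) -> list:
    """Simplified stand-in for graphrag's parse_search_response:
    expects the reply to be a bare JSON object with a "points" key."""
    return json.loads(search_response)["points"]

def parse_leniently(search_response: str) -> list:
    """Hypothetical fallback: pull the first {...} block out of a chatty
    reply before parsing; return no points if nothing parseable exists."""
    match = re.search(r"\{.*\}", search_response, re.DOTALL)
    if match is None:
        return []
    try:
        return json.loads(match.group(0)).get("points", [])
    except json.JSONDecodeError:
        return []

# A model that ignores the "respond in JSON" instruction replies like this:
chatty = 'Sure! Here is the answer:\n{"points": [{"description": "...", "score": 80}]}'

try:
    parse_search_response(chatty)
except json.JSONDecodeError as e:
    print(e)  # Expecting value: line 1 column 1 (char 0)

print(len(parse_leniently(chatty)))  # prints 1
```

This reproduces the reported error exactly, which is why the problem appears only with some replies: short inputs happen to yield clean JSON, larger inputs push the model into chattier answers.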

Steps to reproduce

LLM model:glm4-chat Embedding model:bce-embedding-base-v1

Import this book alone (book.txt): initialization and querying both succeed.

Import these books together (1804.07821v1.txt, 2101.03961v3.txt, thinkos.txt): initialization succeeds, but querying reports the JSONDecodeError.

Expected Behavior

No response

GraphRAG Config Used

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: glm4-chat-test
  model_supports_json: false
  api_base: http://127.0.0.1:9997/v1

parallelization:
  stagger: 0.3

async_mode: threaded

embeddings:
  async_mode: threaded
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: bce-embedding-basev1
    api_base: http://127.0.0.1:9998/v1

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id]

input:
  type: file
  file_type: text
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

cache:
  type: file
  base_dir: "cache"

storage:
  type: file
  base_dir: "output/${timestamp}/artifacts"

reporting:
  type: file
  base_dir: "output/${timestamp}/reports"

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization, person, geo, event]
  max_gleanings: 0

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false

umap:
  enabled: false

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:

global_search:
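One config point worth noting: with model_supports_json: false, GraphRAG relies entirely on prompt instructions to get JSON back from the map step, and local models often drift into prose as context grows. If the local endpoint actually implements OpenAI-style JSON mode, a fragment like the following may help; whether your endpoint and GraphRAG version honor these keys is an assumption to verify, not a confirmed fix:

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: glm4-chat-test
  api_base: http://127.0.0.1:9997/v1
  # Only enable this if the serving endpoint implements OpenAI's
  # response_format JSON mode; otherwise requests will fail outright.
  model_supports_json: true
  # A lower temperature makes chatty, non-JSON replies less likely
  # (exposed in newer graphrag versions; verify it exists in 0.1.1).
  temperature: 0.0
```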

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 0.1.1
  • Operating System: WSL2 Ubuntu 22.04
  • Python Version: 3.12.4
  • Related Issues:

goodmaney avatar Jul 26 '24 17:07 goodmaney

Hi, did you solve this? I found something that worked for me: search prompt. It seems you aren't using ollama, but you can still give it a try! Hope it helps.

BaronHsu avatar Aug 03 '24 06:08 BaronHsu


Thanks, but it doesn't work for me. I changed the model to llama3.1 and set temperature to 0.3; it returns some content after reporting the JSON error, but it's not stable: if I initialize again, it may return nothing and just error out.

goodmaney avatar Aug 03 '24 12:08 goodmaney

Hello! Have you solved it yet? When I use text-embedding-3-large for generate_text_embeddings, I encounter the same errors, and I wonder how to solve it.

gudehhh666 avatar Dec 22 '24 10:12 gudehhh666