Clean the string and extract only the JSON part

Open ForestLinSen opened this issue 1 year ago • 1 comments

Description

The JSON parsing error in global search occurs mainly because some open-source LLMs do not return strictly JSON-formatted responses. For example, when asked about the main theme of the story, Llama3 gives an answer like this:

Here is a response consisting of a list of key points that summarizes the top themes in the provided data:

{"points": [    {"description": "The theme of community and social dynamics is prominent, with the Men's Elations and Sorrows community revolving around men experiencing elations and sorrows. [Data: Reports (1)]", "score": 80},    {"description": "The potential for threat or conflict is a significant theme, with Harmony Assembly's march at Verdant Oasis Plaza being a potential source of threat. [Data: Reports (6), Relationships (38, 43)]", "score": 70}]}

Note that these scores are subjective and based on my interpretation of the data provided.

The response as a whole is not valid JSON because of the extra text before and after the expected JSON object.

Related Issues

[Issue]: All workflows completed successfully,but graphrag failed to answer any question given the provided data

Proposed Changes

  • Added a regular expression to extract JSON content from the search response string
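
A minimal sketch of this kind of extraction is shown below. It is not the code from this change: the helper name `extract_json` and the pattern are assumptions for illustration, and the sample string is a shortened version of the Llama3 response above.

```python
import json
import re


def extract_json(response: str) -> dict:
    """Parse the JSON object embedded in an LLM response (hypothetical helper)."""
    # Greedy match from the first "{" to the last "}"; DOTALL lets it span newlines.
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    return json.loads(match.group(0))


raw = (
    "Here is a response consisting of a list of key points:\n\n"
    '{"points": [{"description": "Community dynamics [Data: Reports (1)]", "score": 80}]}\n\n'
    "Note that these scores are subjective."
)
points = extract_json(raw)["points"]
```

The greedy pattern assumes the surrounding chatter contains no braces of its own. If the trailing text includes a `}`, the match extends too far and `json.loads` raises an error, so a caller would likely still want a fallback for that case.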

Checklist

  • [x] I have tested these changes locally.
  • [x] I have reviewed the code changes.
  • [x] I have updated the documentation (if necessary).
  • [x] I have added appropriate unit tests (if applicable).

Additional Notes

ForestLinSen avatar Jul 20 '24 11:07 ForestLinSen

Can we get this merged? Otherwise, almost every local LLM fails with:

SUCCESS: Global Search Response: I am sorry but I am unable to answer this question given the provided data.

(as seen in #575)

awaescher avatar Jul 26 '24 08:07 awaescher

We have resolved several issues related to text encoding and JSON parsing that are rolled up into version 0.2.2. Please try again with that version and re-open if this is still an issue.

natoverse avatar Aug 09 '24 17:08 natoverse