graphrag
[Bug]: Incorrect weight assignment due to type mismatch in graph_extractor.py
Do you need to file an issue?
- [X] I have searched the existing issues and this bug is not already filed.
- [X] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- [X] I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.
Describe the bug
I've encountered a bug in graphrag/index/graph/extractors/graph/graph_extractor.py, around line 253:
https://github.com/microsoft/graphrag/blob/main/graphrag/index/graph/extractors/graph/graph_extractor.py#L253
    weight = (
        float(record_attributes[-1])
        if isinstance(record_attributes[-1], numbers.Number)
        else 1.0
    )
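The problem arises because the LLM returns its results as strings (str), while isinstance(xxx, numbers.Number) is True only for actual numeric types such as int and float. The check therefore fails even when the string contains a valid number.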
Here's a simple test case demonstrating the issue:
    from numbers import Number

    test_values = [1, 1.0, "1.0"]

    def is_number(value):
        return isinstance(value, Number)

    for value in test_values:
        # !r prints the repr, so the string case is visibly quoted
        print(f"{value!r}: is number={is_number(value)}, is float={isinstance(value, float)}")

    # Expected output:
    # 1: is number=True, is float=False
    # 1.0: is number=True, is float=True
    # '1.0': is number=False, is float=False
As shown, for string representations of numbers generated by the LLM, isinstance always returns False, leading to all weights being assigned the default value of 1.0.
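One possible direction for a fix, as a minimal sketch only: attempt the float conversion directly and fall back to the default when it fails. The helper name parse_weight and its default parameter are hypothetical and do not exist in graphrag; rejecting nan and inf is an extra assumption on my part, not something the current code does.

```python
import math


def parse_weight(raw, default=1.0):
    """Hypothetical helper: convert an LLM-produced value to a float weight.

    Accepts real numbers as well as numeric strings such as "1.0".
    Returns `default` when the value cannot be parsed.
    """
    try:
        weight = float(raw)
    except (TypeError, ValueError, OverflowError):
        # TypeError: None or other non-numeric objects
        # ValueError: strings that are not numbers, e.g. "strong"
        # OverflowError: integers too large for a float
        return default
    # Assumption: nan and inf are not meaningful edge weights
    return weight if math.isfinite(weight) else default


if __name__ == "__main__":
    for raw in [1, 1.0, "1.0", "strong", None]:
        print(f"{raw!r} -> {parse_weight(raw)}")
```

With something like this, the expression in graph_extractor.py could become weight = parse_weight(record_attributes[-1]), so numeric strings keep their value and only unparseable input falls back to 1.0.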
I would appreciate it if this could be addressed and fixed in future versions. Thank you!
Steps to reproduce
The issue occurs during indexing. Run the index process:
python -m graphrag.index --root ./ragtest
Expected Behavior
The weight check should succeed for numeric strings returned by the LLM, so that the extracted weight is used instead of the default 1.0.
GraphRAG Config Used
Default configuration, using an OpenAI API key.
Logs and screenshots
Additional Information
- GraphRAG Version: 0.2.0 and 0.1.1