[Bug]: Unable to local query using latest main branch with error "FileNotFoundError: Table entity_description_embeddings does not exist."
Do you need to file an issue?
- [x] I have searched the existing issues and this bug is not already filed.
- [x] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- [x] I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.
Describe the bug
I've been using GraphRAG for two weeks, and I updated to the latest main branch to fix the issue where LLMs return faulty responses in non-JSON mode. However, local queries no longer work. To rule out interference from past runs, I even cloned a fresh copy of the code and rebuilt the index from scratch, but I still hit the same issue. `poetry run poe index` and global queries work fine, but local queries fail with this error:
```
poetry run poe query --root . --method local "What is the service track?"
Poe => python -m graphrag.query --root . --method local 'What is the service track?'
INFO: Reading settings from settings.yaml
INFO: Vector Store Args: {}
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/littleKitty/PycharmProjects/graphrag2/graphrag/query/__main__.py", line 83, in <module>
    run_local_search(
  File "/Users/littleKitty/PycharmProjects/graphrag2/graphrag/query/cli.py", line 162, in run_local_search
    description_embedding_store = __get_embedding_description_store(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/littleKitty/PycharmProjects/graphrag2/graphrag/query/cli.py", line 75, in __get_embedding_description_store
    description_embedding_store.db_connection.open_table(
  File "/Users/littleKitty/Library/Caches/pypoetry/virtualenvs/graphrag-kyfzN3S0-py3.11/lib/python3.11/site-packages/lancedb/db.py", line 445, in open_table
    return LanceTable.open(self, name, index_cache_size=index_cache_size)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/littleKitty/Library/Caches/pypoetry/virtualenvs/graphrag-kyfzN3S0-py3.11/lib/python3.11/site-packages/lancedb/table.py", line 937, in open
    raise FileNotFoundError(
FileNotFoundError: Table entity_description_embeddings does not exist. Please first call db.create_table(entity_description_embeddings, data)
```
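For context on the traceback above: LanceDB persists each table as a `<table_name>.lance` directory under the database URI, so a quick filesystem check can confirm whether the embedding table was ever written by the indexer before `open_table` is called. This is a diagnostic sketch, not part of GraphRAG; the example path is an assumption and depends on your storage settings.

```python
from pathlib import Path


def lancedb_table_exists(db_uri: str, table_name: str) -> bool:
    """Check for a LanceDB table on disk without opening a connection.

    LanceDB stores each table as a '<table_name>.lance' directory inside
    the database directory; if that directory is missing, open_table()
    raises FileNotFoundError like the one in the traceback above.
    """
    return (Path(db_uri) / f"{table_name}.lance").is_dir()


# Example (path is an assumption -- point it at wherever your
# GraphRAG run writes its lancedb data):
# lancedb_table_exists("./lancedb", "entity_description_embeddings")
```

If this returns `False`, the problem is on the indexing side (the embeddings were never written to the store), not in the query code itself.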
Steps to reproduce
No response
Expected Behavior
No response
GraphRAG Config Used
```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: gpt-4o-2024-05-13
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 4000
  # request_timeout: 180.0
  # api_base: https://<instance>.openai.azure.com
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made
  # temperature: 0 # temperature for sampling
  # top_p: 1 # top-p sampling
  # n: 1 # Number of completions to generate

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-ada-002
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional
```
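One detail worth noting in the log: `INFO: Vector Store Args: {}` means no `vector_store` section is configured under `embeddings`, so local search falls back to its default LanceDB location. A sketch of an explicit configuration is below; the key names follow GraphRAG's documented settings, but the `db_uri` value is an assumption and must point at wherever your run actually stores its LanceDB data.

```yaml
embeddings:
  vector_store:
    type: lancedb
    db_uri: ./lancedb   # assumption: adjust to your output/storage root
    overwrite: true
```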
Additional Information
- GraphRAG Version: main branch (as of 2024-08-02)
- Operating System: macOS
- Python Version: 3.11
- Related Issues:
Thanks for your help!