
[Bug]: Unable to query the Weaviate index using the RaptorRetriever and RetrieverQueryEngine modules

Open udayvakalapudi opened this issue 10 months ago • 2 comments

Bug Description

Packages

```toml
python = "^3.11"
llama-index = "^0.10.28"
llama-index-embeddings-huggingface = "^0.2.0"
llama-index-llms-huggingface = "^0.1.4"
torch = { version = "^2.2.2+cu121", source = "pytorch-gpu" }
torchvision = { version = "^0.17.2+cu121", source = "pytorch-gpu" }
torchaudio = { version = "^2.2.2+cu121", source = "pytorch-gpu" }
llama-index-vector-stores-weaviate = "^0.1.4"
ray = { extras = ["data", "serve"], version = "^2.10.0" }
llama-index-packs-raptor = "^0.1.3"
llama-index-llms-ollama = "^0.1.2"
llama-index-embeddings-ollama = "^0.1.2"
umap-learn = "^0.5.6"
```

Error Details

```
{'data': {'Get': {'RaptorIndex': None}},
 'errors': [{'locations': [{'column': 6, 'line': 1}],
             'message': 'invalid \'where\' filter: data type filter cannot use "valueInt" on type "number", use "valueNumber" instead',
             'path': ['Get', 'RaptorIndex']}]}
```

Version

^0.10.28

Steps to Reproduce

Please run the code below with Weaviate running in the background.

Code:

```python
import os

import weaviate
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.packs.raptor import RaptorPack, RaptorRetriever
from llama_index.vector_stores.weaviate import WeaviateVectorStore

LLM_MODEL_NAME = os.getenv("LLM_MODEL_NAME", "gemma:2b")
EMBEDDINGS_MODEL_NAME = os.getenv("EMBEDDINGS_MODEL_NAME", "nomic-embed-text")

embed_model = OllamaEmbedding(model_name=EMBEDDINGS_MODEL_NAME)

os.environ["OPENAI_API_KEY"] = "NA"

llm_model = Ollama(model=LLM_MODEL_NAME, request_timeout=400.0)

vdb_client = weaviate.Client(url="http://localhost:8080")

vector_store = WeaviateVectorStore(
    weaviate_client=vdb_client, index_name="RaptorIndex", text_key="text"
)

# setting up the storage for the embeddings
storage_context = StorageContext.from_defaults(vector_store=vector_store)

vdb_index = VectorStoreIndex.from_documents(documents=[], storage_context=storage_context)

documents = SimpleDirectoryReader(input_files=["documents/raptor_paper.pdf"]).load_data()

raptor_pack = RaptorPack(
    documents,
    embed_model=embed_model,  # used for embedding clusters
    llm=llm_model,  # used for generating summaries
    vector_store=vector_store,  # used for storage
    similarity_top_k=2,  # top k for each layer, or overall top-k for collapsed
    mode="collapsed",  # sets default mode
    transformations=[
        SentenceSplitter(chunk_size=400, chunk_overlap=50)
    ],  # transformations applied for ingestion
)

nodes = raptor_pack.run("What baselines is raptor compared against?", mode="collapsed")

retriever = RaptorRetriever(
    [],
    embed_model=embed_model,  # used for embedding clusters
    llm=llm_model,  # used for generating summaries
    vector_store=vector_store,  # used for storage
    similarity_top_k=2,  # top k for each layer, or overall top-k for collapsed
    mode="tree_traversal",  # sets default mode
)

query_engine = RetrieverQueryEngine.from_args(retriever, llm=llm_model)

response = query_engine.query("What baselines was RAPTOR compared against?")
```

Relevant Logs/Tracebacks

```
{'data': {'Get': {'RaptorIndex': None}},
 'errors': [{'locations': [{'column': 6, 'line': 1}],
             'message': 'invalid \'where\' filter: data type filter cannot use "valueInt" on type "number", use "valueNumber" instead',
             'path': ['Get', 'RaptorIndex']}]}
```

udayvakalapudi · Apr 23 '24 05:04

The error you're encountering is due to a mismatch in the expected data type for the 'where' filter in your query to the Weaviate index. The filter is attempting to use "valueInt" on a field that is of type "number", which requires the use of "valueNumber" instead. This issue arises from the way data types are handled when applying filters in the LlamaIndex integration with Weaviate.

To resolve this, ensure that when you're applying filters in your queries, integers and floats are correctly represented using "valueNumber". This adjustment is necessary because Weaviate expects numerical values to be specified with "valueNumber", regardless of whether they are integers or floats. Here's a quick guide on how to adjust your filter application:

  • For float values, continue using "valueNumber".
  • For integer values, also use "valueNumber" instead of "valueInt".
  • If you're dealing with strings that represent numeric values, convert them to floats and use "valueNumber".
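As an illustration, the rewrite can be expressed as a small helper that walks a raw Weaviate GraphQL `where` filter and replaces `valueInt` operands with `valueNumber`. This is a hypothetical sketch, not a function in LlamaIndex or the Weaviate client; the `level` property name is also just an example.

```python
# Hypothetical helper: rewrite "valueInt" operands to "valueNumber" so they
# match Weaviate properties whose schema dataType is "number".
def coerce_where_filter(where: dict) -> dict:
    fixed = dict(where)
    if "valueInt" in fixed:
        # Weaviate expects "valueNumber" for properties of type "number".
        fixed["valueNumber"] = float(fixed.pop("valueInt"))
    if "operands" in fixed:
        # Recurse into compound filters (And/Or).
        fixed["operands"] = [coerce_where_filter(op) for op in fixed["operands"]]
    return fixed


broken = {"path": ["level"], "operator": "Equal", "valueInt": 0}
print(coerce_where_filter(broken))
# {'path': ['level'], 'operator': 'Equal', 'valueNumber': 0.0}
```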

This change should align with the data types of the fields in your Weaviate index and the expectations of the LlamaIndex queries. If after making these adjustments you still face issues, consider verifying that your LlamaIndex and Weaviate versions are compatible and up to date. Upgrading to the latest version of LlamaIndex might also introduce fixes and improvements that could help resolve this issue.
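To check which properties in your class were actually created with dataType `number`, you can inspect the schema returned by the v3 client's `client.schema.get()`. The helper and sample schema below are an illustrative sketch only; the property names are not taken from a live instance.

```python
# Sketch: list the properties of a Weaviate class schema whose dataType is
# "number". The sample dict mimics the shape of client.schema.get() output.
def number_properties(schema_class: dict) -> list:
    return [
        prop["name"]
        for prop in schema_class.get("properties", [])
        if "number" in prop.get("dataType", [])
    ]


sample = {
    "class": "RaptorIndex",
    "properties": [
        {"name": "text", "dataType": ["text"]},
        {"name": "level", "dataType": ["number"]},  # hypothetical property
    ],
}
print(number_properties(sample))  # ['level']
```

Any filter targeting a property listed here must use "valueNumber", even when the value you pass is an integer.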

About Dosu: This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

dosubot[bot] · Apr 23 '24 05:04

Where should I make this change? Please provide detailed steps.

I implemented this by following the notebook linked below, but with Weaviate as the vector DB.

https://github.com/run-llama/llama_index/blob/main/llama-index-packs/llama-index-packs-raptor/examples/raptor.ipynb

udayvakalapudi · Apr 23 '24 06:04