[Bug]: weaviate client has no collection attributes

Open arsyad2281 opened this issue 1 year ago • 7 comments

Bug Description

I am initializing WeaviateVectorStore based on the example given in this link

However, I am getting this error: AttributeError: 'Client' object has no attribute 'collections'

Python Version - 3.11.9

Versions of llama-index and relevant packages: llama-index==0.10.37, llama-index-vector-stores-weaviate==1.0.0

Version

0.10.37

Steps to Reproduce

Please replace <your-username>, <your-password> and <your-index-name> accordingly. WeaviateDB is locally hosted through docker with this image: semitechnologies/weaviate:1.23.9

from llama_index.vector_stores.weaviate import WeaviateVectorStore
import weaviate

weaviate_url = "http://localhost:8080"

# Placeholders are quoted so the snippet parses; substitute real values.
resource_owner_config = weaviate.AuthClientPassword(
    username="<your-username>",
    password="<your-password>",
)

client = weaviate.Client(
    weaviate_url,
    auth_client_secret=resource_owner_config,
)

# This call raises the AttributeError shown in the traceback.
vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="<your-index-name>"
)

Relevant Logs/Tracebacks

AttributeError                            Traceback (most recent call last)
Cell In[22], line 1
----> 1 vector_store = WeaviateVectorStore(
      2     weaviate_client=client, index_name="Ccassistant"
      3 )

File ~/homebrew/Caskroom/miniconda/base/envs/test_env/lib/python3.11/site-packages/llama_index/vector_stores/weaviate/base.py:172, in WeaviateVectorStore.__init__(self, weaviate_client, class_prefix, index_name, text_key, auth_config, client_kwargs, url, **kwargs)
    167     raise ValueError(
    168         "Index name must start with a capital letter, e.g. 'LlamaIndex'"
    169     )
    171 # create default schema if does not exist
--> 172 if not class_schema_exists(self._client, index_name):
    173     create_default_schema(self._client, index_name)
    175 super().__init__(
    176     url=url,
    177     index_name=index_name,
   (...)
    180     client_kwargs=client_kwargs or {},
    181 )

File ~/homebrew/Caskroom/miniconda/base/envs/test_env/lib/python3.11/site-packages/llama_index/vector_stores/weaviate/utils.py:76, in class_schema_exists(client, class_name)
     74 """Check if class schema exists."""
     75 validate_client(client)
---> 76 return client.collections.exists(class_name)

AttributeError: 'Client' object has no attribute 'collections'

arsyad2281 avatar May 21 '24 08:05 arsyad2281

The error AttributeError: 'Client' object has no attribute 'collections' occurs because the weaviate.Client object does not have a collections attribute. The correct attribute to use for schema-related operations is schema.

To resolve this issue, update the code to use the schema attribute instead of collections. Here is the corrected code snippet for initializing the WeaviateVectorStore and checking the class schema:

  1. Update the class_schema_exists function to use the schema attribute:
def class_schema_exists(client: weaviate.Client, class_name: str) -> bool:
    """Check if class schema exists."""
    validate_client(client)
    return client.schema.exists(class_name)
  2. Ensure that the delete_index method in the WeaviateVectorStore class uses the schema attribute:
def delete_index(self) -> None:
    """Delete the index associated with the client.

    Raises:
    - Exception: If the deletion fails, for some reason.
    """
    if not class_schema_exists(self._client, self.index_name):
        _logger.warning(
            f"Index '{self.index_name}' does not exist. No action taken."
        )
        return
    try:
        self._client.schema.delete_class(self.index_name)
        _logger.info(f"Successfully deleted index '{self.index_name}'.")
    except Exception as e:
        _logger.error(f"Failed to delete index '{self.index_name}': {e}")
        raise Exception(f"Failed to delete index '{self.index_name}': {e}")
  3. Ensure that your initialization code follows the correct structure:
from llama_index.vector_stores.weaviate import WeaviateVectorStore
import weaviate

weaviate_url = "http://localhost:8080"

resource_owner_config = weaviate.AuthClientPassword(
    username="<your-username>",
    password="<your-password>",
)

client = weaviate.Client(
    weaviate_url,
    auth_client_secret=resource_owner_config,
)

vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="<your-index-name>"
)

Additionally, ensure that you are using the correct versions of the dependencies:

  • llama-index-core version ^0.10.1
  • weaviate-client version ^4.5.7

You can install the required packages using:

pip install llama-index-vector-stores-weaviate weaviate-client

This should resolve the issue and allow you to use the WeaviateVectorStore as intended.
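
A quick way to see which major versions are actually installed is sketched below. It uses only the standard library; parse_major and installed_major_version are hypothetical helper names, not part of LlamaIndex or Weaviate. The installed version alone may not settle the question: a later comment in this thread reports the same error with weaviate-client 4.6.3, because the client object was still created through weaviate.Client.

```python
from importlib import metadata
from typing import Optional


def parse_major(version: str) -> int:
    """Return the leading numeric component of a version string like '4.6.3'."""
    return int(version.split(".")[0])


def installed_major_version(package: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None if absent."""
    try:
        return parse_major(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report what is installed in the current environment.
    for name in ("weaviate-client", "llama-index-vector-stores-weaviate"):
        print(name, installed_major_version(name))
```

A major version below 4 for weaviate-client would rule out the collections-based API; a value of 4 or higher only means that API is available, not that the client in use was created with it.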



dosubot[bot] avatar May 21 '24 08:05 dosubot[bot]

Do note that collections is also accessed in this function (link), where collection is expected to be an object at L328:

def query(self, query: VectorStoreQuery, **kwargs: Any) -> VectorStoreQueryResult:
        """Query index for top k most similar nodes."""
        all_properties = get_all_properties(self._client, self.index_name)
        collection = self._client.collections.get(self.index_name)
        filters = None
        ...
        ...
        # execute query
        try:
            query_result = collection.query.hybrid(
                query=query.query_str,
                vector=vector,
                alpha=alpha,
                limit=limit,
                filters=filters,
                return_metadata=return_metatada,
                return_properties=all_properties,
                include_vector=True,
            )
        except weaviate.exceptions.WeaviateQueryError as e:
            raise ValueError(f"Invalid query, got errors: {e.message}")
        ...
        ...

arsyad2281 avatar May 21 '24 09:05 arsyad2281

I'm also hitting this with dependencies: llama-index-core=0.10.38.post1, llama-index-vector-stores-weaviate=1.0.0, weaviate-client=4.6.3.

chrisk314 avatar May 22 '24 11:05 chrisk314

Looks like this is a user error, apologies. I was attempting to use a weaviate v3 client (weaviate.Client) where LlamaIndex is expecting a weaviate v4 client (weaviate.WeaviateClient). Using the code below to create a client for a local weaviate instance works fine. I'd suggest this can be closed.

from llama_index.vector_stores.weaviate import WeaviateVectorStore
import weaviate

client = weaviate.connect_to_local()
vector_store = WeaviateVectorStore(weaviate_client=client)

chrisk314 avatar May 23 '24 18:05 chrisk314

@arsyad2281 from your issue description, it looks like you made the same mistake I did. Change your code as per my comment above and you should be good.

chrisk314 avatar May 23 '24 18:05 chrisk314

On closer inspection I realised that the WeaviateVectorStore.from_params method was creating a Weaviate V3 client, whereas the rest of the code expects a Weaviate V4 client. I made a PR which changes all the code to require and use Weaviate V4 clients. This makes the updated code consistent with the rest of the existing code and future-proofs it by migrating to Weaviate V4 now.

chrisk314 avatar May 24 '24 10:05 chrisk314

I just ran into a similar issue, where I try to ingest modified files into my Weaviate database, so basically doing an upsert.

I think it's caused by the base.py code using mixed V3 and V4 code too.

The error I get:

Traceback (most recent call last):
  File "/foo/bar.py", line 56, in <module>
    nodes = pipeline.run(documents=documents)
  File "/home/krisz/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 682, in run
    nodes_to_run = self._handle_upserts(
  File "/home/krisz/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 612, in _handle_upserts
    self.vector_store.delete(ref_doc_id)
  File "/home/krisz/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llama_index/vector_stores/weaviate/base.py", line 261, in delete
    self._client.query.get(self.index_name)
AttributeError: 'WeaviateClient' object has no attribute 'query'

The relevant part in base.py:

query = (
    self._client.query.get(self.index_name)
    .with_additional(["id"])
    .with_where(where_filter)
    .with_limit(10000)  # 10,000 is the max weaviate can fetch
)

The _client.query doesn't exist in V4 anymore, I think.

krisz094 avatar May 24 '24 12:05 krisz094

@krisz094 thanks for pointing that out. I've pushed a commit dc8f15e to #13719 which fixes the WeaviateVectorStore.delete method. Just tested manually and it's working now. I didn't test it out with additional delete_kwargs["filters"] yet.

Weaviate and LlamaIndex code bases are both very new to me; I've based my fix for the delete method off code that's in the query method and what I can piece together from the Weaviate code and docs. I'm pushed for time right now and it's a long weekend in the UK. I'll be able to look more in depth at the rest of the WeaviateVectorStore code on Tuesday and test further.

chrisk314 avatar May 24 '24 17:05 chrisk314

Just following up on this... I've now manually tested all public methods of the WeaviateVectorStore with the updates in the PR and all looks good to me.

Any timeline for a review from any of the core contributors?

chrisk314 avatar May 28 '24 13:05 chrisk314

#13365 should fix delete method @chrisk314

brenkehoe avatar Jun 03 '24 09:06 brenkehoe

@brenkehoe ok thanks for pointing that out. I've now merged the latest changes from main into #13719 taking the delete implementation from main.

chrisk314 avatar Jun 03 '24 10:06 chrisk314