Vector DB: Add support for Neo4j
Expand Vector DB support to include Neo4j.
The Milvus and Pinecone implementations in embeddings.py can be used as a guide.
Scope includes:
- Updating LLMWareConfig, EmbeddingHandler
- Creating an Embedding class specific to Neo4j
- Test scripts and example code
Hi all,
I'm working on this issue and will submit a pull request soon, roughly within the next week. Can someone please assign me to this issue so no one else wastes time on it? Thank you!
Hi MacOS - appreciate your contribution - BTW, please check the updated EmbeddingHandler and embedding classes - there have been some recent updates in terms of support for other vector DBs - should provide a good template and starting point for Neo4j ... Hope things are progressing well!
Of course! Happy to help out.
I started, but could not finish before Christmas because of a busy week. Depending on how busy the holidays are, I will come back to it either before or after New Year's.
I will open a pull request once it is good enough for a first look.
Just a quick heads up: I started to work on it, and I'm 70-80% done with it.
I have a question.
Neo4j has supported vector indexes since version 5.11. I plan to check which version is running and raise a ValueError if it is older than that, similar to what is done here. Is this ok?
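To make the planned check concrete, here is a minimal sketch of the version comparison. The function names and the constant are hypothetical and not part of llmware. The sketch assumes the server version arrives as a string such as "5.11.0", for example from Neo4j's `CALL dbms.components()` procedure, and it does not talk to a database itself.

```python
import re
from typing import Tuple

# Assumption from this thread: vector indexes are available from Neo4j 5.11 on.
MIN_VECTOR_INDEX_VERSION = (5, 11)


def parse_neo4j_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '5.11.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized Neo4j version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def ensure_vector_index_support(version: str) -> None:
    """Raise ValueError if the given Neo4j version predates vector indexes."""
    # Tuples compare element by element, so (5, 9) < (5, 11) as intended,
    # which a plain string comparison of "5.9" and "5.11" would get wrong.
    if parse_neo4j_version(version) < MIN_VECTOR_INDEX_VERSION:
        raise ValueError(
            f"Neo4j {version} does not support vector indexes; "
            "version 5.11 or newer is required."
        )
```

Comparing parsed integer tuples rather than raw strings avoids the trap where "5.9" sorts after "5.11" lexicographically.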
@MacOS - sorry for the slow reply - yes, totally OK to throw an error if unsupported db - please look at Exceptions.py which defines a set of exceptions by category that you can raise, e.g., UnsupportedDB, etc.
Also, please check the repository again in the next 24-48 hours - we are making some improvements on the collection DB side (e.g., Mongo), which will have some conforming changes and refactoring in the Embedding classes - should be easy to follow - but it will provide a set of common utilities for some of the boilerplate parts of creating a new EmbeddingDB.
Please keep us posted - and let us know if you have any questions or run into any issues!
@doberst No problem :smiley:
Exception
I went through them, but I did not find anything suitable. The one you suggested, UnsupportedEmbeddingDatabaseException, does not seem to fit, since we will support Neo4j - we support the versions that have the vector index feature. Is this really the intended use case for this exception? Judging by the name, it does not seem to fit semantically. You know what I mean?
Checking the repository
I will!
Other issues
I found an issue while testing. In particular, I get TypeError: EmbeddingHandler.delete_index() got an unexpected keyword argument 'model'. This is caused by this line
https://github.com/llmware-ai/llmware/blob/e694e94f3cac31f357089ef4b1897d408917196e/tests/embeddings/test_embeddings.py#L68
According to the implementation of delete_index
https://github.com/llmware-ai/llmware/blob/e694e94f3cac31f357089ef4b1897d408917196e/llmware/embeddings.py#L118
model has to be changed to model_name. Am I correct with this? I'm currently only testing my unit test, and not the other ones. If I got this right, then all other vector DB tests should fail at the same call - delete_index.
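The mismatch can be reproduced in isolation with a small sketch. The class below is a hypothetical stand-in, not the real llmware EmbeddingHandler; its parameter list is an assumption apart from the model_name keyword discussed above, and the argument values are placeholders.

```python
class EmbeddingHandlerSketch:
    """Hypothetical stand-in for EmbeddingHandler, for illustration only."""

    def delete_index(self, embedding_db, model_name):
        # The real method would drop the index; here we only report the call.
        return f"deleted index for {model_name} in {embedding_db}"


handler = EmbeddingHandlerSketch()


def call_with_old_keyword():
    # Mirrors the failing test call: 'model' is not a parameter of delete_index,
    # so Python raises TypeError before the method body runs.
    return handler.delete_index(embedding_db="neo4j", model="example-model")


def call_with_fixed_keyword():
    # Renaming the keyword to 'model_name' matches the method signature.
    return handler.delete_index(embedding_db="neo4j", model_name="example-model")
```

Because the error is raised while binding arguments, any test that passes model= to a method declared with model_name would fail the same way, regardless of which vector DB is behind it.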
Progress
I'm currently 80-90% done. The examples and the Docker setup for testing are still missing. The Docker setup is, by the way, also missing from the to-do list here, and from #17.
Hi guys,
I just opened a pull request, #310. Please have a look. I should address everything @JessBerl listed, plus the Docker image for Neo4j. However, this is not 100% done yet because I'm unsure how you want to handle that.
@MacOS - GREAT JOB - BIG LIFT .... Thank you. 🙏 🙏 🙏 ... Your code: 🔥 🔥 🔥 .... Keep the good stuff coming!
Thank you, @doberst! Happy to help out, and contribute again. I already offered my help for #17.