langchain
langchain copied to clipboard
Allow use of non-default QDrant vector key
- Context #2594
- This is continuation of #2751 which diverged so much from origin and was better to be re-started.
Fixes #2594
Who can review?
@kacperlukawski
Hello, @kacperlukawski, I'm back on this and I would like to share my thought process:
- Since last time I looked at this file in code-base, a lot has changed, and so I began with re-implementing the first part I started with.
- Then, I began on the other part which I didn't implement which is allowing defining multiple vectors at time of injecting documents into QDrant (
from_textsclass method). - Then, I realised defining
vector_keyat class level preventslangchainusers from using different vectors in collection as all queries will now usevector_keyfrom class, so I movedvector_keytosimilarity_searchmethod. - With that, I replaced
vector_keyfrom class constructor tovectors_config, and used it to check for its truthy value insimilarity_searchin order to raise either of twoValueErrorobjects;QDrantclass was initialised withvectors_configbut value ofvector_keywasn't provided, orvector_keyis provided but it isn't one of the vectors defined invectors_config.
With those changes:
- Users of
QDrantvector store, can now re-use their current collections with the current data, even if it was originally created outside oflangchain, with any of the vectors collection was defined with. - It is possible to create a collection
from_documentswith multiple vectors now. - Original behaviour is preserved.
That said, I'm yet to figure out the effect of this when QDrant class is in chain, as I just made this and tested it as a standalone object, not in a chain.
Am I on a path that is accepted by the project standards?
@kacperlukawski, any chance you had a look at this?
@mahmoudajawad I had an initial look, but I want to cover it with some integration tests. I can hopefully do it over the weekend.
@mahmoudajawad I tried to cover the PR with some integration tests, but unfortunately, I found some issues hard to solve. I'll start a new PR to bring that functionality.
@kacperlukawski, I really would like to contribute this. I just need directions since I don't want to just put my code in the codebase without pre-approval of an acceptable approach.
If you could confirm to me that my approach in general is acceptable, I would re-create this PR again, add necessary integration tests, and submit it all again.
@mahmoudajawad I just started a new PR that exposes just a vector name: #6871. Would appreciate your review on that.
Closing this in favour of https://github.com/hwchase17/langchain/pull/6871