rig
refactor: Add generic type to `VectorStoreIndex` trait
- [x] I have looked for existing issues (including closed) about this
Feature Request
Refactor the `VectorStoreIndex` trait to add a generic type representing the type of the documents stored in the store. This would remove the generic type parameter from the `top_n` method.
Motivation
The goal of this change is to improve the developer experience when working with vector stores. Specifically, it solves the problem where developers have to specify the document type associated with a vector store twice. For instance, `InMemoryVectorStore` is itself already parametrized by a generic type `D`, yet the type `T` of its `top_n` implementation cannot be inferred, even though it should in fact be the same as the store's type `D`. A similar situation occurs with `MongoDbVectorStore`, which takes a `Collection<T>` as a constructor argument; this implies that the return type of the `top_n` method is also `T`, but currently you have to define it twice.
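To make the duplication concrete, here is a minimal, std-only sketch contrasting a method-level generic with a trait-level generic. Everything in it is hypothetical and simplified: `ToyStore`, `CurrentStyleIndex`, `ProposedStyleIndex`, and the method names are stand-ins rather than rig's actual types, and a `Display`/`FromStr` round trip stands in for serialization and deserialization.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Hypothetical stand-in for a store that is already parametrized by `D`.
struct ToyStore<D> {
    docs: Vec<D>,
}

/// Method-level generic (the current shape): the caller chooses `T`.
trait CurrentStyleIndex {
    fn top_n_current<T: FromStr>(&self, n: usize) -> Vec<T>;
}

/// Trait-level generic (the proposed shape): the implementation fixes `T`.
trait ProposedStyleIndex<T> {
    fn top_n_proposed(&self, n: usize) -> Vec<T>;
}

impl<D: Display> CurrentStyleIndex for ToyStore<D> {
    fn top_n_current<T: FromStr>(&self, n: usize) -> Vec<T> {
        // Round-trip through a string; entries that fail to parse are dropped.
        self.docs
            .iter()
            .take(n)
            .filter_map(|d| d.to_string().parse::<T>().ok())
            .collect()
    }
}

impl<D: Clone> ProposedStyleIndex<D> for ToyStore<D> {
    fn top_n_proposed(&self, n: usize) -> Vec<D> {
        self.docs.iter().take(n).cloned().collect()
    }
}

fn demo() -> (Vec<u32>, Vec<u32>) {
    let store = ToyStore { docs: vec![10u32, 20, 30] };
    // Nothing in the signature links `T` to `D`, so `T` has to be stated
    // somewhere by the caller (here with a turbofish).
    let current = store.top_n_current::<u32>(2);
    // The only matching impl is `ProposedStyleIndex<u32>`, so `T` is inferred.
    let proposed = store.top_n_proposed(2);
    (current, proposed)
}
```

In this sketch the second call needs no type annotation because the document type is pinned by the `impl` block, which is the ergonomic gain the proposal is after.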
Proposal
Refactor the `VectorStoreIndex` trait like so:
```rust
pub trait VectorStoreIndex<T: for<'a> Deserialize<'a>>: Send + Sync {
    /// Get the top n documents based on the distance to the given query.
    /// The result is a list of tuples of the form (score, id, document)
    fn top_n(
        &self,
        query: &str,
        n: usize,
    ) -> impl std::future::Future<Output = Result<Vec<(f64, String, T)>, VectorStoreError>> + Send;

    /// Same as `top_n` but returns the document ids only.
    fn top_n_ids(
        &self,
        query: &str,
        n: usize,
    ) -> impl std::future::Future<Output = Result<Vec<(f64, String)>, VectorStoreError>> + Send;
}
```
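As a rough illustration of how a store might implement a trait of this shape, the following std-only sketch mirrors the proposed signatures under several assumptions: the `Deserialize` bound is dropped so the example needs no external crates, `ToyError` replaces `VectorStoreError`, `ToyStore` and `toy_score` are hypothetical (a substring match stands in for embedding distance), and `block_on` is a tiny busy-polling helper used only to drive futures that never actually wait.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical error type standing in for `VectorStoreError`.
#[derive(Debug)]
struct ToyError;

/// Same shape as the proposed trait, minus the `Deserialize` bound.
trait VectorStoreIndex<T>: Send + Sync {
    fn top_n(
        &self,
        query: &str,
        n: usize,
    ) -> impl Future<Output = Result<Vec<(f64, String, T)>, ToyError>> + Send;

    fn top_n_ids(
        &self,
        query: &str,
        n: usize,
    ) -> impl Future<Output = Result<Vec<(f64, String)>, ToyError>> + Send;
}

/// Hypothetical store holding `(id, document)` pairs.
struct ToyStore<D> {
    docs: Vec<(String, D)>,
}

/// Placeholder score: 1.0 if the id contains the query, else 0.0.
fn toy_score(query: &str, id: &str) -> f64 {
    if id.contains(query) { 1.0 } else { 0.0 }
}

impl<D: Clone + Send + Sync> VectorStoreIndex<D> for ToyStore<D> {
    async fn top_n(&self, query: &str, n: usize) -> Result<Vec<(f64, String, D)>, ToyError> {
        let mut hits: Vec<(f64, String, D)> = self
            .docs
            .iter()
            .map(|(id, doc)| (toy_score(query, id), id.clone(), doc.clone()))
            .collect();
        // Highest score first; the sort is stable, so ties keep insertion order.
        hits.sort_by(|a, b| b.0.total_cmp(&a.0));
        hits.truncate(n);
        Ok(hits)
    }

    async fn top_n_ids(&self, query: &str, n: usize) -> Result<Vec<(f64, String)>, ToyError> {
        let hits = self.top_n(query, n).await?;
        Ok(hits.into_iter().map(|(score, id, _)| (score, id)).collect())
    }
}

/// Waker that does nothing; enough for futures that are immediately ready.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls in a loop until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn demo() -> Vec<(f64, String, u32)> {
    let store = ToyStore {
        docs: vec![
            ("apple".to_string(), 1u32),
            ("banana".to_string(), 2),
            ("pineapple".to_string(), 3),
        ],
    };
    // No turbofish needed: the document type comes from `ToyStore<u32>`.
    block_on(store.top_n("apple", 2)).unwrap()
}
```

A real implementation would presumably run on an async runtime and compute scores from embeddings; the point here is only that the document type flows from the store to the result without being restated at the call site.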