.Net: Python: API for listing/filtering records without similarity search
Describe the bug
IVectorStoreRecordCollection does not have a method to list all the keys stored in the collection. As a result, there is no way to compare the contents of the collection with the source data and find the keys that should be deleted because they are no longer in the source.
To Reproduce
Steps to reproduce the behavior:
- Go to https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Connectors/VectorData.Abstractions/VectorStorage/IVectorStoreRecordCollection.cs
- Observe that there are methods to get by key, delete by key, upsert by key, but no method to list all keys
Expected behavior
We would expect something like IVectorStoreRecordCollection.ListAsync, which would let us find which keys need to be deleted.
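To illustrate the use case, here is a minimal Python sketch of the cleanup that a listing method would enable. Everything in it is an assumption made for illustration: `InMemoryCollection`, `list_keys`, and `delete_batch` are hypothetical stand-ins, not existing Semantic Kernel APIs.

```python
import asyncio
from typing import AsyncIterator


class InMemoryCollection:
    """Hypothetical stand-in for a vector store record collection."""

    def __init__(self, records: dict[str, dict]) -> None:
        self._records = dict(records)

    async def list_keys(self) -> AsyncIterator[str]:
        # Hypothetical listing method: this is the capability the issue asks for.
        for key in list(self._records):
            yield key

    async def delete_batch(self, keys: list[str]) -> None:
        # Hypothetical batch delete by key.
        for key in keys:
            self._records.pop(key, None)


async def delete_stale_keys(collection: InMemoryCollection, source_keys: set[str]) -> list[str]:
    """Delete every stored key that is no longer present in the source data."""
    stored = {key async for key in collection.list_keys()}
    stale = sorted(stored - source_keys)
    await collection.delete_batch(stale)
    return stale


if __name__ == "__main__":
    collection = InMemoryCollection({"a": {}, "b": {}, "c": {}})
    print(asyncio.run(delete_stale_keys(collection, {"a", "c"})))
```

The set difference between stored keys and source keys is the piece that cannot be computed today, because the stored keys cannot be enumerated without a similarity search.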
Screenshots N/A
Platform
- OS: all
- IDE: N/A
- Language: C#
- Source: main branch of repository
Additional context N/A
We should also consider letting callers control the sort order as part of this. To page reliably through an entire dataset, it helps to sort on a field that will not change; otherwise a record can move between pages during iteration and be missed. Not all vector DBs may support this, so some analysis is required to validate whether it is feasible.
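A rough sketch of the paging concern, assuming a store that can sort on an immutable field and filter on it. `fetch_page` is a hypothetical stand-in for a server-side query, and the field names `id` and `score` are invented for the example.

```python
from typing import Iterator, Optional


def fetch_page(records: list[dict], after_id: Optional[int], page_size: int) -> list[dict]:
    """Hypothetical page query: sort by the immutable 'id' and return records after the cursor."""
    ordered = sorted(records, key=lambda r: r["id"])
    if after_id is not None:
        ordered = [r for r in ordered if r["id"] > after_id]
    return ordered[:page_size]


def iterate_all(records: list[dict], page_size: int = 2) -> Iterator[dict]:
    """Walk the whole dataset page by page, using the last seen 'id' as the cursor."""
    cursor: Optional[int] = None
    while True:
        page = fetch_page(records, cursor, page_size)
        if not page:
            return
        yield from page
        cursor = page[-1]["id"]


if __name__ == "__main__":
    data = [{"id": i, "score": 0} for i in (3, 1, 2, 5, 4)]
    for record in iterate_all(data):
        record["score"] += 10  # mutable fields may change mid-iteration
        print(record["id"])
```

Because the cursor is based on a field that never changes, updates to other fields between page requests do not cause records to be skipped or returned twice. Sorting on a mutable field such as `score` would not give that guarantee.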
It seems we also have #10295 tracking the same thing. I propose we use this issue to track the Python side of the feature and #10295 to track the .NET side.
We also already have this for Python: #9911
@eavanvalkenburg ah, so maybe this can be closed?