[ENH] Add python & js client support to query on subset of IDs

Open jairad26 opened this issue 6 months ago • 6 comments

Description of changes

This PR adds support in the Python client and the async Python client for restricting a query to a filtered set of IDs.

Example:

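import chromadb

# Setup assumed for illustration; the collection name is hypothetical.
client = chromadb.Client()
coll = client.get_or_create_collection("test_filter_id")

ids = ["1", "2", "3"]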
documents = ["test", "test2", "apple"]
metadatas = [{"source": "test"}, {"source": "test2"}, {"source": "apple"}]

coll.add(
    ids=ids,
    documents=documents,
    metadatas=metadatas,
    # embeddings=numpy_embeddings
)

output = coll.query(
    ids=["1", "3"],
    query_texts=["test"],
    n_results=3,
    include=["documents", "metadatas", "distances"]
)

print(output)

This will output:

% python test_filter_id.py
{'ids': [['1', '3']], 'embeddings': None, 'documents': [['test', 'apple']], 'uris': None, 'included': ['documents', 'metadatas', 'distances'], 'data': None, 'metadatas': [[{'source': 'test'}, {'source': 'apple'}]], 'distances': [[0.0, 0.7396076321601868]]}
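Note that the output above contains only two results even though n_results=3, because only two IDs were allowed. The intended semantics can be sketched in plain Python as "restrict the candidates to the given IDs, then rank by distance". This is an illustration only, not Chroma's implementation: `query_subset`, the toy embeddings, and the use of squared Euclidean distance are all assumptions made for the example.

```python
def query_subset(records, query_embedding, ids=None, n_results=10):
    """Brute-force nearest-neighbour search restricted to `ids` (hypothetical helper).

    records: dict mapping id -> embedding (list of floats).
    ids: optional list of ids to search within; None means search every record.
    """
    # Only ids that were requested AND exist in the store are candidates.
    allowed = set(records) if ids is None else set(ids) & set(records)

    scored = []
    for record_id in allowed:
        embedding = records[record_id]
        # Squared Euclidean distance; chosen here purely for illustration.
        dist = sum((a - b) ** 2 for a, b in zip(embedding, query_embedding))
        scored.append((dist, record_id))

    # Rank by distance and keep at most n_results entries.
    scored.sort()
    top = scored[:n_results]
    return {"ids": [rid for _, rid in top], "distances": [d for d, _ in top]}


# Toy data: record "2" is close to the query but is excluded by the id filter.
records = {"1": [0.0, 0.0], "2": [0.1, 0.0], "3": [1.0, 1.0]}
result = query_subset(records, [0.0, 0.0], ids=["1", "3"], n_results=3)
print(result)
```

Here record "2" would rank second in an unfiltered query, but it never appears once the search is limited to IDs "1" and "3".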

Test plan

How are these changes tested?

  • [x] Tests pass locally with pytest for python, yarn test for js, cargo test for rust
  • Added property tests that exercise ID filtering both on its own and combined with other filters
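The PR's actual property tests are not reproduced here. As a rough sketch of the kind of invariants such tests might check, the randomized check below uses only the standard library. `filtered_query` is a hypothetical stand-in for a client call, and the three invariants (results stay inside the requested IDs, result count is capped, results are ordered by distance) are assumptions about what is worth asserting. Combination with other filters is not covered.

```python
import random


def filtered_query(distances, ids, n_results):
    # Hypothetical stand-in for a client query: `distances` maps id -> distance.
    candidates = [(distances[i], i) for i in set(ids) if i in distances]
    return [i for _, i in sorted(candidates)[:n_results]]


def check_invariants(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        # Random store of 1 to 20 records with random distances to the query.
        all_ids = [str(i) for i in range(rng.randint(1, 20))]
        distances = {i: rng.random() for i in all_ids}
        # Random (possibly empty) subset of ids to filter on.
        subset = rng.sample(all_ids, rng.randint(0, len(all_ids)))
        n_results = rng.randint(1, 25)

        got = filtered_query(distances, subset, n_results)

        # 1. Every returned id was in the requested subset.
        assert set(got) <= set(subset)
        # 2. The result is capped by both n_results and the subset size.
        assert len(got) == min(n_results, len(subset))
        # 3. Results come back ordered by increasing distance.
        assert [distances[i] for i in got] == sorted(distances[i] for i in got)
    return trials


checked = check_invariants()
print(f"{checked} randomized trials passed")
```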

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

jairad26 · Apr 09 '25 21:04