[Bug]: Removal of metadata key

Open tazarov opened this issue 1 year ago • 4 comments

What happened?

mikekk — Yesterday (14-Jan-24) at 9:32 PM
Hello everyone, I am able to add new keys and their associated values to the metadatas, but once I've added a new key to the collection metadatas using collection.update, I can only change its value; I can no longer delete the key from the metadatas. Why does that happen? Is it a bug or intended behaviour? Thanks.

https://discord.com/channels/1073293645303795742/1191351965230313472/1195812744188932136

Versions

Chroma 0.4.22, Colab, MacOS

Relevant log output

import chromadb
client = chromadb.PersistentClient() # persistent (on-disk) client; use chromadb.Client() for an in-memory one, adjust as per your needs
collection = client.get_or_create_collection("mytest")
collection.add(ids=["id1"],documents=["document 1"],metadatas=[{"key_to_keep":1,"key_to_remove":2}])
records = collection.get(ids=["id1"])
print(records["metadatas"][0])
# {'key_to_keep': 1, 'key_to_remove': 2}
del records["metadatas"][0]["key_to_remove"] #remove the unnecessary key
print(records)
# {'ids': ['id1'], 'embeddings': None, 'metadatas': [{'key_to_keep': 1}], 'documents': ['document 1'], 'uris': None, 'data': None}
collection.update(ids=records["ids"],documents=records["documents"], embeddings=records["embeddings"],metadatas=records["metadatas"])
# verify
records1 = collection.get(ids=["id1"])
print(records1["metadatas"][0])
# {'key_to_keep': 1, 'key_to_remove': 2}

tazarov avatar Jan 14 '24 06:01 tazarov

@tazarov If I understand correctly, this happens because updating a record only inserts the supplied metadata and leaves the existing metadata untouched. Does deleting the record's old metadata and then inserting the new metadata sound like a good approach? If so, I can raise a PR.
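
As a rough illustration of the behaviour described here, the following models a record's metadata as a plain dict. It is a sketch only, not Chroma's storage code; the function names merge_only_update and delete_then_insert are hypothetical.

```python
def merge_only_update(stored, incoming):
    """Model of the reported behaviour (assumption): keys are added or
    overwritten, but keys missing from `incoming` are never removed."""
    result = dict(stored)
    result.update(incoming)
    return result


def delete_then_insert(stored, incoming):
    """Model of the proposed approach (assumption): discard the old
    metadata, then store exactly what was supplied."""
    return dict(incoming)


stored = {"key_to_keep": 1, "key_to_remove": 2}
incoming = {"key_to_keep": 1}

# The key omitted from `incoming` survives a merge-only update.
print(merge_only_update(stored, incoming))

# It is gone once the old metadata is dropped first.
print(delete_then_insert(stored, incoming))
```

Under this model, the reproduction in the issue is the merge-only case: the dict passed to collection.update no longer contains key_to_remove, so nothing ever touches the stored copy of that key.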

GauravWaghmare avatar Jan 15 '24 12:01 GauravWaghmare

@GauravWaghmare, take a look at the CIP PR for this - https://github.com/chroma-core/chroma/pull/1636. There is a bit more "nuance" to things.

I'd love your input on the CIP.

tazarov avatar Jan 28 '24 10:01 tazarov

@tazarov Why should the behaviour for metadata update be any different from document update?

GauravWaghmare avatar Jan 31 '24 12:01 GauravWaghmare

@GauravWaghmare, to accommodate different use cases of how people update metadata. Users will update a document in a single overwriting operation. The intent is clear, and they don't necessarily need to know what is in the document to make the update. Metadata differs significantly, depending on the use case. The PR addresses as many use cases as I could think of:

  • Clear the metadata
  • Partial update
  • Full overwrite of metadata

Please elaborate on your thoughts in the PR if you feel the above use cases can somehow be merged into one or if you have a different idea about the implementation.
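
To make the three use cases concrete, here is a minimal plain-dict sketch. It does not reproduce the API proposed in the CIP: the function apply_metadata_update, its mode parameter, and the convention that a None value deletes a key in a partial update are all assumptions made for illustration.

```python
def apply_metadata_update(stored, incoming=None, mode="partial"):
    """Return the metadata that results from applying `incoming` to `stored`.

    Hypothetical modes (not Chroma's API):
      "clear"     - drop all metadata
      "partial"   - merge keys; a None value removes that key (assumed convention)
      "overwrite" - replace the metadata wholesale
    """
    incoming = incoming or {}
    if mode == "clear":
        return {}
    if mode == "overwrite":
        return dict(incoming)
    if mode == "partial":
        result = dict(stored)
        for key, value in incoming.items():
            if value is None:
                result.pop(key, None)  # assumed deletion marker
            else:
                result[key] = value
        return result
    raise ValueError(f"unknown mode: {mode!r}")


stored = {"key_to_keep": 1, "key_to_remove": 2}

cleared = apply_metadata_update(stored, mode="clear")
partial = apply_metadata_update(stored, {"key_to_remove": None, "new_key": 3})
overwritten = apply_metadata_update(stored, {"only_key": 9}, mode="overwrite")
```

The sketch shows why a single overwriting operation is harder to justify for metadata than for documents: a caller doing a partial update does not need to know the other keys, while a caller doing a full overwrite does.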

tazarov avatar Feb 08 '24 17:02 tazarov

Will track in #839

itaismith avatar Jan 07 '25 07:01 itaismith