chroma
                                
[Bug]: Problem with data relevance when working with the database in different processes
What happened?
There is a web application that uses langchain and chroma. When a script that deletes some data from the database runs in parallel with this application, the deleted data remains partially visible in the main application, most noticeably in langchain's max_marginal_relevance_search_by_vector function, which runs its query through chroma's .query method. Interestingly, only the ids and embeddings of the deleted records are returned; their documents and metadatas are None.
The result of such a query looks something like this:
result = {
  "ids":  [['9232', '9133', '9392', '9132', '9233', '9037', '9006', '9394', '9134', '9236', '9234', '9395', '9131', '9007', '9396', '9393', '8952', '8954', '8953', '9235']],
  "documents": [[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, 'document #17', 'document #18', 'document #19', None]],
  "embeddings": [[0.0073, ...], [-0.0076, ...], [0.0086, ...], [-0.0077, ...], [-0.0007, ...], [-0.0008, ...], [0.0081, ...], [0.0047, ...], [-0.0078, ...], [0.0032, ...], [0.0028, ...], [0.0040, ...], [-0.0113, ...], [0.0113, ...], [0.0016, ...], [0.0088, ...], [0.0001, ...], [0.0025, ...], [-0.0012, ...], [0.0050, ...]],
  "metadatas": [[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, {"url": "https://example.com/"}, {"url": "https://example.com/"}, {"url": "https://example.com/"}, None]],
}
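To make the pattern in the result above easier to see, here is a minimal sketch that splits the returned ids into live and stale ones. It is only a reader-side diagnostic, not a fix for the underlying problem. The helper name `split_stale_hits` is hypothetical, and treating "document and metadata are both None" as the sign of a deleted record is an assumption: a record that was legitimately stored without a document or metadata would look the same.

```python
from typing import Any


def split_stale_hits(result: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Split the ids of the first query into (live, stale).

    Assumption: a hit whose document AND metadata are both None belongs to
    a record that was deleted by another process.
    """
    ids = result["ids"][0]
    documents = result["documents"][0]
    metadatas = result["metadatas"][0]

    live: list[str] = []
    stale: list[str] = []
    for record_id, document, metadata in zip(ids, documents, metadatas):
        if document is None and metadata is None:
            stale.append(record_id)
        else:
            live.append(record_id)
    return live, stale


# Shortened version of the result shown in the report (embeddings omitted).
sample = {
    "ids": [["9232", "8952", "8954", "9235"]],
    "documents": [[None, "document #17", "document #18", None]],
    "metadatas": [[
        None,
        {"url": "https://example.com/"},
        {"url": "https://example.com/"},
        None,
    ]],
}

live_ids, stale_ids = split_stale_hits(sample)
print("live:", live_ids)
print("stale:", stale_ids)
```

Run against the full result, the stale list would contain 17 of the 20 returned ids, which shows how much of the top-k is taken up by records that should no longer exist.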
Versions
Chroma v0.4.24, Python 3.12.1, Windows 11
Relevant log output
No response
@yegor-matveyas, do you have the application code that deletes the records from Chroma? Can you share it? Is there any chance that the application only does partial deletes?