[Bug]: Adding records with same id accidentally leads to log messages on every subsequent operation

Open vikas2131 opened this issue 9 months ago • 2 comments

What happened?

I was inserting data chunks through my custom code, which updates parameters such as the start ID and metadata every time I add new data. In one instance I forgot to change the start ID, so I ended up adding data under IDs that already existed in the chromadb persistent client.

Now, I know that the existing data was not overwritten, and I just need to re-add that data with the correct new start ID. But on every subsequent operation, the same warning messages still pop up.

Add of existing embedding ID: 1241
Add of existing embedding ID: 1242
Add of existing embedding ID: 1243
Add of existing embedding ID: 1245
Add of existing embedding ID: 1244
Add of existing embedding ID: 1245
Insert of existing embedding ID: 1241
Insert of existing embedding ID: 1242
Insert of existing embedding ID: 1243
Insert of existing embedding ID: 1244
Insert of existing embedding ID: 1245

I searched for a solution and figured out that Chromadb stores all operations in chroma.sqlite under embedding_queue.
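For anyone who wants to look at that table before changing anything, here is a minimal read-only sketch using Python's standard `sqlite3` module. The helper name `preview_table` is hypothetical, and the file path and table name are passed in rather than hard-coded, since the exact names in a given Chroma version are an assumption here. Opening with `mode=ro` means the connection cannot modify the file.

```python
import sqlite3
from typing import List, Tuple


def preview_table(db_path: str, table: str, limit: int = 20) -> Tuple[List[str], List[tuple]]:
    """Return (column names, up to `limit` rows) of `table`, opened read-only.

    `preview_table` is an illustrative helper, not part of chromadb.
    """
    # mode=ro makes SQLite refuse any write through this connection.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        # Confirm the table exists; this also guards the name interpolated below.
        known = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        if table not in known:
            raise ValueError(f"no table named {table!r}; found {sorted(known)}")
        cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,))
        columns = [description[0] for description in cursor.description]
        return columns, cursor.fetchall()
    finally:
        conn.close()
```

Calling it with the path to the database file and the queue table name returns the column names and the first few rows, which is enough to see which queued operations refer to the duplicated IDs without touching them.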

To get rid of the log warnings, I manually deleted the embedding_queue entries for the duplicate-ID operations I had executed through my code. That worked and the warnings stopped. I believe this is a bug.

But I want to know: will manually changing these entries in embedding_queue affect my db? I was careful while doing these deletions.

My chromadb version is 0.6.3, using a persistent instance. I used the DB Browser for SQLite application to edit chroma.sqlite. My OS is macOS Sonoma 14.4.1.

Versions

Chromadb: 0.6.3, python: 3.9.21, macOS: sonoma 14.4.1

Relevant log output


vikas2131 avatar Mar 18 '25 08:03 vikas2131

@vikas2131, I think this is a duplicate of #4076. Can you take a look at the discussion here and let me know if it explains how Chroma works: https://github.com/chroma-core/chroma/issues/4076#issuecomment-2763367067

May I suggest you try upsert() so that 1) you don't see these warnings, 2) your end result overrides the existing doc if it exists, or creates a new one if it doesn't, 3) you don't have to fiddle around with Chroma internals to get this to work.
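A minimal sketch of what that could look like, assuming the chunks are keyed by sequential integer IDs as in the report. The helper `upsert_chunks` is hypothetical and not part of chromadb; it only relies on the collection exposing `upsert(ids=..., documents=..., metadatas=...)`.

```python
from typing import Any, Dict, List, Optional, Sequence


def upsert_chunks(
    collection: Any,
    start_id: int,
    documents: Sequence[str],
    metadatas: Optional[Sequence[Dict[str, Any]]] = None,
) -> List[str]:
    """Write documents under sequential string IDs starting at `start_id`.

    Uses upsert, so an ID that already exists is overwritten instead of
    being rejected with an "existing embedding ID" warning.
    """
    ids = [str(start_id + offset) for offset in range(len(documents))]
    kwargs: Dict[str, Any] = {"ids": ids, "documents": list(documents)}
    if metadatas is not None:
        kwargs["metadatas"] = list(metadatas)
    collection.upsert(**kwargs)
    return ids


# Usage with a persistent client (path and collection name are placeholders):
#   import chromadb
#   client = chromadb.PersistentClient(path="./chroma_data")
#   collection = client.get_or_create_collection("chunks")
#   upsert_chunks(collection, 1241, ["first chunk", "second chunk"])
```

Note that with upsert a forgotten start ID silently replaces the earlier records, so this trades the warning for an overwrite; whether that is acceptable depends on the ingestion code.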

tazarov avatar Apr 01 '25 13:04 tazarov

@vikas2131 did you get to the bottom of this? was @tazarov's comment helpful here?

jeffchuber avatar Apr 16 '25 00:04 jeffchuber

closing now since it was addressed here https://github.com/chroma-core/chroma/issues/4076

jairad26 avatar Jun 30 '25 17:06 jairad26