
[Bug]: Can't set max batch size

Open superchargez opened this issue 1 year ago • 2 comments

What happened?

I tried adding a large number of documents but hit the limit of 5461, so I tried to change the limit, but nothing happened:

In [16]: client.max_batch_size = 44445

In [17]: client.get_max_batch_size()
Out[17]: 5461

Versions

Windows 11, Python 3.12.4, Chroma version: '0.5.3'

Relevant log output

No output; the batch size limit simply did not update.

superchargez avatar Jun 22 '24 18:06 superchargez

@superchargez, the max batch size in Chroma is a function of the underlying SQLite. Most operating systems ship with a built-in release of sqlite3, and most of the time Chroma relies on it. These pre-built SQLite distributions are compiled with certain limits, on which the max batch size is based, and unfortunately those limits cannot be changed.

Exposing the max batch size is intended to make users aware of the SQLite limits Chroma enforces. The general approach in cases like yours is to split up your large batch. We have provided an example of how to do that here - https://github.com/chroma-core/chroma/blob/main/chromadb/utils/batch_utils.py
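As a rough illustration of the splitting approach (a minimal sketch, not the code in batch_utils.py), the helper below slices parallel lists of ids and documents into chunks no larger than a given limit. The name split_into_batches is hypothetical, and the commented usage assumes a client and collection already exist and that the limit comes from client.get_max_batch_size().

```python
from typing import Iterator, List, Sequence, Tuple


def split_into_batches(
    ids: Sequence[str],
    documents: Sequence[str],
    max_batch_size: int,
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (ids, documents) slices no longer than max_batch_size.

    Hypothetical helper for illustration; not part of Chroma.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    if len(ids) != len(documents):
        raise ValueError("ids and documents must have the same length")
    # Step through the input in strides of max_batch_size.
    for start in range(0, len(ids), max_batch_size):
        end = start + max_batch_size
        yield list(ids[start:end]), list(documents[start:end])


# Assumed usage with an existing Chroma client and collection (not run here):
#   limit = client.get_max_batch_size()
#   for batch_ids, batch_docs in split_into_batches(ids, docs, limit):
#       collection.add(ids=batch_ids, documents=batch_docs)
```

Reading the limit at runtime rather than hard-coding 5461 should keep the loop valid on systems where the bundled SQLite was compiled with different limits.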

tazarov avatar Jun 23 '24 09:06 tazarov

@superchargez, did the above answer your question?

It is also worth noting that we recently replaced the max_batch_size property with a get_max_batch_size() method (#2305) to better convey that this value cannot be set.
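To show why an assignment like the one in the report can appear to succeed yet change nothing, here is a toy sketch. ToyClient is a made-up class, not Chroma's implementation, and it assumes the limit is computed inside the method rather than read from a settable attribute.

```python
class ToyClient:
    """Made-up stand-in for a client; NOT Chroma's actual code."""

    def get_max_batch_size(self) -> int:
        # Assumption: the limit is derived internally (here a constant),
        # so nothing the caller assigns on the instance can influence it.
        return 5461


client = ToyClient()

# Python allows new attributes on ordinary instances, so this line raises
# no error. It only creates an unrelated attribute named max_batch_size.
client.max_batch_size = 44445

# The method ignores that attribute and still reports the original limit.
limit = client.get_max_batch_size()
```

Under these assumptions the assignment is silently accepted, which matches the "no output" behaviour described in the report.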

tazarov avatar Jul 22 '24 18:07 tazarov

@tazarov yes, thank you it answers the question.

superchargez avatar Sep 16 '24 11:09 superchargez

Closing this as it is resolved. Let me know if you'd like to re-open it.

jeffchuber avatar Sep 16 '24 15:09 jeffchuber