Allow users to configure `embed_batch_size` or `ThreadPoolExecutor` size when calling `Client.embed`
It looks like batching was added in #437; thank you for implementing this, it's very helpful.
I notice that batching, as implemented there, uses a fixed batch size. This can be suboptimal for clients submitting a large number of small documents: since neither the batch size nor the `ThreadPoolExecutor` size is configurable, there is no way to increase parallelism across many small payloads, and a client can end up blocking while waiting on many small network responses.
Would it be possible to allow clients to configure either the `ThreadPoolExecutor` size or the `embed_batch_size` setting when calling `embed`?
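For concreteness, here is roughly the kind of workload I have in mind (the model name and document count are just for illustration):

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# Many small documents. Today the client splits these into fixed-size
# batches and fans them out to an internal ThreadPoolExecutor whose
# size we cannot control, so throughput is capped for small payloads.
texts = [f"short document {i}" for i in range(10_000)]

response = co.embed(texts=texts, model="embed-english-v2.0")
embeddings = response.embeddings
```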
Hey @acompa, thanks for the feedback! This will be fixed by https://github.com/cohere-ai/cohere-python/pull/536; you'll be able to pass in your own executor. I'll ping the thread when it's released.
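For anyone following along, usage should look roughly like the sketch below once the PR is released. Note that the `thread_pool` keyword is my placeholder; the actual parameter name, and whether it's passed to the client or per call, may differ in the final release.

```python
from concurrent.futures import ThreadPoolExecutor

import cohere

# Hypothetical sketch: supply a custom executor so many small embed
# batches can run in parallel. The `thread_pool` parameter name is a
# placeholder; check the released #536 API for the actual signature.
executor = ThreadPoolExecutor(max_workers=64)
co = cohere.Client("YOUR_API_KEY", thread_pool=executor)

texts = [f"short document {i}" for i in range(10_000)]
response = co.embed(texts=texts, model="embed-english-v2.0")
```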