Allow users to configure `embed_batch_size` or `ThreadPoolExecutor` size when calling `Client.embed`
It looks like batching was added in #437; thank you for implementing this, it's very helpful.
I notice that batching, as implemented there, uses a fixed batch size. This can be suboptimal for clients submitting a large number of small documents: since neither the batch size nor the `ThreadPoolExecutor` size is configurable, there is no way to increase parallelism across many small payloads, and a client can end up blocking while waiting on many small network responses.
Would it be possible to allow clients to configure either the `ThreadPoolExecutor` size or the `embed_batch_size` setting when calling `embed`?
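For concreteness, here is roughly the kind of workload I have in mind (the model name and document count are just for illustration):

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# Many small documents. Today the client splits these into fixed-size
# batches and fans them out to an internal ThreadPoolExecutor whose
# size we cannot control, so throughput is capped for small payloads.
texts = [f"short document {i}" for i in range(10_000)]

response = co.embed(texts=texts, model="embed-english-v2.0")
embeddings = response.embeddings
```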
Hey @acompa, thanks for the feedback! This will be fixed by https://github.com/cohere-ai/cohere-python/pull/536; you'll be able to pass in your own executor. I'll ping the thread when it's released.
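For anyone following along, usage should look roughly like the sketch below once the PR is released. Note that the `thread_pool` keyword is my placeholder; the actual parameter name, and whether it's passed to the client or per call, may differ in the final release.

```python
from concurrent.futures import ThreadPoolExecutor

import cohere

# Hypothetical sketch: supply a custom executor so many small embed
# batches can run in parallel. The `thread_pool` parameter name is a
# placeholder; check the released #536 API for the actual signature.
executor = ThreadPoolExecutor(max_workers=64)
co = cohere.Client("YOUR_API_KEY", thread_pool=executor)

texts = [f"short document {i}" for i in range(10_000)]
response = co.embed(texts=texts, model="embed-english-v2.0")
```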