elasticsearch-js

Bulk helper: Support streaming use/infinite streams or generators

Open spinscale opened this issue 8 months ago • 2 comments

🚀 Feature Proposal

A common use case for apps is an infinite consumer that passes data over to Elasticsearch via bulk requests. Having been a Java client user for many years, I thought all clients operated the same way and supported this. However, Martijn corrected my assumption in the forum: there is a distinction between push-based and pull-based bulk ingestion helpers in the various clients.

My basic idea would be to add support for endless ingestion to the bulk helper (or maybe it already works and just requires documentation updates). This way I could use something like queueable to keep adding data that then gets consumed by the bulk helper.
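To make the push-to-pull idea concrete, here is a minimal sketch of a queue that producers push into and that a consumer can drain with `for await`. `PushQueue` is a hypothetical name used only for illustration; it is not part of elasticsearch-js and is not the queueable API.

```typescript
// Hypothetical helper (not part of elasticsearch-js): a push-based queue
// exposed as an async iterable, so a pull-based consumer can drain it.
class PushQueue<T> implements AsyncIterable<T> {
  private items: T[] = [];
  private waiters: Array<(result: IteratorResult<T>) => void> = [];
  private closed = false;

  // Producer side: hand the item to a waiting consumer, or buffer it.
  push(item: T): void {
    if (this.closed) throw new Error("queue is closed");
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false });
    else this.items.push(item);
  }

  // Signal the end of the stream; buffered items are still delivered.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.items.length > 0) {
          return Promise.resolve({ value: this.items.shift() as T, done: false });
        }
        if (this.closed) {
          return Promise.resolve({ value: undefined, done: true });
        }
        // Queue is empty but still open: park until the next push or close.
        return new Promise((resolve) => this.waiters.push(resolve));
      },
    };
  }
}
```

Since the bulk helper's `datasource` option is documented as accepting an array, an async generator, or a readable stream, passing such a queue as `datasource` to `client.helpers.bulk` might already work. That is an untested assumption here, and the type definitions may need a cast.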

As mentioned in the thread, there may be corner cases that need to be covered, like the queue being empty for longer than the flush interval.

Also in order to align with the other clients, adding another document count threshold to the bulk helper could make sense.
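A rough sketch of how a document count threshold and a flush interval could interact, including the empty-queue corner case mentioned above. This is not how the elasticsearch-js helper is implemented; `batchByCountOrTime`, `maxDocs`, and `flushIntervalMs` are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch, not the elasticsearch-js implementation: group items
// from an endless source into batches, flushing when either `maxDocs` items
// are collected or `flushIntervalMs` has passed since the batch was started.
async function* batchByCountOrTime<T>(
  source: AsyncIterable<T>,
  maxDocs: number,
  flushIntervalMs: number
): AsyncGenerator<T[]> {
  const it = source[Symbol.asyncIterator]();
  let batch: T[] = [];
  let deadline = 0;
  let pending = it.next();

  while (true) {
    let result: IteratorResult<T> | "timeout";
    if (batch.length === 0) {
      // Nothing buffered: an idle source never triggers an empty flush.
      result = await pending;
    } else {
      // Race the next item against the time left for the current batch.
      const remaining = Math.max(0, deadline - Date.now());
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<"timeout">((resolve) => {
        timer = setTimeout(() => resolve("timeout"), remaining);
      });
      result = await Promise.race([pending, timeout]);
      clearTimeout(timer);
    }

    if (result === "timeout") {
      // Interval elapsed: flush what we have and keep waiting on `pending`.
      yield batch;
      batch = [];
      continue;
    }
    if (result.done) {
      // Source ended: flush the remainder, if any, and stop.
      if (batch.length > 0) yield batch;
      return;
    }

    if (batch.length === 0) deadline = Date.now() + flushIntervalMs;
    batch.push(result.value);
    if (batch.length >= maxDocs) {
      yield batch;
      batch = [];
    }
    pending = it.next();
  }
}
```

In this sketch the interval starts when the first document enters an empty batch, so a queue that stays empty for longer than the interval simply waits and never produces an empty bulk request.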

Motivation

This will make it easier to implement any kind of continuously polling/streaming service that needs to bulk index data into Elasticsearch.

Example

I'd assume there is no actual change in the bulk helper API (except maybe adding a number-of-documents option), but it would allow passing a generator that is infinite.

P.S. If this already works as expected, please close - there is still the possibility I missed this in the docs and just asked around for nothing cause everything works as expected 😀

spinscale, May 31 '24 07:05