
[Core] Bulk Action on Connection and Engine

Open alexander-schranz opened this issue 3 years ago • 2 comments

It should be possible to provide Bulk Action on Connection and Engine.

$engine->bulk(string $index, \Generator $saveDocuments, \Generator $deleteDocuments, bulkSize: 100);

// and

$indexer->bulk(Index $index, \Generator $saveDocuments, \Generator $deleteDocuments, bulkSize: 100);

Engines that do not support bulk actions can fall back to the basic save and delete methods.

It should be possible to pass the documents as an array or a \Generator for performance.

A BulkableIndexerInterface should be added so that the fallback to save/delete can be part of the Engine, and not every adapter needs to implement it itself.
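A minimal sketch of how that fallback could live in the Engine. Everything here is an assumption for illustration, not SEAL's actual API: the `IndexerInterface` shape, the `bulk()` signature on `BulkableIndexerInterface`, the `InMemoryIndexer` class, the `id` field, and the use of a plain string for the index name. It accepts `iterable`, so both arrays and `\Generator` instances work.

```php
<?php

declare(strict_types=1);

// Hypothetical minimal adapter contract (illustrative names only).
interface IndexerInterface
{
    /** @param array<string, mixed> $document */
    public function save(string $index, array $document): void;

    public function delete(string $index, string $identifier): void;
}

// Hypothetical opt-in contract for adapters with a native bulk API.
interface BulkableIndexerInterface
{
    /**
     * @param iterable<array<string, mixed>> $saveDocuments
     * @param iterable<string> $deleteDocumentIdentifiers
     */
    public function bulk(string $index, iterable $saveDocuments, iterable $deleteDocumentIdentifiers, int $bulkSize = 100): void;
}

final class Engine
{
    public function __construct(private IndexerInterface $indexer)
    {
    }

    /**
     * @param iterable<array<string, mixed>> $saveDocuments
     * @param iterable<string> $deleteDocumentIdentifiers
     */
    public function bulk(string $index, iterable $saveDocuments, iterable $deleteDocumentIdentifiers, int $bulkSize = 100): void
    {
        // Adapters that opted in handle the whole batch themselves.
        if ($this->indexer instanceof BulkableIndexerInterface) {
            $this->indexer->bulk($index, $saveDocuments, $deleteDocumentIdentifiers, $bulkSize);

            return;
        }

        // Fallback: one call per document; $bulkSize is ignored here.
        foreach ($saveDocuments as $document) {
            $this->indexer->save($index, $document);
        }

        foreach ($deleteDocumentIdentifiers as $identifier) {
            $this->indexer->delete($index, $identifier);
        }
    }
}

// Toy adapter without bulk support, used only to exercise the fallback.
final class InMemoryIndexer implements IndexerInterface
{
    /** @var array<string, array<string, array<string, mixed>>> */
    public array $documents = [];

    public function save(string $index, array $document): void
    {
        $this->documents[$index][(string) $document['id']] = $document;
    }

    public function delete(string $index, string $identifier): void
    {
        unset($this->documents[$index][$identifier]);
    }
}

$indexer = new InMemoryIndexer();
$indexer->save('blog', ['id' => 'old', 'title' => 'Outdated']);

$saveDocuments = (static function (): \Generator {
    yield ['id' => 'a', 'title' => 'First'];
    yield ['id' => 'b', 'title' => 'Second'];
})();

$engine = new Engine($indexer);
$engine->bulk('blog', $saveDocuments, ['old'], bulkSize: 100);
```

With this shape, calling code stays the same whether or not the adapter has a native bulk endpoint; only the `instanceof` check decides which path runs.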

Bulk actions may make sense in the case of:

Implementing of Reindex Providers #16

alexander-schranz avatar Jan 03 '23 21:01 alexander-schranz

I was just playing with the Loupe adapter, and without batch support indexing ~5k records took about 35 minutes. So I implemented my own reindexing, and with batches of 100 it took around 30 s. I guess performance will suffer on every adapter when importing more documents, so a batch API is a must. I like the example API you have provided, btw. I'm not sure how many backends support delete and update/create in the same batch, but I guess it makes sense that you want both operations on a dataset.
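For illustration, batching of this kind can be sketched as a small helper that groups any iterable into chunks without loading everything into memory. The `batch()` and `loadDocuments()` functions are hypothetical names, not part of SEAL or Loupe, and the per-batch adapter call is left out because it depends on the backend.

```php
<?php

declare(strict_types=1);

/**
 * Groups an iterable into arrays of at most $bulkSize items.
 * Hypothetical helper; the size check runs on first iteration
 * because generator bodies execute lazily.
 *
 * @return \Generator<int, array<int, mixed>>
 */
function batch(iterable $documents, int $bulkSize = 100): \Generator
{
    if ($bulkSize < 1) {
        throw new \InvalidArgumentException('Bulk size must be at least 1.');
    }

    $batch = [];
    foreach ($documents as $document) {
        $batch[] = $document;

        if (\count($batch) === $bulkSize) {
            yield $batch;
            $batch = [];
        }
    }

    // Flush the last, possibly smaller, batch.
    if ([] !== $batch) {
        yield $batch;
    }
}

// Stand-in document source; a real one would read from a database or API.
function loadDocuments(int $total): \Generator
{
    for ($i = 1; $i <= $total; ++$i) {
        yield ['id' => (string) $i];
    }
}

$sizes = [];
foreach (batch(loadDocuments(250), 100) as $documents) {
    // A real reindexer would send $documents to the backend in one request here.
    $sizes[] = \count($documents);
}
// $sizes is [100, 100, 50]
```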

zajca avatar Mar 25 '24 07:03 zajca

@zajca thx for giving SEAL a try and for giving feedback here. For adapters that do not support batching, or that have separate APIs for update/create and batch, I'm planning to fall back to the normal APIs. They may not see a performance improvement, but the outside API will be the same for all adapters, so people using SEAL don't need to write different code depending on the adapter.

alexander-schranz avatar Mar 25 '24 08:03 alexander-schranz

this is in progress: https://github.com/schranz-search/schranz-search/pull/430

alexander-schranz avatar Oct 06 '24 12:10 alexander-schranz