Elasticsearch/multi channel performance issue
Describe the bug The multi-channel indexing with the Elasticsearch plugin seems to be very inefficient, and it also causes a buffer overflow and an exception.
- For every product update event, the plugin sends a delete operation for each document whose ID is composed of the channelID and the productID. That means that if I have 1600 channels, a single update results in 1600 delete operations being sent in the bulk message.
- The Elasticsearch plugin generates a separate record for each channel a product (variant) belongs to. However, the product variant document also has a channelIds field that I can use to filter by channel, so I think one record per variant with this field should be enough (and it can be filtered too).
- Because of the default bulkSize (2000) and the 1600 events (that is, 2000*1600 records in my case) for one reindexing batch, ES throws an exception saying that the bulk request limit (512mb!) was reached.
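To make the numbers above concrete, here is a minimal sketch of the fan-out being described. It is not the plugin's actual code: the `compositeId` format, the helper names, and the product ID `42` are assumptions for illustration only. Only the counts (1600 channels, bulkSize 2000) come from this report.

```typescript
// Hypothetical sketch of the fan-out described above, not the plugin's real code.
// A delete action in the shape used by the Elasticsearch bulk API.
type BulkDeleteOp = { delete: { _id: string } };

// ASSUMPTION: the real ID format may differ; the report only says the ID is
// composed of the channel ID and the product ID.
function compositeId(channelId: number, productId: number): string {
  return `${channelId}_${productId}`;
}

// One product update produces one delete action per channel.
function deleteOpsForUpdate(productId: number, channelIds: number[]): BulkDeleteOp[] {
  return channelIds.map(channelId => ({
    delete: { _id: compositeId(channelId, productId) },
  }));
}

// Number of records in one reindexing batch when every item fans out per channel.
function recordsPerBatch(bulkSize: number, channelCount: number): number {
  return bulkSize * channelCount;
}

const channelIds = Array.from({ length: 1600 }, (_, i) => i + 1);
const ops = deleteOpsForUpdate(42, channelIds);

console.log(ops.length);                  // 1600
console.log(recordsPerBatch(2000, 1600)); // 3200000
```

With these figures a single batch carries 3.2 million records, which would be consistent with the bulk request limit being hit.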
To Reproduce Set up a deployment with many channels and many products. Load the products and observe the load and the exception this causes.
Expected behavior Channels should only be a property of the product variant document, and the channel ID should not be part of the document ID. This approach would automatically resolve the issue with deletes too.
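A rough sketch of what the expected behavior could look like, assuming one document per variant whose ID does not include the channel. The `VariantDocument` shape, the `productVariantId` field, and all helper names are hypothetical. Only the `channelIds` field is taken from the report. The filter uses a standard Elasticsearch `term` query, which matches an array field when any element equals the given value.

```typescript
// Hypothetical sketch of the proposed layout, not an actual Vendure implementation.
interface VariantDocument {
  productVariantId: number; // ASSUMPTION: illustrative field name
  channelIds: number[];     // field mentioned in the report
}

// The document ID depends only on the variant, not on the channel.
function variantDocumentId(doc: VariantDocument): string {
  return String(doc.productVariantId);
}

// Restrict a search to one channel by filtering on the channelIds field.
function channelFilter(channelId: number) {
  return { bool: { filter: [{ term: { channelIds: channelId } }] } };
}

// An update now needs a single delete action, however many channels exist.
function deleteOpsForVariant(doc: VariantDocument): Array<{ delete: { _id: string } }> {
  return [{ delete: { _id: variantDocumentId(doc) } }];
}

const doc: VariantDocument = {
  productVariantId: 42,
  channelIds: Array.from({ length: 1600 }, (_, i) => i + 1),
};

console.log(deleteOpsForVariant(doc).length);    // 1
console.log(JSON.stringify(channelFilter(7)));
```

The trade-off in this sketch is that any channel-specific data (for example per-channel prices) would need its own representation inside the single document, which the report does not address.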
Environment (please complete the following information):
- @vendure/core version: 2.0.1
- Nodejs version 18.16
- Database (mysql/postgres etc): postgres
Hi,
Thanks for the report. You are really pushing this plugin to the limit with 1600 channels 😄, but of course we should be able to handle this.
Are you interested in attempting to fix this yourself and making a pull request? Let me know if so, and I can provide guidance as needed.
Closed automatically because there has been no activity for {{days-before-close}} days after a reminder. Please comment with new information and remove the label to re-open.