This code is not performant; locally it is being fed 6113 values of 2 items each to delete.
https://github.com/moqui/moqui-framework/blob/005f2acdc1bb412930827f00bbbd3cb35cbfafd2/framework/src/main/groovy/org/moqui/impl/entity/EntityDataFeed.groovy#L549
This was originally filed in moqui/mantle-usl#184
The affected user has 134 orders, if that helps with understanding the scope.
Got it. The billing loop code calls "update#OrderItem" thousands of times, in a loop. Each call fires a trigger, which queues a check to delete the DataDocument. Since it is the very same OrderItem each time, the same delete happily ends up being added to EntityDataFeed via DataFeedSynchronization over and over, and nothing dedupes the calls.
I have re-written the billing logic, so it no longer calls update#OrderItem, but this still seems like it is also a bug in moqui itself, for not deduping the delete calls at a higher level.
So here is the full flow.
- Call update#OrderItem (with an adjusted quantity).
- This calls update#mantle.order.OrderItem, which just does a direct database update.
- The order has changed, which triggers handle#OrderMajorChange.
- That deletes every OrderItem where isPromo="Y".
- Each delete calls EntityValueBase.delete().
- Next is EntityDataFeed.dataFeedCheckDelete().
- DataFeedSynchronization.addDeleteToFeed(*) then adds the value to the EntityList deleteValues.
Repeat the above in a loop (thousands of times), and the same deleted OrderItem will be recorded multiple times, because orderItemSeqId is re-used.
The bug in moqui is that the entities added to deleteValues need to be de-duplicated by primary key before being sent on (eventually) to elasticsearch. I would probably also dedupe new values when they are updated repeatedly; only the most recent value should be indexed.
I looked at this a bit in the EntityDataFeed.groovy file because I thought it did some form of de-duping, and it does. See the primaryPkFieldValues variable and how it is used in the EntityDataFeed.FeedRunnable.feedDataDocument() method.
There are limitations to this. One is that it only works within the scope of a single transaction: if the same DataDocument, such as an OrderItem based one (ie OrderItem is the primary entity), is triggered in different transactions, it will be fed multiple times. Within a single transaction, though, even if a large number of OrderItem records are updated, and even if the same OrderItem record is updated multiple times, only one DataDocument per OrderItem will be generated, and that happens when the transaction commits.
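The transaction-scoped behavior described above can be sketched roughly as follows. This is an illustrative model, not moqui's actual code: the class name, methods, and the use of a plain Set stand in for the real primaryPkFieldValues handling.

```java
import java.util.*;

public class TxScopedFeed {
    // Sketch: PK values are accumulated per transaction and flushed once
    // at commit, so duplicates within a transaction collapse to one
    // DataDocument, but duplicates across transactions do not.
    private final Set<List<Object>> pkValuesThisTx = new LinkedHashSet<>();
    private final List<List<Object>> documentsFed = new ArrayList<>();

    public void recordChange(List<Object> pkValues) {
        // Set membership dedupes repeated changes within the transaction
        pkValuesThisTx.add(pkValues);
    }

    public void commit() {
        // one DataDocument per distinct PK, generated at commit time
        documentsFed.addAll(pkValuesThisTx);
        // the next transaction starts fresh, so the same PK touched in a
        // later transaction is fed again (the cross-transaction limitation)
        pkValuesThisTx.clear();
    }

    public List<List<Object>> getDocumentsFed() { return documentsFed; }
}
```

Touching the same OrderItem PK twice in one transaction yields one document; touching it again in a second transaction yields a second document.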
The delete side of things is indeed missing any sort of de-duplication. See the EntityDataFeed.FeedRunnable.run() method, currently on line 548. Perhaps the best approach would be to de-dupe the deleteValues list by PK values right there in the run() method.
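The PK-based de-dupe could look something like this. It is only a minimal sketch: entity values are modeled as field-name-to-value maps, and the explicit pkFields list stands in for whatever the real EntityValue API provides for primary key access.

```java
import java.util.*;

public class DeleteDedupe {
    // Keep only the first occurrence of each primary key combination in
    // deleteValues; later duplicates (the same OrderItem deleted again in
    // a later loop iteration) are dropped.
    public static List<Map<String, Object>> dedupeByPk(
            List<Map<String, Object>> deleteValues, List<String> pkFields) {
        Set<List<Object>> seenPks = new HashSet<>();
        List<Map<String, Object>> deduped = new ArrayList<>();
        for (Map<String, Object> value : deleteValues) {
            List<Object> pk = new ArrayList<>();
            for (String field : pkFields) pk.add(value.get(field));
            // Set.add() returns false for a PK we have already seen
            if (seenPks.add(pk)) deduped.add(value);
        }
        return deduped;
    }

    public static void main(String[] args) {
        List<String> pkFields = List.of("orderId", "orderItemSeqId");
        List<Map<String, Object>> deletes = List.of(
            Map.of("orderId", "100", "orderItemSeqId", "01"),
            Map.of("orderId", "100", "orderItemSeqId", "01"), // duplicate
            Map.of("orderId", "100", "orderItemSeqId", "02"));
        System.out.println(dedupeByPk(deletes, pkFields).size()); // prints 2
    }
}
```

With the 6113-values case from the top of this issue, that single pass would collapse the repeated OrderItem deletes down to the distinct records actually removed.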
On other considerations: if a data feed has an update (or create) AND a delete in the same transaction, what should be done with it? To answer that we need to know what happened first, or what the final state in the database is at the end of the transaction. My thought on the solution is that the create/update data feed queries already know which records are still in the database, and that gives us two options: do the deletes first, or do the create/update feed first, keep the Set of all DataDocument PK values (for all primary entities found that need updates), and skip those in the delete values.
One of those is needed for the code to be correct for scenarios where a record for an entity that is the primary entity in a DataDocument is deleted and then re-created with the same primary key.
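The second option above (feed creates/updates first, then skip matching deletes) could be sketched like this. Again the map-based entity values, the class, and method names are illustrative assumptions, not moqui's actual EntityDataFeed code.

```java
import java.util.*;

public class FeedConflictResolver {
    // Feed creates/updates first, remember their PK combinations, then
    // skip any delete whose PK was also created/updated in the same
    // transaction, since that record still exists at commit time.
    public static List<Map<String, Object>> deletesToFeed(
            List<Map<String, Object>> updatedValues,
            List<Map<String, Object>> deleteValues,
            List<String> pkFields) {
        Set<List<Object>> updatedPks = new HashSet<>();
        for (Map<String, Object> value : updatedValues)
            updatedPks.add(pkOf(value, pkFields));
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> value : deleteValues) {
            // a record deleted and then re-created with the same PK shows
            // up in updatedValues, so its delete must be skipped here
            if (!updatedPks.contains(pkOf(value, pkFields))) result.add(value);
        }
        return result;
    }

    private static List<Object> pkOf(Map<String, Object> value, List<String> pkFields) {
        List<Object> pk = new ArrayList<>();
        for (String field : pkFields) pk.add(value.get(field));
        return pk;
    }
}
```

This handles the delete-then-re-create scenario: the re-created record's PK is in the updated set, so the stale delete never reaches the index.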
I'll work on these soon if no one beats me to it; these comments are based on my initial research on the topic and are here for future reference.