magento2-gdpr
[BUG] entity_id + entity_type aren't unique in erase_entity table
Description
Currently, the combination of entity_id and entity_type isn't unique in the opengento_gdpr_erase_entity table.
Over time this leads to slow processing of entities by the cron task since it tries to process the same entity multiple times.
I'm not sure how the same entity ends up in this table that many times to begin with, but regardless, a unique constraint on the two columns should prevent it in the future.
That said, the patch is not obvious: existing installations would be affected, because a unique constraint can't be added while the columns still contain duplicate values.
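The ordering problem described above (remove duplicates first, add the constraint second) can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module, not the module's actual migration. The erase_id primary key column, the index name, and the choice to keep the row with the lowest erase_id per pair are all assumptions.

```python
import sqlite3

TABLE = "opengento_gdpr_erase_entity"


def dedupe_and_constrain(conn: sqlite3.Connection) -> int:
    """Delete duplicate (entity_id, entity_type) rows, then add a unique index.

    Returns the number of rows deleted.
    """
    # Keep one row per pair (assumption: the lowest erase_id wins).
    cur = conn.execute(
        f"""
        DELETE FROM {TABLE}
        WHERE erase_id NOT IN (
            SELECT MIN(erase_id) FROM {TABLE}
            GROUP BY entity_id, entity_type
        )
        """
    )
    deleted = cur.rowcount
    # The index can only be created once no duplicates remain.
    conn.execute(
        f"CREATE UNIQUE INDEX erase_entity_unique "
        f"ON {TABLE} (entity_id, entity_type)"
    )
    conn.commit()
    return deleted


# Small in-memory demo with a simplified, hypothetical table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    f"CREATE TABLE {TABLE} ("
    "erase_id INTEGER PRIMARY KEY, "
    "entity_id INTEGER NOT NULL, "
    "entity_type TEXT NOT NULL)"
)
conn.executemany(
    f"INSERT INTO {TABLE} (entity_id, entity_type) VALUES (?, ?)",
    [(17, "customer"), (17, "customer"), (17, "customer"),
     (17, "order"), (42, "customer")],
)
deleted = dedupe_and_constrain(conn)
remaining = conn.execute(f"SELECT COUNT(*) FROM {TABLE}").fetchone()[0]
print(deleted, remaining)
```

On a real installation the same two steps would have to run in this order inside the upgrade process, which is exactly what makes the patch awkward.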
Prerequisites
PHP Version:
- 8.1
Magento Version:
- 2.4.4
Module Version:
- 4.2.3
Issue Details
Steps to reproduce the behavior
- Unsure how to reproduce the behavior that puts the same entity in the table multiple times (e.g. the customer with ID 17 ends up in there multiple times)
- This causes the erase_entity cron task to get progressively slower over time as more and more duplicate rows are added.
Expected behavior
Each combination of entity_id and entity_type appears at most once in the opengento_gdpr_erase_entity table, so the cron task processes each entity only once.
Additional context
For reference, I have over 180k rows in this table, despite only 211 unique entities.
Hello @Quazz
Thank you for reporting this issue. I do know why it occurs. It is indeed related to the cron job that schedules the erasure of entities older than a configured value. This cron does not check whether the entity is already erased, so the same record ends up in the database multiple times, which gets really bad over time... I'm currently checking how to fix this mess. Thank you for bringing this to my attention!
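To illustrate the kind of check that is missing, here is a minimal sketch of a scheduling step that skips entities already present in the table. It is not the module's code or its actual fix. The schedule_erase helper, the simplified table layout, and the use of Python's sqlite3 module are assumptions made for the example.

```python
import sqlite3

TABLE = "opengento_gdpr_erase_entity"


def schedule_erase(conn: sqlite3.Connection, entity_id: int, entity_type: str) -> bool:
    """Insert a row only if the (entity_id, entity_type) pair is not there yet.

    Returns True when a new row was inserted, False when it already existed.
    """
    exists = conn.execute(
        f"SELECT 1 FROM {TABLE} WHERE entity_id = ? AND entity_type = ? LIMIT 1",
        (entity_id, entity_type),
    ).fetchone()
    if exists is not None:
        return False
    conn.execute(
        f"INSERT INTO {TABLE} (entity_id, entity_type) VALUES (?, ?)",
        (entity_id, entity_type),
    )
    return True


# Simulate three cron runs that each find the same two old entities.
conn = sqlite3.connect(":memory:")
conn.execute(
    f"CREATE TABLE {TABLE} ("
    "erase_id INTEGER PRIMARY KEY, "
    "entity_id INTEGER NOT NULL, "
    "entity_type TEXT NOT NULL)"
)
inserted = 0
for _ in range(3):
    for entity in [(17, "customer"), (42, "customer")]:
        inserted += schedule_erase(conn, *entity)
rows = conn.execute(f"SELECT COUNT(*) FROM {TABLE}").fetchone()[0]
print(inserted, rows)
```

Without the existence check, the same three runs would leave six rows instead of two, which matches the growth pattern reported above. A check like this does not replace a unique constraint, since two concurrent runs could still both pass the check.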
Hello @Quazz
The fix delivered in c8ae9342dc0347984f5e94dda9516edd9d84301a should prevent any more duplicates in the future. I agree that a constraint should also be added in the declarative schema. However, it does not seem possible to run a patch that removes the existing duplicates before the schema change is applied.