magento2
magento2 copied to clipboard
Price index performance on very large stores
Preconditions and environment
- Magento version 2.4.5+ (or any release after 263c17f)
- Have a lot of products (the more, with more prices, the easier this issue is to reproduce for demo)
Steps to reproduce
- Reindex catalog_product_index_price
- Observe that since 263c17f the performance of deleteOutdatedData is much worse (around 800-1200x slower than it should be in our very large instances with ~30 million prices)
Expected result
Price reindex is workable, and uses indexes for its join in deleteOutdatedData, because they exist
Actual result
It's 800-1200x slower than it should be, making a reindex take multiple days (based on the runtime of the individual queries, but can't tell for sure, as i'm not waiting that long), when on previous versions it took 4 hours
Additional information
When catalog_product_index_price_temp is created, it's based on catalog_product_index_price_tmp , which since 263c17f has had different primary key and indexes to the main table, meaning this JOIN has no indexes to use for the select, and is forced to do a full tablescan
We reproduced it as follows:
-
All indexes have been switched to
realtimemode -bin/magento indexer:set-mode realtime -
Then we started indexing only for MSI -
bin/magento indexer:reindex inventory
As @vladimirspucko described above, the inventory indexing itself was performed quickly, about a minute for 850k products in our case. But then a plugin came into operation, which starts price indexing in realtime mode:
\Magento\InventoryCatalog\Plugin\InventoryIndexer\Indexer\Stock\Strategy\Sync\PriceIndexUpdatePlugin::afterExecuteList()
which calls \Magento\Catalog\Model\Indexer\Product\Price\Processor::reindexList()
And if the mode is not schedule this indexing has never been performed on our installation with 850k products.
Note: if we use schedule mode, then full inventory indexing takes about 50 seconds for all the same 850k products
Release note
No response
Triage and priority
- [X] Severity: S0 - Affects critical data or functionality and leaves users without workaround.
- [ ] Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
- [ ] Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
- [ ] Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
- [ ] Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.
Hi @nicka101. Thank you for your report. To speed up processing of this issue, make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:
@magento give me 2.4-develop instance- upcoming 2.4.x release- For more details, review the Magento Contributor Assistant documentation.
- Add a comment to assign the issue:
@magento I am working on this - To learn more about issue processing workflow, refer to the Code Contributions.
Join Magento Community Engineering Slack and ask your questions in #github channel. :warning: According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting. :clock10: You can find the schedule on the Magento Community Calendar page. :telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.
Hi @engcom-Dash. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:
-
- Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
-
- Verify that issue has a meaningful description and provides enough information to reproduce the issue.
-
- Add
Area: XXXXXlabel to the ticket, indicating the functional areas it may be related to.
- Add
-
- Verify that the issue is reproducible on
2.4-developbranchDetails
- Add the comment@magento give me 2.4-develop instanceto deploy test instance on Magento infrastructure.
- If the issue is reproducible on2.4-developbranch, please, add the labelReproduced on 2.4.x.
- If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
- Verify that the issue is reproducible on
- Join Magento Community Engineering Slack and ask your questions in #github channel.
Hi @engcom-Hotel. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:
- [ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
- [ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue.
- [ ] 3. Add
Area: XXXXXlabel to the ticket, indicating the functional areas it may be related to. - [ ] 4. Verify that the issue is reproducible on
2.4-developbranchDetails
- Add the comment@magento give me 2.4-develop instanceto deploy test instance on Magento infrastructure.
- If the issue is reproducible on2.4-developbranch, please, add the labelReproduced on 2.4.x.
- If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here! - [ ] 5. Add label
Issue: Confirmedonce verification is complete. - [ ] 6. Make sure that automatic system confirms that report has been added to the backlog.
Hello @nicka101,
Thanks for the report and contribution!
We have tried to reproduce the issue in 2.4-develop as well as 2.4.4 just prior to 2.4.5 (Before that commit), but it seems the issue is not reproducible for us.
We have tried to reproduce the issue with almost 39360 products, by changing the price of the products but for us the reindexing took a similar time.
Have you checked the same with some profiling tools?
Thanks
Faced with the same issue on a catalog of 850k products, the proposed solution really speeds up the indexing process.
Hello @DmitryFurs,
Thanks for the response!
Can you please let us know the steps to reproduce the issue? Actually, in the first attempt, we were unable to reproduce the issue. You can got through our comment here https://github.com/magento/magento2/issues/37700#issuecomment-1680321607.
Thanks
We faced the same issue using Magento 2.4.6 + MSI. We have ~2600 products (simple, configurable, bundle) and 9 sources/stocks. In our particular case, the price index itself takes less than 1 minute but the inventory index takes more than 1 hour. With the proposed solution, the inventory index takes less than 2 minutes now.
Hi @engcom-Hotel,
We reproduced it as follows:
-
All indexes have been switched to
realtimemode -bin/magento indexer:set-mode realtime -
Then we started indexing only for MSI -
bin/magento indexer:reindex inventory
As @vladimirspucko described above, the inventory indexing itself was performed quickly, about a minute for 850k products in our case. But then a plugin came into operation, which starts price indexing in realtime mode:
\Magento\InventoryCatalog\Plugin\InventoryIndexer\Indexer\Stock\Strategy\Sync\PriceIndexUpdatePlugin::afterExecuteList()
which calls \Magento\Catalog\Model\Indexer\Product\Price\Processor::reindexList()
And if the mode is not schedule this indexing has never been performed on our installation with 850k products.
Note: if we use schedule mode, then full inventory indexing takes about 50 seconds for all the same 850k products
Hello @DmitryFurs,
Thanks for the quick response!
The issue is reproducible for us in the 2.4-develop branch. We have tried to reproduce the issue with almost 40K products.
Hence confirming the issue.
Thanks
:white_check_mark: Jira issue https://jira.corp.adobe.com/browse/AC-9470 is successfully created for this GitHub issue.
:white_check_mark: Confirmed by @engcom-Hotel. Thank you for verifying the issue.
Issue Available: @engcom-Hotel, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.
This is really a big issue. My partial price index takes several hours. I solved it by just adding an index to the "catalog_product_index_price_tmp" for "entity_id", "customer_group_id", "website_id"
Any updates on this issue? In our case, reindexing of 100k products takes hours...
On custom product bulk import, we try to reindex only changed products using the reindexList method. It takes hours. If we do the full reindex with the reindexAll method, it takes a couple of minutes.
In Magento 2.4.7, the issue is still there.
Interestingly, there's a quality patch ACSD-56415 Fixes the issue where the performance of the partial price indexing is slowed down due to a DELETE query when the database has a lot of partial price data to index for 2.4.6-p3 which fixes the issue but this patch is not available for 2.4.7. The ACSD-56415 patch does exactly the same as what @mvenghaus mentioned.
I'm surprised that maintainers didn't mention this patch here as a possible solution.
I've adopted the ACSD-56415 patch to 2.4.7 by converting it to a composer patch and can confirm that it fixes the issue on my side. For example, without the patch, on my dev machine, 250 products were reindexed in 10 mins so I didn't even wait for the full reindex. It's clear that it'll take hours for 50K products. The full reindex of the 2M products database took 80 mins. With the patch applied, 50K products were reindexed in 1 minute.
Hello @speller,
The patch will be incorporated into the forthcoming release, namely 2.4.8-beta1. But we will update you if this can be incorporated sooner.
Thanks
@engcom-Hotel could you also make the patch available for 2.4.7?
@speller I will update you on this soon.
This was actually introduced in 2.4.6: https://github.com/magento/magento2/blob/2.4.6/app/code/Magento/Catalog/Model/Indexer/Product/Price/AbstractAction.php
We've experienced the indexing to not complete even though it was left running overnight from the end of the working day to the start of the new one due to a very large amount of prices; there's multiple stores, multiple customer groups and a lot of products.
I can see that the solution proposed at https://github.com/magento/magento2/pull/37701 has been rejected because indexes slow down table writes. However, the table writes are for a temporary table that is only involved in the indexing process and for the most part the tradeoff is better overall.
I also tried changing the sql query itself to improve performance. This worked well, but I've found the best results, for our case anyway, was to combine both the query change and add indexes to the temporary table catalog_product_index_price_tmp as in the pull request https://github.com/magento/magento2/pull/37701 (also linked above).
magento/vendor/magento/module-catalog/etc/db_schema.xml
<constraint xsi:type="unique" referenceId="CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID">
<column name="entity_id"/>
<column name="customer_group_id"/>
<column name="website_id"/>
</constraint>
Note I've not fully tested data integrity yet with this. Probably be on Monday 17th I get back to look at that. I'll post a patch if all looks good then.
EDIT: See below
This was actually introduced in 2.4.6: https://github.com/magento/magento2/blob/2.4.6/app/code/Magento/Catalog/Model/Indexer/Product/Price/AbstractAction.php
We've experienced the indexing to not complete even though it was left running overnight from the end of the working day to the start of the new one due to a very large amount of prices; there's multiple stores, multiple customer ....
On further testing we've found that the change on the sql query only makes a slight difference so in terms in risk and reward, it's probably better just leaving the query alone.
If anyone needs the patch (based on https://github.com/magento/magento2/pull/37701) to change the table indexes then here's a patch for 2.4.7:
37700_inventoryIndexer.patch
diff --git a/vendor/magento/module-catalog/etc/db_schema_whitelist.json b/vendor/magento/module-catalog/etc/db_schema_whitelist.json
--- a/vendor/magento/module-catalog/etc/db_schema_whitelist.json 2024-06-17 11:12:06.595496943 +0100
+++ b/vendor/magento/module-catalog/etc/db_schema_whitelist.json 2024-06-14 08:27:31.648158048 +0100
@@ -987,7 +987,8 @@
"final_price": true,
"min_price": true,
"max_price": true,
- "tier_price": true
+ "tier_price": true,
+ "id": true
},
"index": {
"CATALOG_PRODUCT_INDEX_PRICE_TMP_CUSTOMER_GROUP_ID": true,
@@ -995,7 +996,8 @@
"CATALOG_PRODUCT_INDEX_PRICE_TMP_MIN_PRICE": true
},
"constraint": {
- "PRIMARY": true
+ "PRIMARY": true,
+ "CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID": true
}
},
"catalog_category_product_index_tmp": {
diff --git a/vendor/magento/module-catalog/etc/db_schema.xml b/vendor/magento/module-catalog/etc/db_schema.xml
--- a/vendor/magento/module-catalog/etc/db_schema.xml 2024-06-17 11:11:56.495474101 +0100
+++ b/vendor/magento/module-catalog/etc/db_schema.xml 2024-06-14 08:28:02.284226062 +0100
@@ -1645,6 +1645,11 @@
<constraint xsi:type="primary" referenceId="PRIMARY">
<column name="id"/>
</constraint>
+ <constraint xsi:type="unique" referenceId="CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID">
+ <column name="entity_id"/>
+ <column name="customer_group_id"/>
+ <column name="website_id"/>
+ </constraint>
</table>
<table name="catalog_category_product_index_tmp" resource="default" engine="innodb"
comment="Catalog Category Product Indexer temporary table">
Apply patch with patch -p1 --forward < 37700_inventoryIndexer.patch
::EDIT::
⚠️ WARNING: Do not use on Magento 2.4.8, it will break your schema declarations.
would it be possible to apply the patch to magento 2.3.7 ? https://github.com/nicka101/magento2/commit/90e4ecc06fa1be08f736c8a5fbe699430e9bc3c7
@engcom-Hotel
The patch in the Quality Patch Tool is not available yet for 2.4.7, and the change is not implemented in 2.4.7-p4 (only in 2.4.-develop with 2.4.8-beta1 as the first tag containing the commit, see details).
Please can you check if the patch will be released officially for 2.4.7?
Hello @mzeis,
Thanks for raising the concern!
We are in discussion for this concern with the internal team. I will update you soon on this. Meanwhile moving this issue on hold.
Thanks
Hello @mzeis,
We have an update from the internal team, they have created this JIRA for the patch release for 2.4.7 release line.
I will update you once same has been done. Till the time keeping this issue On Hold status.
Thanks
Hello @mzeis @nicka101,
We have received a reply from the internal team, the patch for this issue has been available now. Now the patch should be available for download.
Thanks