
Missing indexing of required columns in catalog_product_entity_int table

Open rostilos opened this issue 1 year ago • 3 comments

Preconditions and environment

  • Magento version: CE 2.4.6
  • MariaDB version: 10.6.16
  • RAM: 16 GB
  • CPU cores: 8
  • Large catalog (500k+ products)

The problem was critical: the storefront was effectively inaccessible.

The reindex never finished, and every query that involved the catalog_product_entity_int table never completed, which broke the functionality of the entire storefront.

Steps to reproduce

  1. A relatively large catalog is needed. In a test environment, the issue reproduces on a test catalog generated with:
       bin/magento setup:perf:generate-fixtures setup/performance-toolkit/profiles/ce/extra_large.xml
  2. Start the reindex process:
       bin/magento indexer:reset
       bin/magento indexer:reindex catalogsearch_fulltext

Expected result

Reindexing succeeds, and documents appear in the Elasticsearch/OpenSearch indexes.

Actual result

Reindexing does not complete. The indexes are created in Elasticsearch, but no data reaches them.

Additional information

While reindexing the catalogsearch_fulltext index, Magento queries the database to retrieve each "batch" of data. For example:

SELECT `e`.`entity_id`, `e`.`type_id`, `e`.`sku`
FROM `catalog_product_entity` AS `e`
INNER JOIN `catalog_product_website` AS `website`
    ON website.product_id = e.entity_id AND website.website_id = '1'
INNER JOIN `catalog_product_entity_int` AS `visibility_default`
    ON visibility_default.entity_id = e.entity_id AND visibility_default.attribute_id = '99' AND visibility_default.store_id = 0
LEFT JOIN `catalog_product_entity_int` AS `visibility_store`
    ON visibility_store.entity_id = e.entity_id AND visibility_store.attribute_id = '99' AND visibility_store.store_id = 2
INNER JOIN `catalog_product_entity_int` AS `status_default`
    ON status_default.entity_id = e.entity_id AND status_default.attribute_id = '97' AND status_default.store_id = 0
LEFT JOIN `catalog_product_entity_int` AS `status_store`
    ON status_store.entity_id = e.entity_id AND status_store.attribute_id = '97' AND status_store.store_id = 2
WHERE (IF(visibility_store.value_id > 0, visibility_store.value, visibility_default.value) IN (3, 2, 4))
    AND (IF(status_store.value_id > 0, status_store.value, status_default.value) IN (1))
    AND (e.entity_id > 2156)
ORDER BY `e`.`entity_id` ASC
LIMIT 500;
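
For readers unfamiliar with the `e.entity_id > 2156 ... ORDER BY entity_id ASC LIMIT 500` pattern: each batch resumes after the last entity_id of the previous one, so the query runs once per 500 products and a slow query is paid many times over. Below is a minimal sketch of that batching idea only. It is not Magento's implementation; it uses an in-memory SQLite table as a stand-in for MariaDB, keeps just two columns, and the names `iter_batches` and `build_demo_db` are hypothetical.

```python
import sqlite3


def build_demo_db(n_products):
    """Create a simplified stand-in for catalog_product_entity (assumption: 2 columns only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog_product_entity (entity_id INTEGER PRIMARY KEY, sku TEXT)"
    )
    conn.executemany(
        "INSERT INTO catalog_product_entity VALUES (?, ?)",
        [(i, f"sku-{i}") for i in range(1, n_products + 1)],
    )
    return conn


def iter_batches(conn, batch_size=500):
    """Yield batches of rows, each starting after the last entity_id already seen."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT entity_id, sku FROM catalog_product_entity "
            "WHERE entity_id > ? ORDER BY entity_id ASC LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Resume point for the next batch, like the literal 2156 in the query above.
        last_id = rows[-1][0]


conn = build_demo_db(1200)
batch_sizes = [len(batch) for batch in iter_batches(conn, 500)]
print(batch_sizes)  # [500, 500, 200]
```

With 500k+ products and a batch size of 500, a loop like this issues over a thousand such queries, which is why the per-query cost matters so much here.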

Normally this query should take about a second to execute (a rough approximation). In my case, on a real project, it ran for several hours before timing out. (The screenshots show only 10 minutes, but even that is extremely long.) Screenshots: 2023-12-25_11-05, 2023-12-01_10-30

Further investigation showed that the problem is in catalog_product_entity_int: its indexes are missing columns that the query needs.

After adding indexes to that table, the query started completing quickly.

Release note

No response

Triage and priority

  • [X] Severity: S0 - Affects critical data or functionality and leaves users without workaround.
  • [ ] Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
  • [ ] Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
  • [ ] Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
  • [ ] Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.

rostilos avatar Dec 25 '23 19:12 rostilos

Hi @rostilos. Thank you for your report. To speed up processing of this issue, make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:


Join Magento Community Engineering Slack and ask your questions in #github channel. :warning: According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting. :clock10: You can find the schedule on the Magento Community Calendar page. :telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

m2-assistant[bot] avatar Dec 25 '23 19:12 m2-assistant[bot]

@magento I am working on this

rostilos avatar Dec 25 '23 19:12 rostilos

Temporary patch (CE only)

In the EE and B2B editions, replace entity_id with row_id.

Index: vendor/magento/module-catalog/etc/db_schema.xml
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/vendor/magento/module-catalog/etc/db_schema.xml b/vendor/magento/module-catalog/etc/db_schema.xml
--- a/vendor/magento/module-catalog/etc/db_schema.xml	
+++ b/vendor/magento/module-catalog/etc/db_schema.xml	(date 1703684640987)
@@ -132,8 +132,9 @@
             <column name="attribute_id"/>
             <column name="store_id"/>
         </constraint>
-        <index referenceId="CATALOG_PRODUCT_ENTITY_INT_ATTRIBUTE_ID" indexType="btree">
+        <index referenceId="CATALOG_PRODUCT_ENTITY_INT_ATTRIBUTE_ID_ENTITY_ID" indexType="btree">
             <column name="attribute_id"/>
+            <column name="entity_id"/>
         </index>
         <index referenceId="CATALOG_PRODUCT_ENTITY_INT_STORE_ID" indexType="btree">
             <column name="store_id"/>
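
To check which columns each declared index covers after a change like this, the schema file can be parsed and its `<index>` elements listed. The sketch below is only an illustration: the XML is a trimmed, hypothetical fragment modelled on the patched lines (not the full db_schema.xml), and `index_columns` and `indexes_starting_with` are made-up helper names. It relies on the general property that a B-tree index can serve lookups on its leading columns.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative fragment modelled on the patched lines above (assumption).
SCHEMA_FRAGMENT = """
<schema>
  <table name="catalog_product_entity_int">
    <index referenceId="CATALOG_PRODUCT_ENTITY_INT_ATTRIBUTE_ID_ENTITY_ID" indexType="btree">
      <column name="attribute_id"/>
      <column name="entity_id"/>
    </index>
    <index referenceId="CATALOG_PRODUCT_ENTITY_INT_STORE_ID" indexType="btree">
      <column name="store_id"/>
    </index>
  </table>
</schema>
""".strip()


def index_columns(schema_xml, table_name):
    """Map each index referenceId of `table_name` to its ordered column list."""
    root = ET.fromstring(schema_xml)
    result = {}
    for table in root.iter("table"):
        if table.get("name") != table_name:
            continue
        for index in table.findall("index"):
            result[index.get("referenceId")] = [
                column.get("name") for column in index.findall("column")
            ]
    return result


def indexes_starting_with(indexes, columns):
    """Return the names of indexes whose leading columns equal `columns`."""
    wanted = list(columns)
    return [name for name, cols in indexes.items() if cols[: len(wanted)] == wanted]


indexes = index_columns(SCHEMA_FRAGMENT, "catalog_product_entity_int")
print(indexes_starting_with(indexes, ["attribute_id", "entity_id"]))
```

On EE or B2B the same check would look for row_id in place of entity_id, per the note above.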

rostilos avatar Dec 27 '23 13:12 rostilos

Hi @engcom-Hotel. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

  • [ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
  • [ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue.
  • [ ] 3. Add Area: XXXXX label to the ticket, indicating the functional areas it may be related to.
  • [ ] 4. Verify that the issue is reproducible on 2.4-develop branch
    Details:
    - Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
  • [ ] 5. Add label Issue: Confirmed once verification is complete.
  • [ ] 6. Make sure that automatic system confirms that report has been added to the backlog.

m2-assistant[bot] avatar Jan 15 '24 06:01 m2-assistant[bot]

Hello @rostilos,

Thanks for the report and collaboration!

We were not able to reproduce the scenario from the main description. However, after going through the issue description and looking into the codebase, the scenario may have happened because of a missing index on the entity_id column of the catalog_product_entity_int table.

Hence confirming this issue for further processing.

Thanks

engcom-Hotel avatar Jan 15 '24 11:01 engcom-Hotel

:white_check_mark: Jira issue https://jira.corp.adobe.com/browse/AC-10844 is successfully created for this GitHub issue.

github-jira-sync-bot avatar Jan 15 '24 11:01 github-jira-sync-bot

:white_check_mark: Confirmed by @engcom-Hotel. Thank you for verifying the issue.
Issue Available: @engcom-Hotel, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

m2-assistant[bot] avatar Jan 15 '24 11:01 m2-assistant[bot]