magento-lts icon indicating copy to clipboard operation
magento-lts copied to clipboard

Improve performance of product category rewrites (disabled product & categories and product_use_categories)

Open MarcinNowakMacopedia opened this issue 4 years ago • 14 comments

Description (*)

  1. I fixed issues when it was set not to use „Categories Path for Product URLs” (product_use_categories) in configuration (Catalog → Catalog → Search Engine Optimizations) and yet they can be found in URL Rewrite Management (commit bfe932992f8498a1edbb9221a989f7da98f5c06e)
  2. I added field in configuration to set whether or not to generate rewrites for disabled categories and products (commit 9163a4d468ab93f42024c3677c251f1e5825a97a)

Related Pull Requests

Fixed Issues (if relevant)

  1. Fixes OpenMage/magento-lts#931
  2. Fixes OpenMage/magento-lts#932

Manual testing scenarios (*)

Screenplay:

  1. Present situation a.) Set field product_use_categories in configuration to „yes” b.) Set field create_url_for_disabled in configuration to „yes” c.) In Index Management reindex „Catalog URL Rewrites” d.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite there are rows with set columns category_id and product_id. e.) In URL Rewrite Management check if there are rewrites for disabled products and categories.

  2. Products rewrites with categories path, but not for disabled products and categories. a.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite are row with set fields category_id and product_id. b.) In URL Rewrite Management check if there are rewrites for disabled products and categories. c.) Set field create_url_for_disabled in configuration to „no” d.) In Index Management reindex „Catalog URL Rewrites” e.) Check if rewrites checked in a. still exist. f.) Check if rewrites checked in b. were removed.

  3. Products rewrites with categories path, but not for disabled categories. a.) Choose active category. b.) In URL Rewrite Management check if there are rewrites for products with this category path, or in DB check if in table core_url_rewrite there are rows with set columns product_id and this category id in category_id column. c.) In URL Rewrite Management check if there are rewrite for this category. d.) Set field create_url_for_disabled in configuration to „no”. e.) Disable chosen category. f.) Check if rewrites checked in b. and c. were removed.

  4. Product rewrites with categories path, but not for disabled products. a.) Choose active product. b.) In URL Rewrite Management check if there are rewrites for this product. c.) Set field create_url_for_disabled in configuration to „no”. d.) Disable chosen product. e.) Check if rewrites checked in b. were removed.

  5. Product rewrites without categories path and for disabled products and categories. a.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite there are rows with set columns category_id and product_id. b.) In URL Rewrite Management check if there are rewrites for disabled products and categories. c.) Set field product_use_categories in configuration to „no” d.) In Index Management reindex „Catalog URL Rewrites” e.) Check if rewrites checked in a. were removed. f.) Check if rewrites checked in b. still exist.

  6. Product rewrites without categories path and not for disabled products and categories. a.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite are rows with set columns category_id and product_id. b.) In URL Rewrite Management check if there are rewrites for disabled products and categories. c.) Set field product_use_categories in configuration to „no” d.) Set field create_url_for_disabled in configuration to „no”. e.) In Index Management reindex „Catalog URL Rewrites” f.) Check if rewrites checked in a. and b. were removed.

Questions or comments

The time it takes for a full reindex on shop with 16802 products and 542 categories:

1. At present (before my changes):

First execute (when empty table core_url_rewrite): 1 – 56s 2 – 54s 3 – 55s 4 – 57s 5 – 55s

Another execute: 1 – 1m 4s 2 – 1m 1s 3 – 1m 3s 4 – 1m 3s 5 – 1m 1s

2. After changes

Configuration:

a.) product_use_categories = true and create_url_for_disabled = true (1. screenplay)

65614 - rewrites

First execute: 1 – 59s 2 – 56s 3 – 56s 4 – 57s 5 – 57s

Another execute: 1 – 1m 6s 2 – 1m 8s 3 – 1m 5s 4 – 1m 5s 5 – 1m 4s

b.) product_use_categories = true and create_url_for_disabled = false (2. screenplay)

47360 – rewrites

First execute: 1 – 50s 2 – 42s 3 – 40s 4 – 45s 5 – 43s

Another execute: 1 – 48s 2 – 50s 3 – 47s 4 – 51s 5 – 52s

From configuration a to b 1 – 47s 2 – 46s 3 – 48s 4 – 46s 5 – 46s

c.) product_use_categories = false and create_url_for_disabled = true (5. screenplay)

17342 - rewrites

First execute: 1 – 15s 2 – 15s 3 – 15s 4 – 15s 5 – 15s

Another execute: 1 – 17s 2 – 17s 3 – 17s 4 – 17s 5 – 17s

From configuration a. to c. 1 – 22s 2 – 21s 3 – 22s 4 – 21s 5 – 21s

d.) product_use_categories = false and create_url_for_disabled = false (6. screenplay)

14715 - rewrites

First execute: 1 – 13s 2 – 13s 3 – 13s 4 – 13s 5 – 13s

Another execute: 1 – 14s 2 – 14s 3 – 14s 4 – 14s 5 – 14s

From configuration a. to d. 1 – 18s 2 – 17s 3 – 17s 4 – 17s 5 – 17s

DB select to find rewrites to remove (app/code/core/Mage/Catalog/Model/Resource/Url.php::clearRewrites):

1. Configuration a.

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
LEFT JOIN `catalog_category_product` AS `tcp` ON tur.category_id = tcp.category_id AND tur.product_id = tcp.product_id
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
**// product rewrites with categories path and not connected with this category**
AND (tur.category_id IS NOT NULL) AND (tur.product_id IS NOT NULL) AND (tcp.category_id IS NULL) 
GROUP BY `url_rewrite_id`

2. Configuration b.

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
LEFT JOIN `catalog_product_entity_int` AS `cpei` ON cpei.entity_id = tur.product_id AND cpei.attribute_id = :product_status_attribute_id
LEFT JOIN `catalog_category_entity_int` AS `ccei1` ON ccei1.attribute_id = :is_active_category_attribute_id AND ccei1.store_id = 0 AND ccei1.entity_id = tur.category_id
LEFT JOIN `catalog_category_entity_int` AS `ccei2` ON ccei2.attribute_id = :is_active_category_attribute_id AND ccei2.store_id = :store_id AND ccei2.entity_id = tur.category_id
LEFT JOIN `catalog_category_product` AS `tcp` ON tcp.category_id = tur.category_id AND tcp.product_id = tur.product_id
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
AND (
	(
	**// rewrites for disabled products**
	cpei.value = 2
	**// rewrites for categories disabled in this store or in general store**
	 OR (IF(ccei2.value_id > 0, ccei2.value, ccei1.value) = 0) 
	**// product rewrites with categories path and not connected with this category**
	 OR (tur.category_id IS NOT NULL AND tur.product_id IS NOT NULL AND tcp.category_id IS NULL)
	)
)
GROUP BY `url_rewrite_id`

3. Configuration c. db select to remove rewrites:

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
**// product rewrites with categories path** 
AND (tur.category_id IS NOT NULL) AND (tur.product_id IS NOT NULL) 
GROUP BY `url_rewrite_id`

4. Configuration d.

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
LEFT JOIN `catalog_product_entity_int` AS `cpei` ON cpei.entity_id = tur.product_id AND cpei.attribute_id = :product_status_attribute_id
LEFT JOIN `catalog_category_entity_int` AS `ccei1` ON ccei1.attribute_id = :is_active_category_attribute_id AND ccei1.store_id = 0 AND ccei1.entity_id = tur.category_id
LEFT JOIN `catalog_category_entity_int` AS `ccei2` ON ccei2.attribute_id = :is_active_category_attribute_id AND ccei2.store_id = :store_id AND ccei2.entity_id = tur.category_id 
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
AND (
	(
	**// rewrites for disabled products**
	cpei.value = 2 
	**// rewrites for categories disabled in this store or in general store**
	OR (IF(ccei2.value_id > 0, ccei2.value, ccei1.value) = 0)
	**// product rewrites with categories path** 
	OR (tur.category_id IS NOT NULL AND tur.product_id IS NOT NULL) 
	)
) 
GROUP BY `url_rewrite_id`

Contribution checklist (*)

  • [ ] Pull request has a meaningful description of its purpose
  • [ ] All commits are accompanied by meaningful commit messages
  • [ ] All automated tests passed successfully (all builds are green)

MarcinNowakMacopedia avatar Jan 19 '21 15:01 MarcinNowakMacopedia

Excellent work @MarcinNowakMacopedia, thanks for contributing!

However, that is a lot of complex code changes to review with not many notes as to what bugs it fixes, what behaviors possibly changed, etc..

I did see the commit messages but a detailed description of the bugs, how to reproduce them, why your PR fixes the problems, what testing you've performed to ensure there are no regressions and some of the raw SQL before and after would really help us to review this and get it accepted much more readily.

colinmollenhour avatar Jan 19 '21 23:01 colinmollenhour

@colinmollenhour should be better now :)

tmotyl avatar Jan 22 '21 10:01 tmotyl

We've been testing this PR in our development environment and the results are quite promising. We didn't see any issues arising from it and plan on deploying this into production soon. Thank you!

The most significant gains come from fixing the bug related to the catalog/seo/product_use_categories option. If people are using canonical urls for products then it's not really doing anything in terms of SEO (please correct me if I'm wrong). It's nice to have the product's URL matching the hierarchy during navigation but it's a trade-off we can accept to improve performance.

We have 99 store views with ~900 categories and ~20000 products. A single store view used to take us as much as 120 seconds to refresh and with this patch, ignoring disabled products/categories as well as disabling category-product urls, it can take between 15 to 25 seconds per store. This means we can reindex our entire store in 30 to 40 minutes.

rvelhote avatar Apr 19 '21 18:04 rvelhote

It took quite some time but we finally started using this patch recently. We didn't make a full reindex yet but everything is working as it should for newer and updated products and categories.

In general I would propose this PR to be merged into the mainline OpenMage LTS.

The only change I would recommend, thinking of a wider audience, is to display a message, when switching product_use_categories from No to Yes, that reindexing URL Rewrites is recommended. I was even thinking that the original intention of still generating URLs was for a quicker transition and not actually an oversight (arguable but possible).

rvelhote avatar Oct 11 '21 11:10 rvelhote

Reporting back with an issue we found using this PR. When the option to remove URLs of disabled products and categories is enabled, if you have a bunch of products that were disabled and enable them via a mass action the URLs will not be generated.

rvelhote avatar Oct 15 '21 06:10 rvelhote

The only change I would recommend, thinking of a wider audience, is to display a message, when switching product_use_categories from No to Yes, that reindexing URL Rewrites is recommended. I was even thinking that the original intention of still generating URLs was for a quicker transition and not actually an oversight (arguable but possible).

Added

MarcinNowakMacopedia avatar Apr 27 '22 19:04 MarcinNowakMacopedia

Reporting back with an issue we found using this PR. When the option to remove URLs of disabled products and categories is enabled, if you have a bunch of products that were disabled and enable them via a mass action the URLs will not be generated.

Mage_Catalog_Model_Indexer_Url didn't support mass action events. I have added this. Please check if it works now.

Sorry for my late answer. I hope it is still helpful.

MarcinNowakMacopedia avatar Apr 27 '22 19:04 MarcinNowakMacopedia

rebased

tmotyl avatar Jun 06 '22 10:06 tmotyl

fixed issues reported by phpstan

tmotyl avatar Jun 21 '22 06:06 tmotyl

Anyone want to test this on their staging environments and report back? I currently don't have a production-size instance to test with.

colinmollenhour avatar Aug 03 '22 16:08 colinmollenhour

@colinmollenhour We've been using this PR for almost a year minus the changes that happened after my last message in October 2021. Even without the most recent changes that solve a couple of problems I reported, we have not encountered any breaking issues in our shop.

Currently, we only disabled Use Categories Path for Product URLs and kept generating the rewrites for disabled products/categories (due to the issue I previously reported with mass actions). I will create a task for my team to track the latest changes and report back ASAP.

We have 101 store views, ~32000 products and ~2500 categories (enabled and disabled). This has been an excellent patch for us.

rvelhote avatar Aug 11 '22 08:08 rvelhote

@colinmollenhour I added Mage_Adminhtml_Model_System_Config_Backend_Seo to change the index status of categories and products after changing create_url_for_disabled in configuration. Additionally, I modified Mage_Adminhtml_Model_System_Config_Backend_Seo_Product to change product indexer status, after changing product_use_categories. https://github.com/OpenMage/magento-lts/pull/1405/commits/e33773c8c1e295271457c66bb14cf682b53f4d5b

MarcinNowakMacopedia avatar Aug 19 '22 11:08 MarcinNowakMacopedia

@MarcinNowakMacopedia @tmotyl could you please check the conflicts?

fballiano avatar Aug 28 '22 13:08 fballiano

@fballiano I resolved conflicts.

MarcinNowakMacopedia avatar Aug 29 '22 12:08 MarcinNowakMacopedia

@colinmollenhour thanks. lgtm but waiting for more tests.

sreichel avatar Jan 19 '23 05:01 sreichel

@MarcinNowakMacopedia Are you able to look at my PR https://github.com/macopedia/magento-lts/pull/8? I am not able to push directly to your branch.

colinmollenhour avatar Jan 19 '23 17:01 colinmollenhour

I'll close and reopen this PR since it's not "editable" by admins we can't fix anything in order to finish it.

fballiano avatar May 16 '23 15:05 fballiano