Very slow performance for \Magento\Catalog\Api\CategoryLinkRepositoryInterface::save

Open ioweb-gr opened this issue 1 year ago • 8 comments

Preconditions and environment

  • 2.4.1 to 2.4.6

Steps to reproduce

  • Create a large profile with about 700 categories, 6 websites, 14 store views and 200,000 products
  • Create a message queue system that will assign / remove products from the categories
  • Populate the message queue with around 1k messages
  • Set the consumer's maximum number of messages to something normal, like 100
  • Start your consumer and let it run with a logger that records a timestamp for each message

You will get a result like this:

2023-07-08 16:56:48 - AssignSaleCategoryHandler: add GW2990 1119
2023-07-08 16:56:59 - AssignSaleCategoryHandler: add GW2991 1119
2023-07-08 16:57:10 - AssignSaleCategoryHandler: add S20678-16 1119
2023-07-08 16:59:25 - AssignSaleCategoryHandler: add 3024725-003 1119
2023-07-08 16:59:34 - AssignSaleCategoryHandler: add 1WT21015-200 1119
2023-07-08 16:59:41 - AssignSaleCategoryHandler: add 1WT21015-001 1119
2023-07-08 16:59:47 - AssignSaleCategoryHandler: add 1KW21010-365 1119
2023-07-08 17:00:08 - AssignSaleCategoryHandler: add 1WT21018-200 1119
2023-07-08 17:00:15 - AssignSaleCategoryHandler: add 1WT21018-001 1119
2023-07-08 17:00:23 - AssignSaleCategoryHandler: add 1WT21018-100 1119
2023-07-08 17:00:30 - AssignSaleCategoryHandler: add 1AF21022-336 1119
2023-07-08 17:02:41 - AssignSaleCategoryHandler: add 1AF21031-100 1119
2023-07-08 17:04:50 - AssignSaleCategoryHandler: add 1AF21034-330 1119
2023-07-08 17:07:00 - AssignSaleCategoryHandler: add S20729-16 1119
2023-07-08 17:09:10 - AssignSaleCategoryHandler: add 2400005-15011 1119
2023-07-08 17:09:18 - AssignSaleCategoryHandler: add 2400005-16011 1119
2023-07-08 17:09:25 - AssignSaleCategoryHandler: add 2400005-19010 1119
2023-07-08 17:09:32 - AssignSaleCategoryHandler: add N2400002-13013 1119
2023-07-08 17:09:38 - AssignSaleCategoryHandler: add N2400002-16011 1119
2023-07-08 17:09:45 - AssignSaleCategoryHandler: add N2400002-19010 1119
2023-07-08 17:09:52 - AssignSaleCategoryHandler: add DD1579-101 1119
2023-07-08 17:10:05 - AssignSaleCategoryHandler: add 3024877-003 1119
2023-07-08 17:10:16 - AssignSaleCategoryHandler: add CZ5478-100 1119
2023-07-08 17:10:21 - AssignSaleCategoryHandler: add S20689-16 1119
2023-07-08 17:13:12 - AssignSaleCategoryHandler: add FFM0060-60002 1119

As you can see, it processed 25 products in a little over 16 minutes.
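To put numbers on the log above, the gaps between consecutive entries can be computed from the timestamps. The following is a minimal sketch in plain PHP (no Magento dependencies); the function name `secondsBetweenEntries` is made up for illustration, and it assumes every relevant line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the excerpt.

```php
<?php
declare(strict_types=1);

/**
 * Returns the gaps, in seconds, between consecutive log lines.
 * Lines that do not start with "YYYY-MM-DD HH:MM:SS" are skipped.
 *
 * @param string[] $lines
 * @return int[]
 */
function secondsBetweenEntries(array $lines): array
{
    $utc = new DateTimeZone('UTC');
    $gaps = [];
    $previous = null;
    foreach ($lines as $line) {
        $current = DateTimeImmutable::createFromFormat('Y-m-d H:i:s', substr($line, 0, 19), $utc);
        if ($current === false) {
            continue; // no leading timestamp on this line
        }
        if ($previous !== null) {
            $gaps[] = $current->getTimestamp() - $previous->getTimestamp();
        }
        $previous = $current;
    }
    return $gaps;
}

// The first four lines of the log above.
$sample = [
    '2023-07-08 16:56:48 - AssignSaleCategoryHandler: add GW2990 1119',
    '2023-07-08 16:56:59 - AssignSaleCategoryHandler: add GW2991 1119',
    '2023-07-08 16:57:10 - AssignSaleCategoryHandler: add S20678-16 1119',
    '2023-07-08 16:59:25 - AssignSaleCategoryHandler: add 3024725-003 1119',
];

$gaps = secondsBetweenEntries($sample);
printf("gaps: %s, slowest: %d s\n", implode(', ', $gaps), max($gaps));
```

For these four lines the gaps are 11, 11 and 135 seconds. Over the whole excerpt (25 entries from 16:56:48 to 17:13:12) the total is 984 seconds, or 41 seconds per message on average.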

The main issue here is this function:

    /**
     * @inheritdoc
     */
    public function save(\Magento\Catalog\Api\Data\CategoryProductLinkInterface $productLink)
    {
        $category = $this->categoryRepository->get($productLink->getCategoryId());
        $product = $this->productRepository->get($productLink->getSku());
        $productPositions = $category->getProductsPosition();
        $productPositions[$product->getId()] = $productLink->getPosition();
        $category->setPostedProducts($productPositions);
        try {
            $category->save();
        } catch (\Exception $e) {
            throw new CouldNotSaveException(
                __(
                    'Error: "%1"',
                    $e->getMessage()
                ),
                $e
            );
        }
        return true;
    }

It re-saves the whole category every time you assign a product to it, including all the product positions.

image image

In my case, as you can see, this triggers the observers that rewrite the URLs again, and that is what causes the delay.

Obviously, this speed is unacceptable and renders the code unusable.

In our use case, we have to manually handle more than 50k messages per day: whenever a product's price changes, we need to refresh specific categories by adding the product to them or removing it from them.

Moreover, by the time these messages are processed, more will have been added, so the queue never drains.
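A rough back-of-the-envelope calculation shows why the queue cannot catch up. The sketch below uses only figures from this report (25 log entries spanning 984 seconds, and 50k messages per day) and assumes a single consumer whose rate matches the log excerpt; real throughput will vary.

```php
<?php
declare(strict_types=1);

// Log excerpt: 25 messages, first at 16:56:48, last at 17:13:12,
// i.e. 984 seconds spread over 24 intervals.
$secondsPerMessage = intdiv(984, 24);

// Assumption: one consumer running around the clock at that rate.
$processedPerDay = intdiv(86400, $secondsPerMessage);

// "more than 50k messages / day" from the report, taken as exactly 50,000.
$arrivingPerDay = 50000;
$backlogGrowthPerDay = $arrivingPerDay - $processedPerDay;

printf(
    "%d s/message, %d processed/day, backlog grows by %d/day\n",
    $secondsPerMessage,
    $processedPerDay,
    $backlogGrowthPerDay
);
```

Under these assumptions the consumer handles roughly 2,100 messages per day, so the backlog grows by almost 48,000 messages every day.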

Sample message handler

<?php
/**
 * Copyright (c) 2023. IOWEB TECHNOLOGIES
 */

namespace Ioweb\SaleCategories\Model\Queue\Handler;

use Ioweb\SaleCategories\Api\Data\AssignSaleCategoryMessageInterface;
use Ioweb\SaleCategories\Service\Logger;
use Magento\Catalog\Api\CategoryLinkManagementInterface;
use Magento\Catalog\Api\CategoryLinkRepositoryInterface;

class AssignSaleCategoryHandler
{
    private CategoryLinkManagementInterface $categoryLinkManagement;
    private CategoryLinkRepositoryInterface $categoryLinkRepository;
    private Logger $logger;

    public function __construct(
        CategoryLinkManagementInterface $categoryLinkManagement,
        CategoryLinkRepositoryInterface $categoryLinkRepository,
        Logger $logger
    )
    {
        $this->categoryLinkManagement = $categoryLinkManagement;
        $this->categoryLinkRepository = $categoryLinkRepository;
        $this->logger = $logger;
    }

    /**
     * @param AssignSaleCategoryMessageInterface $message
     * @return string
     */
    public function execute($message)
    {
        $this->logger->info('AssignSaleCategoryHandler: ' . $message->getAction() . ' ' . $message->getSku() . ' ' . $message->getCategoryId());
        switch($message->getAction()){
            case AssignSaleCategoryMessageInterface::ACTION_ADD:
                $this->categoryLinkManagement->assignProductToCategories(
                    $message->getSku(),
                    [$message->getCategoryId()]
                );
                break;
            case AssignSaleCategoryMessageInterface::ACTION_REMOVE:
                $this->categoryLinkRepository->deleteByIds(
                    $message->getCategoryId(),
                    $message->getSku()
                );
                break;
        }
        return 'complete';
    }
}
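Every message in the log above targets the same category (1119), yet each one triggers its own category save. One possible mitigation, sketched here in plain PHP, is to collapse a batch of messages into one set of changes per category before touching Magento at all. This is only an illustration: the array shape stands in for `AssignSaleCategoryMessageInterface`, `coalesceByCategory` is a made-up name, and whether the consumer setup allows batching like this is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Collapses queued messages into one set of changes per category.
 * The last action seen for a given SKU within a category wins.
 *
 * Note: PHP turns purely numeric SKUs into integer array keys,
 * so cast keys back with (string) when reading the result.
 *
 * @param array<int, array{action: string, sku: string, categoryId: int}> $messages
 * @return array<int, array<int|string, string>> categoryId => [sku => action]
 */
function coalesceByCategory(array $messages): array
{
    $changes = [];
    foreach ($messages as $message) {
        $changes[$message['categoryId']][$message['sku']] = $message['action'];
    }
    return $changes;
}

// Hypothetical batch; SKUs and category ID are taken from the log above.
$batch = [
    ['action' => 'add', 'sku' => 'GW2990', 'categoryId' => 1119],
    ['action' => 'add', 'sku' => 'GW2991', 'categoryId' => 1119],
    ['action' => 'remove', 'sku' => 'GW2990', 'categoryId' => 1119],
];

$changes = coalesceByCategory($batch);
printf("%d messages -> %d category save(s)\n", count($batch), count($changes));
```

Applying the result would still mean one category save per category, which runs the same observers, but once per batch rather than once per product.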

Expected result

Assignment is really fast.

Actual result

Assignment is unexpectedly really slow.

Additional information

I opened this issue back in 2021 and it was dismissed and closed because I wasn't using message queues. Well, here we are in 2023 with message queues, and the underlying problem is still there. Is there any proposed alternative?

Release note

No response

Triage and priority

  • [X] Severity: S0 - Affects critical data or functionality and leaves users without workaround.
  • [ ] Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
  • [ ] Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
  • [ ] Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
  • [ ] Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.

ioweb-gr avatar Jul 08 '23 17:07 ioweb-gr

I would like to add that the whole problem comes down to the way URL rewrites are handled in the system.

This system is proving too resource intensive for maintaining the rewrites, and I think it should be rewritten from scratch.

In particular, the setting that appends the product URL to the category URL path seems to be the main culprit.

Since the path to a category is pretty much static for all products in the category, I don't understand why the database table needs to maintain a reference to every product path including the category path, when you could just fetch the path for the category and render it. Internally the system translates the path to IDs anyway, so there should be a way to resolve a presented path back to those IDs without storing the actual path in the database every single time a product or category is saved. Even WordPress does that, and in this respect it performs better with 150k products and 700 categories.
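To make the idea concrete, here is a minimal sketch in plain PHP (8.0 or later) of composing and resolving product URLs on demand from one stored path per category and one stored key per product. It is not how Magento works; `ComposedUrlResolver`, the sample paths and the IDs are all hypothetical, and it ignores store views, redirects and duplicate keys.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical resolver: keeps one path per category and one key per
 * product, and combines them on demand instead of storing every pair.
 */
final class ComposedUrlResolver
{
    /**
     * @param array<int, string> $categoryPaths categoryId => URL path
     * @param array<int, string> $productKeys   productId => URL key
     */
    public function __construct(
        private array $categoryPaths,
        private array $productKeys,
        private string $suffix = '.html'
    ) {
    }

    /** Builds the public URL path for a product inside a category. */
    public function build(int $categoryId, int $productId): string
    {
        return $this->categoryPaths[$categoryId] . '/' . $this->productKeys[$productId] . $this->suffix;
    }

    /**
     * Maps a request path back to [categoryId, productId], or null if unknown.
     *
     * @return array{0: int, 1: int}|null
     */
    public function resolve(string $requestPath): ?array
    {
        if (!str_ends_with($requestPath, $this->suffix)) {
            return null;
        }
        $path = substr($requestPath, 0, strlen($requestPath) - strlen($this->suffix));
        $slash = strrpos($path, '/');
        if ($slash === false) {
            return null;
        }
        // The part before the last slash is the category path, the rest the product key.
        $categoryId = array_search(substr($path, 0, $slash), $this->categoryPaths, true);
        $productId = array_search(substr($path, $slash + 1), $this->productKeys, true);
        if ($categoryId === false || $productId === false) {
            return null;
        }
        return [$categoryId, $productId];
    }
}

// Hypothetical data: one category path and one product key.
$resolver = new ComposedUrlResolver([1119 => 'sale/shoes'], [42 => 'gw2990']);
echo $resolver->build(1119, 42), "\n";
```

With this layout, assigning a product to a category changes no stored URL at all; only renaming a category or a product touches one stored value.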

ioweb-gr avatar Jul 09 '23 11:07 ioweb-gr

You might take a look at my PR https://github.com/magento/magento2/pull/34226. I suggested replacing the repository with a simple insert on duplicate, which might fix some of the performance problems.

ilnytskyi avatar Aug 10 '23 11:08 ilnytskyi

I don't see how it relates. As long as you trigger the category save, whether via the repository or some other way, it will still invoke the same observers and plugins and follow the same path of rebuilding all the URL rewrites for the products in the category, including all the position recalculations.

It doesn't seem to solve the performance issue mentioned here. Could you shed some light on what it affects in relation to this issue?

ioweb-gr avatar Aug 10 '23 12:08 ioweb-gr

Is there a workaround for this issue? We have a store with over 500,000 products and 10,000 categories. We recently revamped the whole category tree and we have to move products around but each single update takes ages so it’ll take months to apply all the changes.

bfontaine avatar Nov 21 '23 10:11 bfontaine

We were in the same predicament, @bfontaine, but no one from Adobe has taken an interest in this :(

You can try creating and saving the links manually, writing them directly into the database table instead of going through the repository. It will break some URLs, but a tool for regenerating URL rewrites can help you fix that.
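As a rough illustration of that workaround, the sketch below builds a single multi-row MySQL upsert for the link table in plain PHP. It rests on assumptions: the table name `catalog_category_product` and its `category_id`, `product_id` and `position` columns follow the default Magento schema as far as I know, an installation may add a table prefix, and `buildLinkUpsert` and the product IDs are made up. Writing to the table this way skips the URL rewrite observers and the indexers, so rewrites and indexes have to be regenerated afterwards.

```php
<?php
declare(strict_types=1);

/**
 * Builds one multi-row MySQL upsert for category/product links.
 *
 * @param array<int, array{categoryId: int, productId: int, position: int}> $links
 * @param string $table assumed default table name; add the prefix if one is configured
 * @return array{0: string, 1: int[]} SQL with placeholders, and the values to bind
 */
function buildLinkUpsert(array $links, string $table = 'catalog_category_product'): array
{
    if ($links === []) {
        throw new InvalidArgumentException('At least one link is required.');
    }
    $rows = [];
    $params = [];
    foreach ($links as $link) {
        $rows[] = '(?, ?, ?)';
        array_push($params, $link['categoryId'], $link['productId'], $link['position']);
    }
    $sql = sprintf(
        'INSERT INTO `%s` (`category_id`, `product_id`, `position`) VALUES %s'
        . ' ON DUPLICATE KEY UPDATE `position` = VALUES(`position`)',
        $table,
        implode(', ', $rows)
    );
    return [$sql, $params];
}

// Hypothetical product IDs; the category ID is the one from the log.
[$sql, $params] = buildLinkUpsert([
    ['categoryId' => 1119, 'productId' => 101, 'position' => 0],
    ['categoryId' => 1119, 'productId' => 102, 'position' => 0],
]);
echo $sql, "\n";

// Executing the statement is left to the caller, for example with PDO:
// $pdo->prepare($sql)->execute($params);
```

The upsert relies on the table having a unique key over the category and product columns, so re-running it for an existing link only updates the position.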

ioweb-gr avatar Nov 21 '23 11:11 ioweb-gr

Hi @engcom-Hotel. Thank you for working on this issue. To make sure that the issue has enough information and is ready for development, please read and check the following instructions:

  • [ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
  • [ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue.
  • [ ] 3. Add Area: XXXXX label to the ticket, indicating the functional areas it may be related to.
  • [ ] 4. Verify that the issue is reproducible on 2.4-develop branch
    - Add the comment @magento give me 2.4-develop instance to deploy a test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
  • [ ] 5. Add label Issue: Confirmed once verification is complete.
  • [ ] 6. Make sure that automatic system confirms that report has been added to the backlog.

m2-assistant[bot] avatar Feb 19 '24 05:02 m2-assistant[bot]

@engcom-Hotel do you think you could take another look at this to see if we can find a workaround for this with the core developers? Surely there must be a way to fix this slowness :(

ioweb-gr avatar Feb 29 '24 20:02 ioweb-gr

The only solution I found for this very annoying issue was to temporarily set all indices to "Update on schedule". After I'm done re-assigning products to categories, I re-index all indices and then set their update rule to whatever it was before.

mobweb avatar May 17 '24 05:05 mobweb

We're permanently in schedule mode. It doesn't solve the slowness, because URL rewrites and positions are still rebuilt every single time, leading to huge delays and timeouts.

ioweb-gr avatar May 17 '24 06:05 ioweb-gr

Hello @ioweb-gr,

Thanks for the report and collaboration!

We have tried to reproduce the issue on the latest 2.4-develop branch, but it is not reproducible for us. Our instance has almost 4 lakh (400,000) records with 700+ categories. Please refer to the screenshots below:

Product Listing image

Category Listing image

And below is the screenshot of the consumer processing the messages:

image

For us, the update happens in milliseconds. Please refer to the module below:

https://github.com/engcom-Hotel/magento2/tree/issue37739/app/code/Magz/Categoryproductassign

Let us know if we missed anything.

Thanks

engcom-Hotel avatar Jul 29 '24 11:07 engcom-Hotel

We plan to upgrade the website at the end of August, so we can try it on the latest version as well. Maybe the issue has been fixed in the latest version, but I can't confirm that just yet. I'll get back to you once I can retry it.

ioweb-gr avatar Aug 03 '24 07:08 ioweb-gr