solidus icon indicating copy to clipboard operation
solidus copied to clipboard

[RFC] Just touch taxons in Product#touch_taxons

Open tvdeyen opened this issue 3 years ago • 6 comments

Description

And make use of the after_touch callback of Spree::Taxon, that already handles touching ancestors and its taxonomy.

This hopefully fixes #3931 as well, because it runs each touch callback in it's own transaction instead of one large transaction that might cause dead locks.

This IS slower than the optimized approach before, but this should not be that much of an issue, since we are updating a product here and this will a) not happen that much and b) there will not be many taxons one single product will be attached to, right? Even in large stores this will most likely be in single to double digits, instead of hundreds or thousands. But I might be painfully wrong here.

This also has the advantage that a after_touch callback a store or an extension might have will be triggered and caches will be invalidated correctly.

~~DISCLAIMER: I did not added any tests on purpose, because I want to discuss if this is an approach we want to try first.~~

~~Actually I added a spec that creates a taxon tree of 1000 taxons nested 10 random levels deep. The product that then will be updated is attached to 50 random taxons. We expect no errors.~~

Removed that spec again

Checklist:

  • [x] I have followed Pull Request guidelines
  • [x] I have added a detailed description into each commit message
  • [x] I have added tests to cover this change

tvdeyen avatar Jan 07 '22 19:01 tvdeyen

@tvdeyen Thanks for taking care of this!

I'm seeing the test suite for this PR is taking ~1 hour compared to the ~10 minutes of the other builds. Even if the single spec added takes ~5 seconds, I'm concerned that this change adds a significant amount of time by getting performance worse on every single test that interacts with products and taxons. If that's the case, I don't think we should merge this as is.

On a side note, did you consider providing this new behavior optional with a preference? Maybe we could have the past approach as default, with optimization in mind and the new one, more reliable in terms of deadlock as optional.

I know of stores that have products constantly updated by external tools (for content or inventory management sync), in these cases we will trigger a lot of these touches and I'm afraid that this version would significantly slow down their operations (assuming the current version is barely acceptable for them, which I'm not sure).

kennyadsl avatar Jan 10 '22 08:01 kennyadsl

I'm seeing the test suite for this PR is taking ~1 hour compared to the ~10 minutes of the other builds. Even if the single spec added takes ~5 seconds, I'm concerned that this change adds a significant amount of time by getting performance worse on every single test that interacts with products and taxons. If that's the case, I don't think we should merge this as is.

@kennyadsl The spec that I added took 5647.54 (1.5h) to run under postgresql on CircleCI. In mysql it finished in 587.86s (9 minutes). From https://app.circleci.com/pipelines/github/solidusio/solidus/2926/workflows/c1e0911a-07a7-4e2b-b4b8-23af32c3e60f/jobs/27313

Top 10 slowest example groups:
  Spree::Product
    5950.8 seconds average (5950.8 seconds / 1 example) ./spec/models/spree/product_touch_taxons_spec.rb:3

Most of the time is spent creating 1000 taxons, nested 10 levels deep. The actual spec of updating 50 taxons (and their ancestors) runs in 8ms on my machine.

Disclaimer: I never intended to keep the spec. I just wanted to verify it does not cause any deadlocks even with unrealistically high numbers of taxons. Locally this spec runs with postgres in under 10 seconds. I am not sure why this takes so long on CircleCI.

On a side note, did you consider providing this new behavior optional with a preference? Maybe we could have the past approach as default, with optimization in mind and the new one, more reliable in terms of deadlock as optional.

I thought about this, yes. But is this actually necessary? See the note below.

I know of stores that have products constantly updated by external tools (for content or inventory management sync), in these cases we will trigger a lot of these touches and I'm afraid that this version would significantly slow down their operations (assuming the current version is barely acceptable for them, which I'm not sure).

My test shows that this should not have an impact on these stores, because even though we touch a lot of products, how many taxons are these products attached to? 1, 2, 10? My test assigns a product to 50 taxons (very unlikely for real life stores IMO) and it takes ~8ms to touch these 50 taxons (and their 10 ancestors).

0.000748   0.000021   0.000769 (  0.000764)

These numbers are from an 8 core Apple M1 Pro with 32GB of RAM. But even if they would be 10 x slower it still would only take 80ms to update 50 taxons. Most products I know of are assigned to 1-5 taxons.

If we are still concerned (and honestly the CircleCI times concern me), we could add the new behavior as an opt-in. For my use-case this would be fine. But shouldn't we ship "reliable" as the default option then? And fast/unreliable as opt-in?

This looks good, but I'd still like to move the touch_taxons call to an after_commit. I don't believe it must be in a transaction.

@jarednorman I agree, but does a call to product.touch also triggers an after_commit?

tvdeyen avatar Jan 13 '22 22:01 tvdeyen

I would assume it does, but we can check.

jarednorman avatar Jan 17 '22 19:01 jarednorman

I've confirmed that after_commit callbacks run on touch.

jarednorman avatar Jan 24 '22 04:01 jarednorman

Hey @tvdeyen! I think we can try to move this forward now. What about rebasing to try to have the test suite green? After we verified that there are no deadlocks, we can remove the slow "benchmark" specs and merge the PR. Sounds good?

kennyadsl avatar Aug 23 '22 08:08 kennyadsl