
Fix performance issue with deep nesting

Open Davier opened this issue 4 years ago • 1 comments

The performance issue in #71 is caused by the layout cache being constantly invalidated. It is solved by giving the recursive call in section "3. Determine the flex base size and hypothetical main size of each item" its own separate cache.

This fix makes one test fail, which is expected: each cache now measures once, so the test sees two measure calls instead of one:

failures:

---- measure::only_measure_once stdout ----
thread 'measure::only_measure_once' panicked at 'assertion failed: `(left == right)`
  left: `2`,
 right: `1`', tests/measure.rs:564:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
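
To make the cache behaviour concrete, here is a minimal toy model, not stretch's actual code. It assumes a node receives two kinds of layout request in alternation (the recursive call from step 3 and the final layout) and that a cache slot holds a single entry. The names `Slot`, `HYPOTHETICAL_SIZE`, `FINAL_LAYOUT` and both counting functions are made up for illustration.

```rust
/// A single-entry cache: storing a result for a new key evicts the old one.
#[derive(Default)]
struct Slot {
    entry: Option<(u32, u32)>, // (request key, cached result)
}

impl Slot {
    /// Returns the cached result for `key`, or computes and stores it.
    /// `misses` is incremented each time a computation was needed.
    fn get_or_compute(&mut self, key: u32, misses: &mut u32) -> u32 {
        if let Some((cached_key, value)) = self.entry {
            if cached_key == key {
                return value;
            }
        }
        *misses += 1;
        let value = key * 2; // stand-in for an expensive layout or measure call
        self.entry = Some((key, value));
        value
    }
}

// Hypothetical keys for the two kinds of request a node receives.
const HYPOTHETICAL_SIZE: u32 = 0; // the recursive call from step 3
const FINAL_LAYOUT: u32 = 1; // every other layout call

/// Both kinds of request share one slot, so each one evicts the other.
fn misses_with_shared_slot(passes: u32) -> u32 {
    let mut slot = Slot::default();
    let mut misses = 0;
    for _ in 0..passes {
        slot.get_or_compute(HYPOTHETICAL_SIZE, &mut misses);
        slot.get_or_compute(FINAL_LAYOUT, &mut misses);
    }
    misses
}

/// Each kind of request has its own slot, so both entries survive.
fn misses_with_separate_slots(passes: u32) -> u32 {
    let mut main_size_slot = Slot::default();
    let mut other_slot = Slot::default();
    let mut misses = 0;
    for _ in 0..passes {
        main_size_slot.get_or_compute(HYPOTHETICAL_SIZE, &mut misses);
        other_slot.get_or_compute(FINAL_LAYOUT, &mut misses);
    }
    misses
}

fn main() {
    println!("shared slot:    {} misses", misses_with_shared_slot(10));
    println!("separate slots: {} misses", misses_with_separate_slots(10));
}
```

In this model the shared slot misses on every request, while the separate slots miss exactly twice in total, once per cache. That matches the `left: 2` reported by `only_measure_once` above.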

Running the example code from #71 in debug mode gives:

elapsed: 3 nodes -> Ok(82.775µs)
elapsed: 4 nodes -> Ok(266.067µs)
elapsed: 5 nodes -> Ok(145.019µs)
elapsed: 6 nodes -> Ok(164.674µs)
elapsed: 7 nodes -> Ok(171.574µs)
elapsed: 8 nodes -> Ok(212.668µs)
elapsed: 9 nodes -> Ok(311.203µs)
elapsed: 10 nodes -> Ok(293.888µs)
elapsed: 11 nodes -> Ok(419.703µs)
elapsed: 12 nodes -> Ok(393.277µs)
elapsed: 13 nodes -> Ok(497.327µs)
elapsed: 14 nodes -> Ok(434.829µs)
elapsed: 15 nodes -> Ok(488.549µs)
elapsed: 16 nodes -> Ok(686.627µs)
elapsed: 17 nodes -> Ok(870.066µs)
elapsed: 18 nodes -> Ok(523.542µs)
elapsed: 19 nodes -> Ok(550.827µs)
elapsed: 20 nodes -> Ok(902.458µs)
elapsed: 21 nodes -> Ok(753.408µs)
elapsed: 22 nodes -> Ok(659.295µs)

and in release:

elapsed: 3 nodes -> Ok(6.975µs)
elapsed: 4 nodes -> Ok(3.673µs)
elapsed: 5 nodes -> Ok(24.425µs)
elapsed: 6 nodes -> Ok(5.325µs)
elapsed: 7 nodes -> Ok(6.222µs)
elapsed: 8 nodes -> Ok(7.28µs)
elapsed: 9 nodes -> Ok(21.734µs)
elapsed: 10 nodes -> Ok(9.535µs)
elapsed: 11 nodes -> Ok(12.769µs)
elapsed: 12 nodes -> Ok(11.924µs)
elapsed: 13 nodes -> Ok(12.973µs)
elapsed: 14 nodes -> Ok(13.97µs)
elapsed: 15 nodes -> Ok(15.103µs)
elapsed: 16 nodes -> Ok(15.874µs)
elapsed: 17 nodes -> Ok(20.037µs)
elapsed: 18 nodes -> Ok(18.679µs)
elapsed: 19 nodes -> Ok(26.416µs)
elapsed: 20 nodes -> Ok(35.372µs)
elapsed: 21 nodes -> Ok(21.476µs)
elapsed: 22 nodes -> Ok(22.73µs)

The exponential behaviour is clearly gone, and the measured time saturates since most caches from the previous iterations are reused.

Davier, Jan 07 '21 00:01

While this PR fixes one issue with cache invalidation, there is still another issue causing exponential recursion. I'll detail it here since I haven't been able to solve it yet. The culprit seems to be the recursive call in the "Cross Size Determination" step. When I experimented with forcing the use of cached results, the number of recursive calls dropped by 4 orders of magnitude (from 15'207'071 to 1'560, to be precise) in a test with about 20 nested containers. However, doing that broke the cross size in fit-content mode (you can see it in this branch: https://github.com/Davier/stretch/commits/wip). I'm sure there is a way to solve that issue, but it may need more than my basic understanding of the algorithm.
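
As a rough illustration of why an uncached recursive call inside every container blows up, here is a toy sketch. It assumes each container lays out its single child twice per pass; that factor is an assumption, and `layout` and `count_calls` are hypothetical names, not stretch functions. The model ignores layout constraints entirely, so it says nothing about why forcing cached results broke the cross size, and it will not reproduce the exact counts quoted above.

```rust
/// Lays out the container at `level`, where level 0 is the innermost leaf.
/// `done[level]` records whether that level already has a cached result.
fn layout(level: usize, done: &mut [bool], use_cache: bool, calls: &mut u64) {
    if use_cache && done[level] {
        return; // cached result reused, no recursion
    }
    *calls += 1;
    if level > 0 {
        // Assumption: two recursive calls into the child per pass.
        layout(level - 1, done, use_cache, calls);
        layout(level - 1, done, use_cache, calls);
    }
    done[level] = true;
}

/// Counts the layout computations needed for `depth` containers nested
/// around one leaf.
fn count_calls(depth: usize, use_cache: bool) -> u64 {
    let mut done = vec![false; depth + 1];
    let mut calls = 0;
    layout(depth, &mut done, use_cache, &mut calls);
    calls
}

fn main() {
    for depth in [5, 10, 20] {
        println!(
            "depth {:2}: {:8} calls uncached, {:3} cached",
            depth,
            count_calls(depth, false),
            count_calls(depth, true)
        );
    }
}
```

Without caching the call count in this model is 2^(depth+1) - 1, so it doubles with every extra nesting level. With caching it is depth + 1, one computation per level.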

Davier, Feb 03 '21 12:02