
Override array methods to avoid proxy creation while iterating and updating

Open · markerikson opened this pull request 2 months ago • 5 comments

Stacked on top of #1164 (perf tweaks) and #1183 (finalization callbacks).

This PR:

  • adds a set of interception overrides for both mutating and non-mutating array methods that operate directly on the underlying copy, in order to avoid creating a proxy every time we read from or write to an array index.

Implementation

We have a known list of both mutating and non-mutating array methods in proxy.ts. We check field access in the get trap, and if the field is one of the known array methods, we record which method is being called so we know we're in an array operation.
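As a rough illustration of that interception idea (a simplified standalone sketch, not immer's actual proxy.ts; names like copy_ and assigned_ just mirror the description above):

```typescript
// NOTE: illustrative sketch only. A get trap recognizes known array method
// names and routes them to a fast path on the underlying copy, instead of
// letting each element read/write go through an index proxy.
const MUTATING_METHODS = new Set(["push", "pop", "shift", "unshift", "splice", "sort", "reverse", "fill"])
const READONLY_METHODS = new Set(["filter", "find", "findIndex", "some", "every", "indexOf", "includes", "slice", "join"])

interface SketchState {
  copy_: any[]
  assigned_: Record<number, boolean>
}

function createArrayDraft(base: any[]): {draft: any[]; state: SketchState} {
  const state: SketchState = {copy_: base.slice(), assigned_: {}}
  const draft = new Proxy(base, {
    get(_target, prop) {
      if (typeof prop === "string" && MUTATING_METHODS.has(prop)) {
        // Intercept: run the method directly on the copy and mark the
        // indices as assigned, so no per-index proxies are created.
        return (...args: any[]) => {
          const result = (state.copy_ as any)[prop](...args)
          for (let i = 0; i < state.copy_.length; i++) state.assigned_[i] = true
          return result
        }
      }
      if (typeof prop === "string" && READONLY_METHODS.has(prop)) {
        // Read-only methods can run against the raw copy: no proxies needed.
        return (state.copy_ as any)[prop].bind(state.copy_)
      }
      return Reflect.get(state.copy_, prop)
    }
  })
  return {draft, state}
}
```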

We then manually implement the operation by calling the method on the underlying state.copy_, or in some cases reimplementing some of the behavior. For example, filter does a manual iteration and calls the predicate with the existing values to avoid accessing the draft array and creating proxies for every accessed field. But, for consistency and further updates, it does then do result.push(state.draft[i]) every time the predicate returns true. That means we avoid some proxy creation overhead and only create proxies for values included in the result array. For methods like sort and reverse, we mutate the copy_ array, and manually set the assigned_ indices, avoiding any access to the drafted array and thus no proxy creation for fields.
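The filter strategy described above can be sketched like this (simplified; draftAt stands in for immer's internal draft creation and is a hypothetical parameter):

```typescript
// Sketch of the filter approach: the predicate sees raw values from the
// underlying copy (no proxy per read), and only matching indices get drafted.
function draftFilter<T>(
  copy: T[],
  draftAt: (index: number) => T, // hypothetical: returns a draft for copy[index]
  predicate: (value: T, index: number) => boolean
): T[] {
  const result: T[] = []
  for (let i = 0; i < copy.length; i++) {
    // Predicate runs against the raw value, avoiding proxy creation.
    if (predicate(copy[i], i)) {
      // Only included items are drafted, preserving mutation tracking.
      result.push(draftAt(i))
    }
  }
  return result
}
```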

This gives a significant speedup for almost all array benchmark scenarios.

Note that for filter / find etc, we're now passing likely-raw values into the callbacks. If the user were to mutate those, they'd be really mutated and we wouldn't prevent or pick up on that. Since those are semantically read-only operations and you shouldn't be mutating in those callbacks in the first place, this seems like a reasonable tradeoff to me.

I've opted not to override map / flatMap / reduce / reduceRight. Those can return arbitrary values, and their implementations added noticeably more bundle size, so I decided it wasn't worth the complexity to optimize them.

Performance

Running the perf benchmarks with this PR, I get:

┌─────────────────────┬──────────────┬──────────────┬─────────────┐
│ Scenario            │ immer10      │ immer10Perf  │ Improvement │
├─────────────────────┼──────────────┼──────────────┼─────────────┤
│ reverse-array       │      117.5µs │       14.4µs │      +87.7% │
│ sortById-reverse    │      128.0µs │       17.7µs │      +86.2% │
│ remove-high         │      119.6µs │       16.9µs │      +85.8% │
│ remove              │      128.4µs │       20.9µs │      +83.7% │
│ update-high         │       89.7µs │       14.7µs │      +83.7% │
│ filter              │       49.1µs │       11.7µs │      +76.1% │
│ remove-reuse        │        4.1ms │        1.2ms │      +69.5% │
│ remove-high-reuse   │        3.9ms │        1.2ms │      +69.4% │
│ update-reuse        │        3.5ms │        1.3ms │      +63.6% │
│ mixed-sequence      │        3.5ms │        1.3ms │      +61.8% │
│ update-high-reuse   │        3.3ms │        1.4ms │      +59.3% │
│ concat              │      117.1µs │       51.2µs │      +56.3% │
│ rtkq-sequence       │       12.5ms │        7.0ms │      +44.1% │
│ updateLargeObject   │      235.7µs │      132.1µs │      +44.0% │
│ update-multiple     │       47.1µs │       27.9µs │      +40.8% │
│ updateLargeObject-r │        9.6ms │        6.0ms │      +37.6% │
│ add                 │       24.4µs │       17.0µs │      +30.4% │
│ update              │       22.7µs │       18.6µs │      +18.1% │
│ mapNested           │      120.0µs │      128.8µs │       -7.3% │
└─────────────────────┴──────────────┴──────────────┴─────────────┘

✓ immer10Perf shows an average 57.4% performance improvement over immer10

That's a very sizeable 30%+ increase over #1183 , which is not surprising given that all the array behaviors suddenly have significantly less overhead.

Bundle Size

Per #1183, the series of PRs does increase bundle size noticeably:

Eyeballing bundle sizes, this PR increases the immer.production.js minified bundle size by another 1-1.5K on top of #1183, from ~14K to ~15K. If I build a Vite app and measure using Sonda, actual size in a built app appears to have grown from:

  • v10.0.3 (which is out of date, but what my branches were based on): 7.52K
  • The original set of perf tweaks: 8.11K (which may include some of the other recent PRs that were merged - not sure my changes would have added that much)
  • #1183 with callbacks: 9.38K
  • This PR: 11.17K

As I said, I'm very sensitive to bundle size changes. I have been assuming we would probably want to turn this PR into a new plugin, so that users can opt in to the extra bundle size in exchange for better array perf.

Right now the plugin system only has methods for loading and getting a plugin. We'd probably want to add a util to check whether a plugin is currently loaded, roughly like const isPluginLoaded = (name: string) => !!plugins[name]. That way we can do a quick check inside of get. Alternatively, we could have proxy.ts expose a setter to indicate this plugin is available, and have the plugin logic update that flag when the plugin is loaded.
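A minimal sketch of that helper, assuming a registry shaped like immer's plugins map (the registry and loadPlugin here are illustrative, not the actual immer source):

```typescript
// Hypothetical plugin registry sketch: a loaded-check lets hot paths (like
// the get trap) cheaply decide whether the array fast path is available.
const plugins: Record<string, unknown> = {}

function loadPlugin(name: string, implementation: unknown): void {
  plugins[name] = implementation
}

// The suggested util: truthy lookup against the registry.
const isPluginLoaded = (name: string): boolean => !!plugins[name]

// Inside the get trap, a cheap guard could then be:
// if (isPluginLoaded("ArrayMethods")) { /* take the array fast path */ }
```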

markerikson avatar Oct 15 '25 15:10 markerikson

Pull Request Test Coverage Report for Build 20213871328

Details

  • 643 of 1245 (51.65%) changed or added relevant lines in 13 files are covered.
  • 8 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-2.1%) to 42.875%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/plugins/arrayMethods.ts | 182 | 183 | 99.45% |
| src/core/finalize.ts | 149 | 151 | 98.68% |
| src/core/proxy.ts | 47 | 50 | 94.0% |
| src/utils/common.ts | 80 | 83 | 96.39% |
| src/plugins/patches.ts | 87 | 99 | 87.88% |
| perf-testing/immutability-benchmarks.mjs | 0 | 19 | 0.0% |
| perf-testing/immutability-profiling.mjs | 0 | 562 | 0.0% |
| Total: | 643 | 1245 | |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/plugins/patches.ts | 1 | 94.9% |
| src/core/finalize.ts | 2 | 96.81% |
| src/core/immerClass.ts | 2 | 98.69% |
| src/utils/errors.ts | 3 | 87.04% |
| Total: | 8 | |
Totals Coverage Status
Change from base Build 18857343219: -2.1%
Covered Lines: 1678
Relevant Lines: 4771

💛 - Coveralls

coveralls avatar Oct 15 '25 15:10 coveralls

I'd say the plugin system is still worth it for the savings in bundle size, so it's a question of whether this ought to be a plugin or not. I'd actually lean towards making it a plugin even in an 11.0 major, and we can always convert it to being built-in later.

markerikson avatar Oct 25 '25 16:10 markerikson

Updates:

  • Applied some of the PR review changes, like adding an assignedAllIndices flag and improving isArrayIndex
  • Converted the array method overrides into a new plugin activated by enableArrayMethods()
    • added a scope field for the arrayMethodsPlugin, checked at scope creation time
    • the get trap uses that so we don't have to keep re-checking for plugin existence
  • Updated tests/base.js so that it runs all the tests without the array plugin, then runs one set of tests with the plugin enabled (on the assumption that the array changes are orthogonal to the other options like freezing). We now execute 9 total runBaseTests passes
  • Tossed in an immutability-profiling.mjs script I'd added, which has all the same scenarios as the immutability-benchmarks.mjs, but in a standalone file for better CPU profiling
  • Updated both benchmarking scripts to enable the array plugin

Performance

Current results on this branch, with enableArrayMethods() turned on:

✓ immer10Perf shows an average 40.4% performance improvement over immer10

┌─────────────────────┬──────────────┬──────────────┬─────────────┐
│ Scenario            │ immer10      │ immer10Perf  │ Improvement │
├─────────────────────┼──────────────┼──────────────┼─────────────┤
│ reverse-array       │      250.7µs │       65.5µs │      +73.9% │
│ remove-high         │      197.5µs │       54.3µs │      +72.5% │
│ sortById-reverse    │      189.3µs │       52.8µs │      +72.1% │
│ update-high         │      182.6µs │       57.6µs │      +68.5% │
│ remove              │      188.8µs │       73.8µs │      +60.9% │
│ update-high-reuse   │        9.6ms │        4.3ms │      +54.7% │
│ mixed-sequence      │        8.2ms │        3.8ms │      +54.5% │
│ update-reuse        │       12.4ms │        5.7ms │      +53.7% │
│ remove-reuse        │        9.0ms │        4.4ms │      +50.9% │
│ remove-high-reuse   │        9.3ms │        4.7ms │      +49.9% │
│ filter              │      107.6µs │       57.1µs │      +46.9% │
│ update-largeObject1 │       14.1ms │        8.8ms │      +37.8% │
│ concat              │      199.9µs │      124.4µs │      +37.8% │
│ rtkq-sequence       │       28.4ms │       19.7ms │      +30.5% │
│ update-largeObject2 │       29.5ms │       20.8ms │      +29.4% │
│ update-largeObject1 │      691.9µs │      542.1µs │      +21.6% │
│ update-largeObject2 │        2.2ms │        1.8ms │      +18.9% │
│ update-multiple     │       83.9µs │       71.9µs │      +14.3% │
│ update              │       70.7µs │       64.3µs │       +9.0% │
│ add                 │       80.5µs │       76.3µs │       +5.1% │
│ mapNested           │      195.8µs │      225.8µs │      -15.3% │
└─────────────────────┴──────────────┴──────────────┴─────────────┘

I'm slightly confused about how we went from ~55% on my earlier prototype branch to ~40% here. I assume part of it is that I added more benchmark scenarios, like the largeObject2 individual and reuse cases, and having a couple more scenarios in the +20% range lowers the average. It feels like there might be a bit more to it in terms of actual runtime speed, but I haven't pinpointed any differences yet.

If I do a pprof profile of that immutability-profiling.mjs multi-scenario script, the overall results look like this:

[pprof profile graph of the multi-scenario run]

It seems that the real bottlenecks are actually Object.isFrozen/freeze() and {...base}:

shallowCopy

  Total:       1.77s      1.77s (flat, cum) 16.59%
    173            .          .               } 
    174            .          .               return O.create(getPrototypeOf(base), descriptors); 
    175            .          .             } else { 
    176            .          .               const proto = getPrototypeOf(base); 
    177            .          .               if (proto !== null && isPlain) { 
    178        1.77s      1.77s                 return { ...base }; 
    179            .          .               } 
    180            .          .               const obj = O.create(proto); 
    181            .          .               return O.assign(obj, base); 
    182            .          .             } 
    183            .          .           } 
freeze
  Total:       1.01s     13.54s (flat, cum) 126.67%
    180            .          .               const obj = O.create(proto); 
    181            .          .               return O.assign(obj, base); 
    182            .          .             } 
    183            .          .           } 
    184            .          .           function freeze(obj, deep = false) { 
    185       3.63ms      1.32s             if (isFrozen(obj) || isDraft(obj)) 
    186            .          .               return obj; 
    187       1.21ms    37.51ms             if (getArchtype(obj) > 1) { 
    188            .          .               O.defineProperties(obj, { 
    189            .          .                 set: dontMutateMethodOverride, 
    190            .          .                 add: dontMutateMethodOverride, 
    191            .          .                 clear: dontMutateMethodOverride, 
    192            .          .                 delete: dontMutateMethodOverride 
    193            .          .               }); 
    194            .          .             } 
    195     719.95ms   719.95ms             O.freeze(obj); 
    196            .          .             if (deep) 
    197     263.78ms     11.42s               each( 
    198      20.57ms    38.72ms                 obj, 
    199            .          .                 (_key, value) => { 
    200       3.63ms     3.63ms                   freeze(value, true); 
    201            .          .                 }, 
    202            .          .                 false 
    203            .          .               ); 
    204            .          .             return obj; 
    205            .          .           } 

Given that, I'm not sure there's much more to squeeze out of this without changing fundamental assumptions like freezing by default.

markerikson avatar Oct 30 '25 21:10 markerikson

> Given that, I'm not sure there's much more to squeeze out of this without changing fundamental assumptions like freezing by default.

Freezing was enabled by default as a perf gain: in the tree diff approach it would help skip diffing entire subtrees in subsequent produce runs. It's also great that it helps with correctness, but if skipping it helps perf we can swap the default (or only enable freezing in DEV scenarios to still catch buggy code early)
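The DEV-only freezing idea can be sketched generically (a standalone sketch, not immer's freeze implementation; the enabled flag stands in for whatever dev/prod switch would be used):

```typescript
// Sketch: deep-freeze results only when enabled (e.g. in development) to
// catch accidental mutation early, and skip the Object.freeze cost otherwise.
function deepFreeze<T>(value: T, enabled: boolean): T {
  // Bail out for primitives, null, already-frozen objects, or when disabled.
  if (!enabled || value === null || typeof value !== "object" || Object.isFrozen(value)) return value
  Object.freeze(value)
  // Recurse into own enumerable values to freeze the whole subtree.
  for (const child of Object.values(value as object)) deepFreeze(child, enabled)
  return value
}
```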

mweststrate avatar Nov 23 '25 10:11 mweststrate

Yeah, that's a good question, actually - now that we're not always recursing by default, is there a chance skipping freezing would be valid?

There are still several places in the new finalization logic that check isFrozen to bail out, so freezing may be enough of an embedded assumption that removing it would take major changes. Might be worth investigating, though.

markerikson avatar Nov 23 '25 17:11 markerikson

Ok I think the current release has soaked long enough and I'm happy to proceed with this one.

Note that some documentation updates are required as well, let me know if you want me to look into those @markerikson

mweststrate avatar Dec 11 '25 18:12 mweststrate

Awesome! Yeah, I'll tackle it this weekend.

markerikson avatar Dec 12 '25 02:12 markerikson

Okay, updated the PR:

  • Rebased vs latest main and removed the stacking from the earlier PRs
  • Spent some time thinking through the relative behavior of concat vs filter. Concluded that it makes sense for filter to return drafts for the included array items, because there's still a natural connection to the parent array (and thus the parent/child mutation tracking works). For concat, though, you end up with a potential mix of items. So, currently concat doesn't try to do any forced drafting at all, it's just whatever was in the underlying array.
  • I saw that we didn't have tests for the remaining array methods, so I generated a bunch of those. The generated tests have a conditional pair of tests for concat and flat that check "includes drafts" vs "includes base values" based on whether we're running base.js with the array plugin enabled
  • Added AI-written JSDocs to the array methods implementation, and an AI-written docs page for the plugin. I am highly skeptical of AI-generated writing in general... but I reviewed all of this and honestly it seems reasonable? I corrected a few mis-statements, but overall I actually stand behind the docs content as written here. Seems pretty comprehensive, readable, and captures the tradeoffs in behavior. Opus 4.5 is pretty good here.
  • Updated the test:size command to include enableArrayMethods and updated the import size results table
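The concat behavior settled on above can be sketched as follows (an illustrative helper, not immer's source): concat just delegates to the native method on the underlying copy, with no forced drafting, so the result may mix raw base values with whatever the caller passed in.

```typescript
// Sketch: concat runs against the underlying copy and returns whatever
// values are there, raw or drafted, without trying to draft the result.
function draftConcat<T>(copy: readonly T[], ...args: (T | ConcatArray<T>)[]): T[] {
  // Plain Array.prototype.concat on the copy; elements are returned as-is.
  return copy.concat(...args)
}
```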

markerikson avatar Dec 14 '25 21:12 markerikson