Override array methods to avoid proxy creation while iterating and updating
Stacked on top of #1164 (perf tweaks) and #1183 (finalization callbacks).
This PR:
- adds a set of interception overrides for both mutating and non-mutating array methods that operate directly on the underlying copy, in order to avoid creating proxies every time we read and write to an array index.
Implementation
We keep a known list of both mutating and non-mutating array methods in proxy.ts. We check field access in the get trap, and if the field is one of the known array methods, we record which method it is to indicate we're in an array operation.
We then implement the operation manually by calling the method on the underlying state.copy_, or in some cases by reimplementing part of the behavior. For example, filter iterates manually and calls the predicate with the existing raw values, avoiding access to the draft array and thus proxy creation for every accessed field. For consistency and to support further updates, it still does result.push(state.draft[i]) whenever the predicate returns true, so we skip most of the proxy creation overhead and only create proxies for values included in the result array. For methods like sort and reverse, we mutate the copy_ array and set the assigned_ indices manually, avoiding any access to the drafted array and thus creating no proxies at all.
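The filter override described above can be sketched roughly like this. This is an illustrative standalone version, not immer's actual internals: `State`, `getDraftItem`, and the field names are stand-ins for the real `state.copy_` / `state.draft[i]` machinery.

```typescript
// Hypothetical sketch of the filter override: iterate the raw underlying
// copy, call the predicate with raw values (no proxies created), and only
// route items that pass through the draft accessor.
type State<T> = {copy_: T[]}

function filterOverride<T>(
	state: State<T>,
	getDraftItem: (index: number) => T, // stand-in for state.draft[i]
	predicate: (value: T, index: number) => boolean
): T[] {
	const result: T[] = []
	const copy = state.copy_
	for (let i = 0; i < copy.length; i++) {
		// The predicate sees the raw value - no proxy is created here
		if (predicate(copy[i], i)) {
			// Only included items go through the (potentially drafting) accessor
			result.push(getDraftItem(i))
		}
	}
	return result
}

const state = {copy_: [1, 2, 3, 4]}
const evens = filterOverride(
	state,
	i => state.copy_[i],
	v => v % 2 === 0
)
console.log(evens) // → [2, 4]
```

The key point is that proxy creation cost now scales with the size of the *result*, not with the number of items the predicate merely inspects.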
This gives a significant speedup for almost all array benchmark scenarios.
Note that for filter / find etc., we're now passing likely-raw values into the callbacks. If the user were to mutate those, the mutations would hit the real objects, and we would neither prevent nor detect them. Since these are semantically read-only operations and you shouldn't be mutating in those callbacks in the first place, this seems like a reasonable tradeoff to me.
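To illustrate the tradeoff, here is a plain-array demonstration (ordinary objects stand in for what would previously have been drafts): the predicate receives raw values, so a mutation inside the callback lands on the real object with no draft bookkeeping to catch it.

```typescript
interface Todo {
	done: boolean
}
const base: Todo[] = [{done: false}, {done: true}]

// Semantically read-only usage: perfectly safe with raw values
const open = base.filter(item => !item.done)
console.log(open.length) // 1

// Anti-pattern: mutating inside the predicate. With raw values this
// mutates `base` directly - a real, untracked mutation.
base.filter(item => {
	item.done = true
	return true
})
console.log(base[0].done) // true - the original array was changed
```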
I've opted not to override map / flatMap / reduce / reduceRight. Those can return arbitrary values, and their implementations added noticeably more bundle size. I decided it wasn't worth the complexity to optimize those methods.
Performance
Running the perf benchmarks with this PR, I get:
┌─────────────────────┬──────────────┬──────────────┬─────────────┐
│ Scenario │ immer10 │ immer10Perf │ Improvement │
├─────────────────────┼──────────────┼──────────────┼─────────────┤
│ reverse-array │ 117.5µs │ 14.4µs │ +87.7% │
│ sortById-reverse │ 128.0µs │ 17.7µs │ +86.2% │
│ remove-high │ 119.6µs │ 16.9µs │ +85.8% │
│ remove │ 128.4µs │ 20.9µs │ +83.7% │
│ update-high │ 89.7µs │ 14.7µs │ +83.7% │
│ filter │ 49.1µs │ 11.7µs │ +76.1% │
│ remove-reuse │ 4.1ms │ 1.2ms │ +69.5% │
│ remove-high-reuse │ 3.9ms │ 1.2ms │ +69.4% │
│ update-reuse │ 3.5ms │ 1.3ms │ +63.6% │
│ mixed-sequence │ 3.5ms │ 1.3ms │ +61.8% │
│ update-high-reuse │ 3.3ms │ 1.4ms │ +59.3% │
│ concat │ 117.1µs │ 51.2µs │ +56.3% │
│ rtkq-sequence │ 12.5ms │ 7.0ms │ +44.1% │
│ updateLargeObject │ 235.7µs │ 132.1µs │ +44.0% │
│ update-multiple │ 47.1µs │ 27.9µs │ +40.8% │
│ updateLargeObject-r │ 9.6ms │ 6.0ms │ +37.6% │
│ add │ 24.4µs │ 17.0µs │ +30.4% │
│ update │ 22.7µs │ 18.6µs │ +18.1% │
│ mapNested │ 120.0µs │ 128.8µs │ -7.3% │
└─────────────────────┴──────────────┴──────────────┴─────────────┘
✓ immer10Perf shows an average 57.4% performance improvement over immer10
That's a very sizeable 30%+ increase over #1183, which is not surprising given that all the array behaviors suddenly have significantly less overhead.
Bundle Size
Per #1183, the series of PRs does increase bundle size noticeably:
Eyeballing bundle sizes, this PR increases the immer.production.js minified bundle size by another 1-1.5K on top of #1183, from ~14K to ~15K. If I build a Vite app and measure using Sonda, actual size in a built app appears to have grown from:
- v10.0.3 (which is out of date but what my branches were based from): 7.52K
- The original set of perf tweaks: 8.11K (which may include some of the other recent PRs that were merged - not sure my changes would have added that much)
- #1183 with callbacks: 9.38K
- This PR: 11.17K
As I said, I'm very sensitive to bundle size changes. I have been assuming we would probably want to make this PR into a new plugin, so that users can opt-in to whether they want the extra bundle size to gain better array perf.
Right now the plugin system only has methods for loading and getting a plugin. We'd probably want to add a util to check whether a plugin is currently loaded, roughly `const isPluginLoaded = (name: string) => !!plugins[name]`, so that we can do a quick check inside of `get`. Alternately, proxy.ts could expose a setter indicating this plugin is available, with the plugin logic updating that flag when the plugin is loaded.
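A minimal sketch of that proposed presence check, mirroring the shape of a registry keyed by plugin name (`plugins`, `loadPlugin`, and the plugin names here are assumptions for illustration, not immer's exact internals):

```typescript
// Hypothetical plugin registry with a cheap "is loaded?" check
const plugins: Record<string, unknown> = {}

function loadPlugin(name: string, implementation: unknown): void {
	plugins[name] = implementation
}

const isPluginLoaded = (name: string): boolean => !!plugins[name]

// The `get` trap could then cheaply gate the array overrides:
//   if (isPluginLoaded("ArrayMethods")) { /* use overrides */ }
loadPlugin("ArrayMethods", {})
console.log(isPluginLoaded("ArrayMethods")) // true
console.log(isPluginLoaded("SomeOtherPlugin")) // false
```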
Pull Request Test Coverage Report for Build 20213871328
Details
- 643 of 1245 (51.65%) changed or added relevant lines in 13 files are covered.
- 8 unchanged lines in 4 files lost coverage.
- Overall coverage decreased (-2.1%) to 42.875%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/plugins/arrayMethods.ts | 182 | 183 | 99.45% |
| src/core/finalize.ts | 149 | 151 | 98.68% |
| src/core/proxy.ts | 47 | 50 | 94.0% |
| src/utils/common.ts | 80 | 83 | 96.39% |
| src/plugins/patches.ts | 87 | 99 | 87.88% |
| perf-testing/immutability-benchmarks.mjs | 0 | 19 | 0.0% |
| perf-testing/immutability-profiling.mjs | 0 | 562 | 0.0% |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/plugins/patches.ts | 1 | 94.9% |
| src/core/finalize.ts | 2 | 96.81% |
| src/core/immerClass.ts | 2 | 98.69% |
| src/utils/errors.ts | 3 | 87.04% |
| Totals | |
|---|---|
| Change from base Build 18857343219: | -2.1% |
| Covered Lines: | 1678 |
| Relevant Lines: | 4771 |
💛 - Coveralls
I'd say the plugin system is still worth it for the savings in bundle size, so it's a question of whether this ought to be a plugin or not. I'd actually lean towards making it a plugin even in an 11.0 major, and we can always convert it to being built-in later.
Updates:
- Applied some of the PR review changes, like adding an `assignedAllIndices` flag and improving `isArrayIndex`
- Converted the array method overrides into a new plugin activated by `enableArrayMethods()`
  - Added a scope field for the `arrayMethodsPlugin`, checked at scope creation time - the `get` trap uses that so we don't have to keep re-checking for plugin existence
- Updated `tests/base.js` so that it runs all the tests without the array plugin, then runs one set of tests with the plugin (under the assumption that the array changes are orthogonal to the other options like freezing). So, we now have 9 total `runBaseTests` passes executing
- Tossed in an `immutability-profiling.mjs` script I'd added, which has all the same scenarios as `immutability-benchmarks.mjs`, but in a standalone file for better CPU profiling
- Updated both benchmarking scripts to enable the array plugin
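The scope-field optimization mentioned above can be sketched as follows. `enableArrayMethods` matches the plugin's activation function; everything else (the flag, `createScope`, `getTrap`) is illustrative rather than immer's actual code:

```typescript
// enableArrayMethods() flips a module-level flag once; createScope
// snapshots it, so the hot `get` trap reads a boolean instead of
// consulting the plugin registry on every property access.
let arrayMethodsLoaded = false

function enableArrayMethods(): void {
	arrayMethodsLoaded = true
}

interface Scope {
	arrayMethodsEnabled_: boolean
}

function createScope(): Scope {
	// Checked once, at scope creation time
	return {arrayMethodsEnabled_: arrayMethodsLoaded}
}

function getTrap(scope: Scope, prop: string): string {
	if (scope.arrayMethodsEnabled_ && prop === "filter") {
		return "array override"
	}
	return "default draft access"
}

enableArrayMethods()
const scope = createScope()
console.log(getTrap(scope, "filter")) // "array override"
```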
Performance
Current results on this branch, with enableArrayMethods() turned on:
✓ immer10Perf shows an average 40.4% performance improvement over immer10
┌─────────────────────┬──────────────┬──────────────┬─────────────┐
│ Scenario │ immer10 │ immer10Perf │ Improvement │
├─────────────────────┼──────────────┼──────────────┼─────────────┤
│ reverse-array │ 250.7µs │ 65.5µs │ +73.9% │
│ remove-high │ 197.5µs │ 54.3µs │ +72.5% │
│ sortById-reverse │ 189.3µs │ 52.8µs │ +72.1% │
│ update-high │ 182.6µs │ 57.6µs │ +68.5% │
│ remove │ 188.8µs │ 73.8µs │ +60.9% │
│ update-high-reuse │ 9.6ms │ 4.3ms │ +54.7% │
│ mixed-sequence │ 8.2ms │ 3.8ms │ +54.5% │
│ update-reuse │ 12.4ms │ 5.7ms │ +53.7% │
│ remove-reuse │ 9.0ms │ 4.4ms │ +50.9% │
│ remove-high-reuse │ 9.3ms │ 4.7ms │ +49.9% │
│ filter │ 107.6µs │ 57.1µs │ +46.9% │
│ update-largeObject1 │ 14.1ms │ 8.8ms │ +37.8% │
│ concat │ 199.9µs │ 124.4µs │ +37.8% │
│ rtkq-sequence │ 28.4ms │ 19.7ms │ +30.5% │
│ update-largeObject2 │ 29.5ms │ 20.8ms │ +29.4% │
│ update-largeObject1 │ 691.9µs │ 542.1µs │ +21.6% │
│ update-largeObject2 │ 2.2ms │ 1.8ms │ +18.9% │
│ update-multiple │ 83.9µs │ 71.9µs │ +14.3% │
│ update │ 70.7µs │ 64.3µs │ +9.0% │
│ add │ 80.5µs │ 76.3µs │ +5.1% │
│ mapNested │ 195.8µs │ 225.8µs │ -15.3% │
└─────────────────────┴──────────────┴──────────────┴─────────────┘
I'm slightly puzzled about how we went from "55%" on my earlier prototype branch to "40%" here. Part of it is likely that I added more benchmark scenarios like the largeObject2 individual and reuse variants, and a couple more scenarios in the +20% range pull down the average. It feels like there might also be a real difference in runtime speed, but I haven't pinpointed one yet.
If I do a pprof profile of that immutability-profiling.mjs multi-scenario script, the real bottlenecks turn out to be Object.isFrozen/freeze() and {...base}:
shallowCopy
Total: 1.77s 1.77s (flat, cum) 16.59%
173 . . }
174 . . return O.create(getPrototypeOf(base), descriptors);
175 . . } else {
176 . . const proto = getPrototypeOf(base);
177 . . if (proto !== null && isPlain) {
178 1.77s 1.77s return { ...base };
179 . . }
180 . . const obj = O.create(proto);
181 . . return O.assign(obj, base);
182 . . }
183 . . }
freeze
Total: 1.01s 13.54s (flat, cum) 126.67%
180 . . const obj = O.create(proto);
181 . . return O.assign(obj, base);
182 . . }
183 . . }
184 . . function freeze(obj, deep = false) {
185 3.63ms 1.32s if (isFrozen(obj) || isDraft(obj))
186 . . return obj;
187 1.21ms 37.51ms if (getArchtype(obj) > 1) {
188 . . O.defineProperties(obj, {
189 . . set: dontMutateMethodOverride,
190 . . add: dontMutateMethodOverride,
191 . . clear: dontMutateMethodOverride,
192 . . delete: dontMutateMethodOverride
193 . . });
194 . . }
195 719.95ms 719.95ms O.freeze(obj);
196 . . if (deep)
197 263.78ms 11.42s each(
198 20.57ms 38.72ms obj,
199 . . (_key, value) => {
200 3.63ms 3.63ms freeze(value, true);
201 . . },
202 . . false
203 . . );
204 . . return obj;
205 . . }
Given that, I'm not sure there's much more to squeeze out of this without changing fundamental assumptions like freezing by default.
Freezing was enabled by default as a perf gain: in the tree diff approach it would help skip diffing entire subtrees in subsequent produce runs. It's also great that it helps with correctness, but if disabling it helps perf we can swap the default (or only enable freezing in DEV scenarios, to still catch buggy code early).
Yeah, that's a good question, actually - now that we're not always recursing by default, is there a chance skipping freezing would be valid?
There are still several places in the new finalization logic that check isFrozen to bail out, so that assumption may be embedded deeply enough that removing it would take major changes. Might be worth investigating, though.
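The kind of isFrozen bail-out in question looks roughly like this (a simplified sketch, not immer's actual finalization code): frozen subtrees are assumed to be already finalized and are skipped entirely, which is why freezing is still load-bearing for perf.

```typescript
// Simplified deep-finalize: bail out on already-frozen subtrees,
// recurse into the rest, then freeze on the way out.
function finalizeTree(value: unknown): void {
	if (typeof value !== "object" || value === null) return
	// Embedded assumption: frozen implies already finalized
	if (Object.isFrozen(value)) return
	for (const key of Object.keys(value as object)) {
		finalizeTree((value as Record<string, unknown>)[key])
	}
	Object.freeze(value)
}

const tree = {a: {b: 1}}
finalizeTree(tree)
console.log(Object.isFrozen(tree.a)) // true
```

If freezing were no longer the default, these checks would need some other marker for "this subtree is done", which is the "major changes" part.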
Ok I think the current release has soaked long enough and I'm happy to proceed with this one.
Note that some documentation updates are required as well, let me know if you want me to look into those @markerikson
Awesome! Yeah, I'll tackle it this weekend.
Okay, updated the PR:
- Rebased vs latest `main` and removed the stacking from the earlier PRs
- Spent some time thinking through the relative behavior of `concat` vs `filter`. Concluded that it makes sense for `filter` to return drafts for the included array items, because there's still a natural connection to the parent array (and thus the parent/child mutation tracking works). For `concat`, though, you end up with a potential mix of items. So, currently `concat` doesn't try to do any forced drafting at all - you just get whatever was in the underlying arrays.
- I saw that we didn't have tests for the remaining array methods, so I generated a bunch of those. The generated tests have a conditional pair of tests for `concat` and `flat` that check "includes drafts" vs "includes base values" based on whether we're running `base.js` with the array plugin enabled
- Added AI-written JSDocs to the array methods implementation, and an AI-written docs page for the plugin. I am highly skeptical of AI-generated writing in general... but I reviewed all of this and honestly it seems reasonable? I corrected a few mis-statements, but overall I actually stand behind the docs content as written here. Seems pretty comprehensive, readable, and captures the tradeoffs in behavior. Opus 4.5 is pretty good here.
- Updated the `test:size` command to include `enableArrayMethods` and updated the import size results table