hookable
perf: improve performance, reduce allocations, and avoid promises
Hi there! I love micro-optimizations. While profiling my application on low-end devices, I noticed that hookable took up a significant share of the flame graph. After reviewing the code, I identified some possible improvements, which I’ve included in this PR. I’ve added benchmarks to support my claims. Here are some of the improvements:
- Instead of modifying the arguments through `unshift`, we can directly pass `name` to `callHookWith`, as all the arguments there are collected using the spread operator. This removes unnecessary allocations and improves performance.
- We should avoid `Promise` as much as possible because it is a performance killer. I replaced the `Promise` chain with a middleware pattern in `serialTaskCaller`.
- Early returns in `serialTaskCaller` and `parallelTaskCaller` allow us to avoid unnecessary operations if there are no hooks. However, this breaks the test for `createDebugger` as it returns the hook name in the arguments. I need consultation on this matter, as skipping the step significantly boosts performance. How should we proceed?
- Using `delete` negatively impacts performance as it de-optimizes the object. A better option is to set the property to `undefined`. The only downside is that if we print the `_hooks` object, we’ll see something like `{ "hello": undefined }`.
- Removing `Object.assign` in `deprecateHooks`. Since we call the `deprecateHook` method anyway, nothing will be missed.
- In `addHooks`, we can set an empty array instead of calling `splice` in the release function each time. Plus, this approach uses fewer characters.
- In `removeAllHooks`, we can simply set an empty object and let the GC do the work. The only justification for the current behavior might be to keep a reference to the old object, but I don’t see any objective reasons for this.
- In `removeHook`, we can create an alias that slightly reduces the final bundle size and improves performance, as we don't need to compute the property.
There are also potential performance improvements, such as:
- In `callHookWith`, we could skip the `caller` call if there are no hooks to call, but this is a major change, and I’m unsure about it. This could eliminate the need for the third optimization.
- A good option would be to avoid `args.shift()` in utility functions.
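A rough sketch of what these two ideas could look like together is shown below. It is hypothetical: `callHookWithSketch`, the `Caller` type, and the plain `registry` object are names invented for the example, and passing `name` to the caller as a separate parameter is an assumption rather than hookable's actual signature.

```typescript
// Simplified hook type, assumed for this sketch only.
type HookCallback = (...args: any[]) => unknown;

// Assumed caller shape: the hook name is its own parameter, so the caller
// never needs args.shift() to recover it.
type Caller<T> = (hooks: HookCallback[], name: string, args: unknown[]) => T;

// Hypothetical sketch: skip the caller entirely when no hooks are registered.
function callHookWithSketch<T>(
  registry: Record<string, HookCallback[] | undefined>,
  caller: Caller<T>,
  name: string,
  ...args: unknown[]
): T | undefined {
  const hooks = registry[name];
  // Covers both a missing key and a key that was set to undefined
  // instead of being removed with delete.
  if (hooks === undefined || hooks.length === 0) {
    return undefined;
  }
  // The rest parameter already is a fresh array, so it is passed through
  // without unshift or shift.
  return caller(hooks, name, args);
}
```

The trade-off mentioned above still applies: a caller that relies on being invoked even for empty hook lists (for example, to log every hook name) would no longer see those calls.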
I hope the entire Vue ecosystem will benefit greatly from these changes. If I'm wrong about anything, please correct me 😅
Performance was tested on the following hardware:
CPU: AMD 7950x3D
System: Arch Linux on Windows 11, WSL 2
Node.js: v22.4.0
Before
✓ test/hookable.bench.ts (30) 40627ms
✓ empty serialTaskCaller (3) 39618ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty serialTaskCaller 10,182,469.12 0.0001 0.5928 0.0001 0.0001 0.0002 0.0004 0.0005 ±1.10% 5091235 fastest
· empty serialTaskCaller with argument 9,965,174.09 0.0001 0.5196 0.0001 0.0001 0.0002 0.0003 0.0005 ±0.27% 4982588
· empty serialTaskCaller with arguments 8,588,290.33 0.0001 19.4391 0.0001 0.0001 0.0004 0.0004 0.0012 ±7.62% 4294146 slowest
✓ serialTaskCaller (3) 33950ms
name hz min max mean p75 p99 p995 p999 rme samples
· serialTaskCaller 2,354,705.43 0.0003 0.3582 0.0004 0.0004 0.0008 0.0009 0.0013 ±0.82% 1177353
· serialTaskCaller with argument 2,467,112.19 0.0003 0.7581 0.0004 0.0004 0.0008 0.0009 0.0012 ±0.62% 1233557 fastest
· serialTaskCaller with arguments 1,799,070.06 0.0003 1.1227 0.0006 0.0004 0.0009 0.0010 0.0013 ±3.48% 899539 slowest
✓ empty parallelTaskCaller (3) 40624ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty parallelTaskCaller 4,975,588.58 0.0001 0.5751 0.0002 0.0002 0.0005 0.0006 0.0008 ±1.18% 2487795 fastest
· empty parallelTaskCaller with argument 4,067,468.90 0.0001 7.5485 0.0002 0.0002 0.0005 0.0006 0.0011 ±6.10% 2033879
· empty parallelTaskCaller with arguments 3,260,399.92 0.0001 10.4565 0.0003 0.0002 0.0006 0.0009 0.0022 ±10.64% 1630200 slowest
✓ parallelTaskCaller (3) 37518ms
name hz min max mean p75 p99 p995 p999 rme samples
· parallelTaskCaller 1,208,877.28 0.0006 0.4159 0.0008 0.0009 0.0014 0.0015 0.0025 ±0.80% 604440 fastest
· parallelTaskCaller with argument 851,025.12 0.0006 31.1798 0.0012 0.0010 0.0018 0.0029 0.0055 ±13.90% 425513
· parallelTaskCaller with arguments 848,773.69 0.0007 3.5288 0.0012 0.0009 0.0015 0.0017 0.0035 ±5.44% 424387 slowest
✓ empty callHook (3) 35453ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty callHook 3,324,678.32 0.0002 0.3023 0.0003 0.0003 0.0006 0.0007 0.0010 ±0.89% 1662340 fastest
· empty callHook with argument 2,558,739.92 0.0002 2.0208 0.0004 0.0003 0.0006 0.0007 0.0011 ±4.25% 1280209
· empty callHook with five arguments 2,414,266.77 0.0002 2.3941 0.0004 0.0003 0.0006 0.0007 0.0011 ±4.88% 1207134 slowest
✓ empty callHookParallel (3) 36265ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty callHookParallel 2,155,449.37 0.0003 0.4095 0.0005 0.0004 0.0008 0.0009 0.0014 ±0.95% 1077725 fastest
· empty callHookParallel with argument 1,756,949.99 0.0004 1.2595 0.0006 0.0004 0.0009 0.0009 0.0016 ±3.44% 878475
· empty callHookParallel with five arguments 1,719,582.79 0.0004 2.8272 0.0006 0.0004 0.0008 0.0009 0.0015 ±4.62% 859793 slowest
✓ callHook (3) 34625ms
name hz min max mean p75 p99 p995 p999 rme samples
· callHook 1,346,819.92 0.0005 0.3296 0.0007 0.0007 0.0012 0.0014 0.0024 ±0.81% 673410 fastest
· callHook with argument 1,076,396.49 0.0005 1.2023 0.0009 0.0008 0.0013 0.0014 0.0029 ±3.40% 538199
· callHook with five arguments 1,016,019.73 0.0006 1.5434 0.0010 0.0008 0.0013 0.0014 0.0024 ±4.28% 508010 slowest
✓ callHookParallel (3) 36871ms
name hz min max mean p75 p99 p995 p999 rme samples
· callHookParallel 831,923.92 0.0009 0.4366 0.0012 0.0012 0.0018 0.0020 0.0046 ±1.13% 415963 fastest
· callHookParallel with argument 681,418.62 0.0009 1.9961 0.0015 0.0013 0.0020 0.0023 0.0048 ±3.88% 340710
· callHookParallel with five arguments 605,805.38 0.0009 2.5282 0.0017 0.0013 0.0019 0.0021 0.0055 ±5.52% 302903 slowest
✓ hook (2) 29333ms
name hz min max mean p75 p99 p995 p999 rme samples
· hook 1,809,695.67 0.0003 5.3687 0.0006 0.0004 0.0007 0.0014 0.0025 ±8.52% 904848
· hook with deprecate 1,910,707.06 0.0003 5.7345 0.0005 0.0004 0.0007 0.0010 0.0021 ±8.02% 955354 fastest
✓ addHooks (1) 10517ms
name hz min max mean p75 p99 p995 p999 rme samples
· addHooks 1,802,287.39 0.0004 8.8805 0.0006 0.0005 0.0008 0.0009 0.0015 ±5.31% 901144 fastest
✓ empty removeHook (1) 19763ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty removeHook 27,060,455.89 0.0000 0.9647 0.0000 0.0000 0.0000 0.0001 0.0002 ±0.41% 13530228 fastest
✓ removeHook (2) 33179ms
name hz min max mean p75 p99 p995 p999 rme samples
· removeHook 24,541,013.31 0.0000 0.0635 0.0000 0.0000 0.0001 0.0001 0.0003 ±0.13% 12270507 fastest
· removeHook with extra 24,464,448.73 0.0000 1.0412 0.0000 0.0000 0.0001 0.0001 0.0002 ±0.43% 12232225
After
✓ test/hookable.bench.ts (30) 67998ms
✓ empty serialTaskCaller (3) 67996ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty serialTaskCaller 21,718,024.22 0.0000 0.0998 0.0000 0.0000 0.0001 0.0001 0.0003 ±0.14% 10859013
· empty serialTaskCaller with argument 21,886,593.12 0.0000 0.0895 0.0000 0.0001 0.0001 0.0001 0.0002 ±0.11% 10943297 fastest
· empty serialTaskCaller with arguments 21,578,860.84 0.0000 0.0727 0.0000 0.0001 0.0001 0.0001 0.0002 ±0.11% 10789431 slowest
✓ serialTaskCaller (3) 55928ms
name hz min max mean p75 p99 p995 p999 rme samples
· serialTaskCaller 5,263,373.05 0.0001 0.6182 0.0002 0.0002 0.0004 0.0005 0.0008 ±1.82% 2631687 fastest
· serialTaskCaller with argument 4,899,317.81 0.0001 0.9551 0.0002 0.0002 0.0004 0.0005 0.0007 ±2.32% 2449659
· serialTaskCaller with arguments 4,829,209.59 0.0002 1.4443 0.0002 0.0002 0.0004 0.0005 0.0007 ±2.07% 2414605 slowest
✓ empty parallelTaskCaller (3) 64743ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty parallelTaskCaller 21,074,124.31 0.0000 0.0936 0.0000 0.0001 0.0001 0.0001 0.0002 ±0.15% 10537063
· empty parallelTaskCaller with argument 21,614,792.23 0.0000 0.0801 0.0000 0.0001 0.0001 0.0001 0.0003 ±0.12% 10807397 fastest
· empty parallelTaskCaller with arguments 20,499,101.75 0.0000 1.5945 0.0000 0.0001 0.0001 0.0001 0.0002 ±0.64% 10249552 slowest
✓ parallelTaskCaller (3) 53768ms
name hz min max mean p75 p99 p995 p999 rme samples
· parallelTaskCaller 1,248,716.38 0.0006 0.2534 0.0008 0.0009 0.0012 0.0013 0.0020 ±0.50% 624359 fastest
· parallelTaskCaller with argument 1,142,040.47 0.0006 0.7628 0.0009 0.0008 0.0012 0.0013 0.0026 ±2.21% 571021
· parallelTaskCaller with arguments 1,045,581.42 0.0006 2.1857 0.0010 0.0008 0.0012 0.0014 0.0028 ±3.35% 522791 slowest
✓ empty callHook (3) 58239ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty callHook 13,264,257.60 0.0000 1.4190 0.0001 0.0001 0.0001 0.0001 0.0004 ±3.64% 6632129
· empty callHook with argument 18,871,312.11 0.0000 0.0529 0.0001 0.0001 0.0001 0.0001 0.0003 ±0.11% 9435657 fastest
· empty callHook with five arguments 12,472,483.13 0.0000 11.9884 0.0001 0.0001 0.0001 0.0002 0.0005 ±10.83% 6236242 slowest
✓ empty callHookParallel (3) 61137ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty callHookParallel 17,993,167.78 0.0000 0.5364 0.0001 0.0001 0.0001 0.0001 0.0004 ±0.32% 8996584 fastest
· empty callHookParallel with argument 14,582,408.82 0.0000 2.2001 0.0001 0.0001 0.0001 0.0001 0.0004 ±3.57% 7298139 slowest
· empty callHookParallel with five arguments 17,755,090.97 0.0000 0.8180 0.0001 0.0001 0.0001 0.0001 0.0004 ±0.35% 8877546
✓ callHook (3) 54742ms
name hz min max mean p75 p99 p995 p999 rme samples
· callHook 4,098,577.78 0.0002 2.8716 0.0002 0.0002 0.0005 0.0006 0.0009 ±2.59% 2049289 fastest
· callHook with argument 3,925,879.71 0.0002 1.7039 0.0003 0.0002 0.0005 0.0005 0.0007 ±2.96% 1962940
· callHook with five arguments 2,615,867.56 0.0002 11.8876 0.0004 0.0002 0.0006 0.0009 0.0026 ±13.11% 1307934 slowest
✓ callHookParallel (3) 53109ms
name hz min max mean p75 p99 p995 p999 rme samples
· callHookParallel 984,106.82 0.0008 0.3530 0.0010 0.0010 0.0015 0.0017 0.0042 ±0.75% 492092 fastest
· callHookParallel with argument 914,175.54 0.0008 0.7117 0.0011 0.0010 0.0015 0.0016 0.0042 ±2.13% 457088
· callHookParallel with five arguments 822,263.16 0.0008 1.9657 0.0012 0.0010 0.0015 0.0017 0.0046 ±3.48% 411132 slowest
✓ hook (2) 32000ms
name hz min max mean p75 p99 p995 p999 rme samples
· hook 1,789,657.75 0.0003 5.2691 0.0006 0.0004 0.0009 0.0016 0.0026 ±7.30% 894829 fastest
· hook with deprecate 1,772,630.95 0.0003 9.5252 0.0006 0.0004 0.0007 0.0011 0.0024 ±9.79% 886316
✓ addHooks (1) 4528ms
name hz min max mean p75 p99 p995 p999 rme samples
· addHooks 1,879,948.50 0.0004 10.4049 0.0005 0.0005 0.0008 0.0009 0.0013 ±5.74% 939975 fastest
✓ empty removeHook (1) 29513ms
name hz min max mean p75 p99 p995 p999 rme samples
· empty removeHook 27,107,740.59 0.0000 0.1260 0.0000 0.0000 0.0000 0.0001 0.0002 ±0.15% 13553871 fastest
✓ removeHook (2) 52477ms
name hz min max mean p75 p99 p995 p999 rme samples
· removeHook 25,901,773.12 0.0000 0.0860 0.0000 0.0000 0.0001 0.0001 0.0002 ±0.13% 12950887
· removeHook with extra 26,255,315.21 0.0000 1.1736 0.0000 0.0000 0.0001 0.0001 0.0002 ±0.47% 13127658 fastest
@pi0 can you review this?
Thanks for your efforts on this, dear @negezor. These all look like good improvements ❤️ Sure, I will review carefully once I have time (please read this for some understanding)
@pi0 gentle ping 😅