
Reduce memory overhead of tracking

realyze opened this pull request 1 year ago • 8 comments

TL;DR: Preallocating a size + 100 item array when tracking dependencies seems overly generous and can lead to more frequent garbage collections, causing jank. This PR replaces that with relative headroom: we preallocate 120% of the space used by dependencies in the last run.
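For illustration, the two strategies could be sketched like this (hypothetical helpers, not the actual mobx source; `prevDepsCount` stands for the number of dependencies observed in the previous run):

```javascript
// Current behaviour: always reserve a large constant amount of extra space,
// regardless of how many dependencies were actually used last time.
function preallocateOld(prevDepsCount) {
  return new Array(prevDepsCount + 100);
}

// Proposed behaviour: 20% relative headroom on top of the last run's usage.
function preallocateNew(prevDepsCount) {
  return new Array(Math.ceil(prevDepsCount * 1.2));
}
```

For a derivation that tracked 5 dependencies last run, the old strategy reserves 105 slots while the new one reserves 6.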

Example

Let's consider this scenario where we read 1M computed values in an autorun.

const obs = mobx.observable.box(1);
const arr = Array.from({length: 1_000_000}, () => mobx.computed(() => obs.get()));

mobx.autorun(() => {
  arr.forEach((val) => val.get());
});

function main() {
  // Recompute computeds in autorun.
  mobx.runInAction(() => obs.set(obs.get() + 1));
}

window.__main = () => main();

A million may seem like a lot, but if we read e.g. 2,000 computed values in every frame of an interaction and the interaction takes 5 seconds, we get 60 fps * 5 s * 2,000 reads per frame = 600K reads already. So: not super unrealistic, actually.

I've done some testing on our Canva codebase and the overhead of preallocations during some interactions (e.g. panning) is ~98% (i.e., we only use 2% of the preallocated space, rest gets GC'd).

Impact

Memory profiling with the mobx production build from CDN, I get ~430MB allocated when running __main() (i.e., we have huge swathes of memory that'll need to be GC'd). With this change, this drops to ~36MB.

(Two screenshots, Feb 19 2024: memory allocation profiles before and after the change.)

Devtools profiler also (unsurprisingly) reports shorter time spent in GC for the build with this change (37ms vs 57ms); overall the __main() function runs ~30% faster with this change (compared to current main).

Of course, this is a lab scenario, so the difference in real-world cases would presumably be different. However, I think it's quite reasonable to expect that in the P99 case the number of dependencies between derivation runs doesn't increase dramatically, so 20% headroom instead of a large constant one seems reasonable.

Code change checklist

  • [ ] Added/updated unit tests
  • [ ] Updated /docs. For new functionality, at least API.md should be updated
  • [x] Verified that there is no significant performance drop (yarn mobx test:performance)

realyze avatar Feb 18 '24 10:02 realyze

🦋 Changeset detected

Latest commit: fcbcf1797f0f122665ac72a7cb0ff68891a477d9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type  |
| ---- | ----- |
| mobx | Patch |


changeset-bot[bot] avatar Feb 18 '24 10:02 changeset-bot[bot]

Ah sorry, it's been a while since I last contributed to OSS, I forgot that creating a PR on my fork will create the PR on the main repo. 😆 Let me get some memory profiling numbers first (in the coming days!) and update the description.

realyze avatar Feb 18 '24 10:02 realyze

@urugator 👋 WDYT? I'm not sure who to ask for review here but you've always been awesomely responsive so I thought of you. 😅 I'd be keen to get this in since I'm hopeful it could help remove some GC-related jank during intensive interactions.

realyze avatar Feb 22 '24 04:02 realyze

I am wondering about 2 things:

  1. What the right heuristic should be. Would it make sense to have some minimum or a different calculation? 20% on low numbers does almost nothing. Would it be worth making it configurable (globally/per derivation)?
  2. Is preallocating room for variations actually needed/useful? I would imagine engines already handle this. Just skimmed through, but e.g.: https://itnext.io/v8-deep-dives-understanding-array-internals-5b17d7a28ecc

This allows V8 to achieve constant amortized time for push() and similar operations that grow the array. The following formula is used to determine the new size in situations when the internal array is not enough: new_capacity = (old_capacity + 50%) + 16
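The quoted growth formula can be written out as a tiny helper (illustrative only; it mirrors the article's formula, not V8 internals — the 50% uses integer halving):

```javascript
// new_capacity = (old_capacity + 50%) + 16
function v8NewCapacity(oldCapacity) {
  return oldCapacity + (oldCapacity >> 1) + 16;
}
```

So even an empty internal array grows to 16 slots on the first push, and a 16-slot array grows to 40.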

I will try to notify @mweststrate to provide feedback/blessing.

urugator avatar Feb 22 '24 09:02 urugator

Great feedback, thanks, @urugator !

Would it make sense to have some minimum or different calculation? 20% on low numbers does almost nothing. Would it be worth to make it configurable (globally/per derivation)?

Good question. We use Math.ceil, so we'd always preallocate at least one extra slot... which may or may not be enough, but it's in line with the thinking that the deps count will likely not change much between runs (i.e., more likely than not it would be enough).
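Concretely, the Math.ceil point for small dependency counts looks like this (hypothetical one-liner, not the mobx source):

```javascript
// 20% headroom, rounded up: guarantees at least one spare slot
// even when the previous run tracked only a handful of dependencies.
const headroom = (n) => Math.ceil(n * 1.2);

console.log(headroom(1)); // 2
console.log(headroom(2)); // 3
console.log(headroom(4)); // 5
```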

Is prealocating room for variations actually needed/useful? I would imagine engines already handle this. Just skimmed through, but eg: itnext.io/v8-deep-dives-understanding-array-internals-5b17d7a28ecc

Excellent point. 👍 And I agree. I think the initial-run preallocation is the one that might make sense (because it can prevent the runtime from reallocating the array). But then again, it's unclear to me whether, in the average case, what we save by avoiding array reallocations isn't lost to the extra time we spend in GC. 🤷

Let's see if @mweststrate could shed more light on what the thinking was when we added the preallocation logic (which was...8 years ago so I hope his memory is better than mine 😆 )

realyze avatar Feb 23 '24 04:02 realyze

Sorry for the late follow-up, I've been a bit swamped lately. Did you also run the performance suite in the repo, @realyze? But I think you are right, this seems overly aggressive. I think deps are either typically pretty stable or, if not, hugely varying (e.g. a conditional filter over a large map or so). I suspect you could even reduce it to just derivation.observing_.length and leave the rest to the engine.

mweststrate avatar Feb 23 '24 10:02 mweststrate

Thanks for the reply, @mweststrate ! I ran yarn mobx test:performance (I assume that's the perf suite) and it all passed.

realyze avatar Feb 23 '24 11:02 realyze

Yeah, that is correct. Did you compare the numbers before/after? (I wouldn't expect anything stat-sig, but just in case.)


mweststrate avatar Feb 23 '24 11:02 mweststrate

OK, here's my results based on running the proxy perf suite 10 times and recording the time as reported by the shell time call: https://docs.google.com/spreadsheets/d/1b_TE_3xjiLhfxCKwc9UGOz_g-LUaG3cPNBpV3lzes9E/edit?usp=sharing

Both user and sys times are faster with this PR than with current mobx master. The user time has slightly higher variance, so it's probably about the same. But the sys time (which should reflect the time spent on memory allocations) is significantly lower for this PR: ~9% less time spent in sys, with slightly lower variance compared to the original.

So at least within the context of the perf suite, this change seems to make mobx faster.

realyze avatar Mar 04 '24 22:03 realyze

This looks great, thanks for the detailed measurement. LGTM!

mweststrate avatar Mar 05 '24 08:03 mweststrate

I don't want to delay this PR any further, just a thought I want to leave here: in the case of zero deps, we could set newObserving_ to null/undefined and create the array lazily when needed. This would avoid the need to allocate and then GC the array, which, memory-wise, is not zero-sized even at zero length. But it could also negatively impact all other cases, due to the extra check for undefined when registering a dependency.
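A minimal sketch of that idea (hypothetical, assuming a heavily simplified derivation shape; not the actual mobx source):

```javascript
class Derivation {
  constructor() {
    this.newObserving_ = undefined; // no allocation at all for zero-dep runs
    this.unboundDepsCount_ = 0;
  }

  registerDependency(observable) {
    // The trade-off mentioned above: this undefined check now runs
    // on every single dependency registration.
    if (this.newObserving_ === undefined) {
      this.newObserving_ = [];
    }
    this.newObserving_[this.unboundDepsCount_++] = observable;
  }
}
```

A derivation that never reads an observable never allocates the array; the first read pays for the lazy creation.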

urugator avatar Mar 06 '24 13:03 urugator

In case of zero deps, we could set newObserving_ to null/undefined and create the array lazily when needed

I agree. On the other hand, glancing at the code, there are some non-null assertions in the codebase (derivation.newObserving_!) so I'm not sure what that could possibly break. 😅 So I'd rather take this win (where I'm confident mobx won't start throwing NPEs at us) and leave further optimizations for later, with the obvious risk that they might never happen. But as I tried to argue here, I believe optimizing the non-initial case will yield the biggest benefits anyway.

realyze avatar Mar 06 '24 22:03 realyze

One last thing, can you add a changeset please? yarn changeset. Sorry, I didn't notice until I was about to merge...

urugator avatar Mar 07 '24 08:03 urugator

Ah neat, done via fcbcf17!

realyze avatar Mar 07 '24 10:03 realyze

Thank you!

urugator avatar Mar 07 '24 10:03 urugator

@mweststrate / @urugator I'm not sure how the mobx npm release process works but could we please release 6.12.1 soon? 🙏

realyze avatar Mar 12 '24 22:03 realyze

how the mobx npm release process works

Sadly, it tends not to work: https://github.com/mobxjs/mobx/actions/runs/8213813479/job/22465634950#step:6:65

urugator avatar Mar 13 '24 16:03 urugator

Ugh again. Sorry was focused on Immer last week, but will try to take a look at Friday.


mweststrate avatar Mar 13 '24 20:03 mweststrate

@mweststrate gentle ping please 😆 Anything I could do to help?

realyze avatar Mar 18 '24 22:03 realyze

ok think I found the culprit

mweststrate avatar Mar 28 '24 20:03 mweststrate