enketo-core icon indicating copy to clipboard operation
enketo-core copied to clipboard

Refactor: unified (re)computation and caching of repeat indexes

Open eyelidlessness opened this issue 2 years ago • 0 comments

Motivation

Determining the repeat position for a given node is currently expensive, with a few disparate implementations throughout the codebase. Some of these implementations are challenging to maintain/understand. My goal is to:

  1. Provide a single, unified implementation for both the model and the view
  2. Use native DOM APIs to minimize the performance impact
  3. Cache the result for affected nodes

How

The DOM provides a MutationObserver API which is very well suited for this purpose. It's specifically designed for reacting to changes to DOM state in order to update dependent state. It has good performance characteristics, observers are asynchronous/non-blocking by default but can be synchronously flushed with the takeRecord method for cases where synchronous updates are necessary. Observers can be filtered to be as broad or fine-grained as needed—e.g. by the childList node type filter, by attaching to select parent elements like the parent element of a repeat, and by shallowness or depth with the subtree filter.

While MutationObserver is great for reacting to changes, we'll also need to initialize repeat state on form load so we know what to update. For this, the DOM also provides the TreeWalker API, which also has excellent performance characteristics and helpful filters to minimize the work performed while traversing the DOM.

Finally, for implementing the internal state representation itself, WeakMap is another great fit for the problem. The idea here is to maintain an in-memory mapping between repeat instances (and their pertinent children) and their current repeat paths and indexes, as well as between their model and view representations. Lookups are fast, and key references can be freely garbage collected when they're no longer in scope.

Impact

Besides the maintenance benefits of having a unified implementation and using built-in standards which are well fit to the task, it's no mistake that a lot of the discussion above is focused on performance. I have a working proof of concept implementing most of what I described above. I've been using the SOAR form fixture as a test case while building out the proof of concept because it's known to load slowly due to repeat initialization.

Changing the implementation alone won't necessarily show much load time improvement, but there are numerous places in the codebase which call out workarounds to avoid the expensive index lookup, and several more workarounds which aren't explicitly called out. By reacting to changes as they occur, and caching the results, those workarounds should no longer be necessary. This will likely have a more significant performance impact.

More importantly, removing those workarounds would reduce the surface area of repeat-specific code, thereby reducing the potential for subtle bugs and inconsistencies, and increasing the ability to deliver fixes and improvements without as much consideration for the impact of/on repeats. And by unifying the implementation, it will ensure repeat state in both the model and view are always consistent.

Lastly, this a first step that will provide a foundation for similar state maintenance improvements. It will establish a precedent for defining a mapping between view and model nodes more generally (which I may include in this work, it's pretty much already implemented in my proof of concept), and potentially other more specific dependencies between nodes.

eyelidlessness avatar Jun 02 '22 21:06 eyelidlessness