immutables icon indicating copy to clipboard operation
immutables copied to clipboard

How to efficiently track and store deltas between two HAMTs

Open aeftimia opened this issue 2 years ago • 1 comments

Ideally, I would like to be able to do

x = Map({'a': 2, 'c': 1})
y = x.update({'b': 3: 'c': 2)
z = y - x # magic
z == Map({'b': 3, 'c': 2})

Is there any particularly efficient way to do this in terms of memory and computational time? Ideally, I'd like z to share its data with y in the same way y shares its data with x. One way that comes to mind is

def diff(y, x):
  z = y
  for k, v in y.items():
    if k in x and x[k] == v:
      z = z.delete('k')
  return z

But this is O(N log N) (for log N get/set). Is there a more efficient way to go about this?

aeftimia avatar Dec 30 '21 01:12 aeftimia

This is not a HAMT specific problem. Depending on your use-case, you'd have to either pay a large computation during difference operation, or small computation for every single (or batch) change operation:

Option 1: Pretty much what you wrote, module your requirements.

Option 2: A wrapper class that appends every single set, delete, commit operation into a linear history list, enabling you to quickly get mutation differences as history index tuple. Differences between divergent histories from a common ancestor can be done efficiently by finding a largest common prefix of the two histories, which can be done with binary search thanks to cached hashes. But this option will not work if the two maps will not have a common ancestor map.

sizur avatar Jul 31 '23 22:07 sizur