feat(sync-service): Clean up publication filters

Open msfstef opened this issue 1 year ago • 3 comments

Closes https://github.com/electric-sql/electric/issues/1774

This work started as an attempt to introduce column filters (see https://github.com/electric-sql/electric/issues/1831) but hit a roadblock because we use REPLICA IDENTITY FULL. It does, however, also take care of cleaning up the publication filters.

  • Introduced a single process for updating the publication. We were locking on it before anyway, so we might as well linearise the updates ourselves.
  • The process maintains a reference-counted structure of the filters per relation, including where clauses and filtered columns, in order to produce the correct overall filter for each relation.
  • Updates to the publication are debounced so that many shape creations can be batched together.
  • Every update does a complete rewrite of the publication filters so that they stay clean. Also introduced a remove_shape call so that if Electric is left with no shapes it also has no subscriptions to tables.
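The reference-counting idea in the list above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual PublicationManager code: the names `FilterRegistry`, `RelationFilters`, `add_shape` and `publication_filters` are hypothetical, and where clauses are treated as opaque strings. A shape with no where clause or no column list is assumed to need all rows or all columns of its relation.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class RelationFilters:
    """Reference counts for one relation (hypothetical structure)."""
    where_counts: Counter = field(default_factory=Counter)
    column_counts: Counter = field(default_factory=Counter)
    unfiltered_rows: int = 0      # shapes without a where clause
    unfiltered_columns: int = 0   # shapes without a column list
    shapes: int = 0               # total live shapes on this relation


class FilterRegistry:
    """Tracks which filters are still needed by at least one shape."""

    def __init__(self):
        self._relations = defaultdict(RelationFilters)

    def add_shape(self, relation, where=None, columns=None):
        self._update(relation, where, columns, +1)

    def remove_shape(self, relation, where=None, columns=None):
        if relation not in self._relations:
            return
        self._update(relation, where, columns, -1)
        # No shapes left: drop the relation so it leaves the publication.
        if self._relations[relation].shapes <= 0:
            del self._relations[relation]

    def _update(self, relation, where, columns, delta):
        rel = self._relations[relation]
        rel.shapes += delta
        if where is None:
            rel.unfiltered_rows += delta
        else:
            self._bump(rel.where_counts, where, delta)
        if columns is None:
            rel.unfiltered_columns += delta
        else:
            for column in columns:
                self._bump(rel.column_counts, column, delta)

    @staticmethod
    def _bump(counter, key, delta):
        counter[key] += delta
        if counter[key] <= 0:
            del counter[key]

    def publication_filters(self):
        """Complete snapshot: relation -> (where clauses, columns).

        None means "no filter", because at least one shape needs
        every row or every column of that relation.
        """
        snapshot = {}
        for relation, rel in self._relations.items():
            wheres = None if rel.unfiltered_rows else sorted(rel.where_counts)
            columns = None if rel.unfiltered_columns else sorted(rel.column_counts)
            snapshot[relation] = (wheres, columns)
        return snapshot
```

Because `publication_filters()` always returns the complete desired state, a debounced updater could call it once after a burst of `add_shape` and `remove_shape` calls and rewrite the publication from that snapshot, rather than applying incremental changes.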

TODOs

  • [x] Write tests for PublicationManager
  • [x] Write procedure for recovering in-memory state from shape_status.list_shapes in recover_shapes
  • [ ] Split where clauses at top-level ANDs to improve filter optimality (suggested by @icehaunter) - [edit: not doing this now, as we can be smart about this and do even more "merging" of where clauses, like x = 1 and x = 2 to x in (1, 2) - separate PR]
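To illustrate the kind of where-clause "merging" mentioned in the last TODO, here is a rough Python sketch. It is not part of this PR. It assumes the clauses of different shapes on one relation are combined with OR, and it only recognises the simplest form, `column = literal` with an integer or a plain quoted string. The function name `merge_equalities` is hypothetical, and anything the pattern does not recognise is passed through unchanged.

```python
import re
from collections import defaultdict

# Matches only "<identifier> = <integer or simple 'string'>".
_EQUALITY = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(\d+|'[^']*')\s*$")


def merge_equalities(clauses):
    """Collapse OR-ed equality clauses on the same column into one IN list."""
    literals_by_column = defaultdict(list)
    untouched = []
    for clause in clauses:
        match = _EQUALITY.match(clause)
        if match is None:
            untouched.append(clause)  # too complex for this sketch
            continue
        column, literal = match.groups()
        if literal not in literals_by_column[column]:
            literals_by_column[column].append(literal)

    merged = []
    for column, literals in literals_by_column.items():
        if len(literals) == 1:
            merged.append(f"{column} = {literals[0]}")
        else:
            merged.append(f"{column} IN ({', '.join(literals)})")
    return merged + untouched


print(merge_equalities(["x = 1", "x = 2", "y > 3"]))
# prints ['x IN (1, 2)', 'y > 3']
```

A real implementation would work on the parsed expression rather than on strings, which is also where splitting at top-level ANDs would fit.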

msfstef avatar Dec 11 '24 17:12 msfstef

Regarding https://github.com/electric-sql/electric/issues/1831: we need to figure out when we actually need REPLICA IDENTITY FULL and make sure it is only set when necessary (with column filters removed in that case).

Once we determine whether we can afford not to use REPLICA IDENTITY FULL at all times, we should be able to do column filtering easily with these changes.

msfstef avatar Dec 12 '24 14:12 msfstef

benchmark this

msfstef avatar Dec 12 '24 16:12 msfstef

Benchmark results, triggered for f8edd

concurrent shape creation completed

OLD: [benchmark image]

NEW: [benchmark image]

github-actions[bot] avatar Dec 12 '24 17:12 github-actions[bot]

benchmark this

msfstef avatar Dec 17 '24 10:12 msfstef

Benchmark results, triggered for 8d697

  • write fanout completed

write fanout results

  • diverse shape fanout completed

diverse shape fanout results

  • concurrent shape creation completed

concurrent shape creation results

  • many shapes one client latency completed

many shapes one client latency results

  • unrelated shapes one client latency completed

unrelated shapes one client latency results

github-actions[bot] avatar Dec 17 '24 10:12 github-actions[bot]