
Electric should be resilient to disconnecting from Postgres

Open KyleAMathews opened this issue 1 year ago • 6 comments

Whether it's Postgres restarting or network connectivity issues.

  • [ ] Write a test to verify Electric's graceful handling of Postgres connection closures.

KyleAMathews avatar Jul 24 '24 22:07 KyleAMathews

I suggest moving it out of Second Alpha, given that it will only be fully addressed by electric-sql/electric-next#34 and electric-sql/electric-next#106.

balegas avatar Jul 25 '24 09:07 balegas

Hmm yeah. How much work is left to fix those, @alco? Basic resiliency to backend services starting and stopping seems worth getting in early, since the goal is to make Electric stable for developing locally and deploying relatively simple applications. The theme of "Second Alpha" is basically "make all the normal things work".

KyleAMathews avatar Jul 25 '24 12:07 KyleAMathews

https://github.com/electric-sql/electric-next/pull/34 has been ready for review since last week. I keep rebasing it from time to time to resolve conflicts. It solves the core need of being able to resume replication from Postgres after the replication connection is closed and reopened.

Now that I'm thinking of this, we're missing a test that would verify idempotent processing of transactions if it so happens that the replication connection closes right after we've persisted a transaction to the shape log but just before we have acknowledged its LSN to Postgres. Adding this as a TODO to the 2nd PR - https://github.com/electric-sql/electric-next/pull/106.
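To make the failure window concrete, here is a minimal sketch in Python (not Electric's actual Elixir code). `ShapeLog`, `apply`, and `last_persisted_lsn` are hypothetical names, and LSNs are simplified to plain integers. It assumes the log remembers the highest LSN it has persisted, so a transaction that Postgres resends after a reconnect is skipped rather than appended twice.

```python
from dataclasses import dataclass, field


@dataclass
class ShapeLog:
    """Hypothetical in-memory stand-in for a persisted shape log."""

    entries: list = field(default_factory=list)
    last_persisted_lsn: int = 0

    def apply(self, lsn: int, change: str) -> bool:
        """Append a transaction unless its LSN was already persisted.

        Returns True if the transaction was appended, False if it was skipped.
        """
        if lsn <= self.last_persisted_lsn:
            return False  # replayed after a reconnect; already in the log
        self.entries.append((lsn, change))
        self.last_persisted_lsn = lsn
        return True


log = ShapeLog()
log.apply(100, "insert row 1")
# The connection drops here, before LSN 100 is acknowledged to Postgres.
# On reconnect, Postgres resends everything after the last acknowledged LSN.
replayed = log.apply(100, "insert row 1")  # skipped as a duplicate
fresh = log.apply(101, "insert row 2")     # appended as usual
```

A test for the real system would follow the same shape: persist a transaction, drop the replication connection before the acknowledgement, reconnect, and assert that the shape log contains the transaction exactly once.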

The latter PR is missing one key thing: reconnection logic. To be able to handle replication slot errors, I had to disable the auto-reconnection in Postgrex.ReplicationConnection that we had been relying on. We now need to put our own reconnection logic in place.
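As a rough illustration of what custom reconnection logic can look like, here is a sketch in Python of retrying with exponential backoff. It is not the code from either PR. `connect_with_backoff` and its parameters are hypothetical, and it treats every `ConnectionError` as retryable, whereas a real implementation would need to treat replication slot errors separately.

```python
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.5,
                         max_delay=30.0, sleep=time.sleep):
    """Call `connect` until it succeeds, waiting longer after each failure.

    Re-raises the last ConnectionError once `max_attempts` is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, up to a cap


# Demo: a fake connection that fails twice (e.g. Postgres is restarting)
# and then succeeds. Delays are recorded instead of actually sleeping.
attempts = []


def flaky_connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("postgres unavailable")
    return "connected"


delays = []
result = connect_with_backoff(flaky_connect, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting, which is also useful for the automated resilience test this issue asks for.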

alco avatar Jul 25 '24 14:07 alco

Great! Let's get these reviewed and in early next week :shipit:

KyleAMathews avatar Jul 25 '24 15:07 KyleAMathews

https://github.com/electric-sql/electric-next/pull/205 makes Electric handle Postgres disconnection gracefully and reconnect when it's back up.

There's no automated testing to verify this behaviour yet. I'll leave this issue open until we have a way to test the resilience.

alco avatar Jul 30 '24 15:07 alco

@alco shall we create the test and close this?

balegas avatar Aug 07 '24 14:08 balegas