storage/sources/postgres: support `REPLICA IDENTIFY DEFAULT`
We currently require that all tables participating in a publication have their REPLICA IDENTITY set to FULL. This is because we want to be able to ingest data with minimal in-memory state and rely on the upstream database to provide the previous data of a changed row in the case of retractions.
REPLICA IDENTITY FULL however increases the amount of data that transits the upstream's WAL and that might be operationally undesirable. Materialize already knows how to deal with key-based retractions through the UPSERT envelope so we should be able to support replicating postgres tables with REPLICA IDENTITY DEFAULT by passing them through an UPSERT operator, ideally automatically.
From the list of replicated tables we can determine which ones need an UPSERT state during purification so that we only render it when it's necessary.
Looks like we can close #15874 in favour of this one.
From the list of replicated tables we can determine which ones need an
UPSERTstate during purification so that we only render it when it's necessary.
A bit of danger here, though, in that we only fetch the metadata about those tables when the source is created, right? We should clearly specify the behavior in the docs. Running ALTER REPLICA IDENTITY in PostgreSQL after you've created the source in Materialize is not going to work, right?
That's right, it needs to stay the same. Fortunately this is detectable so there isn't a correctness ranger here, only UX danger which we should make clear in the docs