wal2json icon indicating copy to clipboard operation
wal2json copied to clipboard

Add support to include a list of column names for updates in the case…

Open greigwise opened this issue 5 years ago • 7 comments

… that toasted data for those columns is missing

I have added an option called include-missing-toast so that an additional data element will be added to the output which would list the names of all the columns which are not included in the case where the toast fields were not updated. This way consumers of the data could easily determine that certain fields are missing from the output and handle this case accordingly.

greigwise avatar Jan 23 '19 22:01 greigwise

Could you elaborate the use case? If this feature is related to UPDATE, the columns that was not informed will stay as is, hence, wal2json don't need to mention them. Even if you force with REPLICA IDENTITY FULL, you won't get them.

eulerto avatar Feb 12 '19 16:02 eulerto

Sure. So, I am creating a data warehouse where I want to use the logical decoding as a source to stage changes so I can have a full audit trail of the history of my data (as it's updated over time). The process I use to update this data back into my data warehouse expects all the columns to be present on an update.

If I know that certain columns were omitted from the data stream (due to the fact that they are un-updated TOAST fields), I can easily go back and re-read the old values for those fields. Otherwise, I have to create some crazy logic where I compare the list of columns given by wal2json with the list of actual columns in the table and figure out what's missing that way.

So, in short, it's just a way to know which columns were excluded from the data stream in the event of an update so that I can handle that case specially.

See also that I added a test case and updated the docs for this.

Thanks for your consideration.

greigwise avatar Feb 12 '19 17:02 greigwise

Hello. Is there any further information I can provide for this or anything else I need to do before we can proceed with this? Thanks.

greigwise avatar Mar 05 '19 17:03 greigwise

@euIerto I think that I'll run into this issue as well shortly. Nearly an identical use case.

jmealo avatar Apr 06 '19 19:04 jmealo

As an alternative to what I've done here, I could make it so that instead of having a new data element "missingtoastcols", I could include the "missing columns" in the columnnames array and then just use some kind of sentinel value in the columnvalues array indicating that that value is toasted and was not changed. If this is preferable to what I have here, let me know and I can submit a new PR.

greigwise avatar Apr 10 '19 19:04 greigwise

@eulerto this would be really useful. What would it take to get this tested and merged?

jfinzel avatar May 30 '19 00:05 jfinzel

Any reason why this change is not merged? Is there any way in the changelogs to find out which of the columns are made null because they are unchanged toasted columns?

If we clearly know all the column names which are made null because they are toasted and unchanged, we can differentiate between the actual nulls and nulls because of the unchanged toasted column.

ramaguruprakash avatar Apr 08 '21 23:04 ramaguruprakash