citus icon indicating copy to clipboard operation
citus copied to clipboard

New get_cluster_clock() UDF prototype

Open tejeswarm opened this issue 2 years ago • 6 comments

Routine that returns monotonically increasing logical clock value, as close to epoch value (in milli seconds) as possible, with the guarantee that it will never go back from its current value even after a restart (not hard crash).

DESCRIPTION: PR description that will go into the change log, up to 78 characters

tejeswarm avatar Apr 09 '22 05:04 tejeswarm

Codecov Report

Merging #5898 (d1d27eb) into master (6254f30) will increase coverage by 0.00%. The diff coverage is 90.76%.

:exclamation: Current head d1d27eb differs from pull request most recent head 5855192. Consider uploading reports for the commit 5855192 to get more accurate results

@@           Coverage Diff            @@
##           master    #5898    +/-   ##
========================================
  Coverage   92.63%   92.63%            
========================================
  Files         232      232            
  Lines       47929    48118   +189     
========================================
+ Hits        44399    44575   +176     
- Misses       3530     3543    +13     

codecov[bot] avatar Apr 11 '22 22:04 codecov[bot]

High-level thought: Should we use the clock value as part of the transaction ID timestamp instead of GetCurrentTimestamp? (probably guard this behaviour with a GUC in case it leads to issues)

That way, we can see the oldest possible clock value that might be committed in get_all_active_transactions().

Downside is that the transaction ID generation happens earlier than get_cluster_clock(). On the other hand, we also have no guarantees about the amount of time between a call to get_cluster_clock() and commit. It is necessary for us to know which timestamps might still be committed, and we can get that when using get_all_active_transactions().

marcocitus avatar Apr 21 '22 12:04 marcocitus

High-level thought: Should we use the clock value as part of the transaction ID timestamp instead of GetCurrentTimestamp? (probably guard this behaviour with a GUC in case it leads to issues)

That way, we can see the oldest possible clock value that might be committed in get_all_active_transactions().

Downside is that the transaction ID generation happens earlier than get_cluster_clock(). On the other hand, we also have no guarantees about the amount of time between a call to get_cluster_clock() and commit. It is necessary for us to know which timestamps might still be committed, and we can get that when using get_all_active_transactions().

I'm thinking, we should keep the timestamp as is but add a new field for clock (may be hidden for now) and populate it at during CommitTransaction()->RecordTransactionCommit(..it goes to WAL here and seals it..), and later in the xactcallback stamp the transaction clock when the event is XACT_EVENT_COMMIT. Once we tie the loose ends, we can replace the clock with the timestamp. Note: As you already pointed out, we still have window between the transaction-commit recorded in the WAL and the get_cluster_clock() in the callback.

tejeswarm avatar Apr 21 '22 21:04 tejeswarm

Accidentally closed it, reopening.

tejeswarm avatar Apr 21 '22 21:04 tejeswarm

LGTM overall, but I guess @thanodnl also would like to have a look, and I'd like to do some manual testing today or tomorrow; so not approving yet.

And we also need to move sql files for those two new UDFs to under new version's directory after releasing Citus 11 today, and that would also help fixing merge conflicts.

onurctirtir avatar Jun 16 '22 11:06 onurctirtir

Is there a design document for this PR? (ideally in a large comment or .md file, if we want to merge this)

It seems interesting, but the guarantees we get, motivations, and applications are not entirely clear to me.

marcocitus avatar Jun 20 '22 09:06 marcocitus

superseded via #6315

onderkalaci avatar Oct 21 '22 14:10 onderkalaci