
Feature: retention time marker procedure

Open dantengsky opened this issue 2 years ago • 2 comments

Summary

Provides a way to mark historical snapshots as invisible, so that old snapshots (and possibly the data they reference) can fade away gradually.


Basic description of functionality:

Marks the latest visible snapshot of the given table.

  • A system configuration, for example table_retention_time: Duration
  • A system procedure that marks the latest visible snapshot of a table
    • by inserting or updating a specific key in the KV service
    • time travel over table data will respect this mark
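
To make "time travel respects this mark" concrete, here is a minimal sketch of a visibility filter. It reads the mark as a cutoff timestamp and assumes snapshots older than the cutoff are hidden; that reading, the Snapshot type, and the visible_snapshots helper are illustrative assumptions, not part of the proposal. The rule that the current snapshot stays visible comes from the notes further down.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical minimal snapshot record, for illustration only.
    snapshot_id: str
    timestamp: float  # seconds since epoch, recorded when the snapshot was written


def visible_snapshots(history: List[Snapshot], mark: Optional[float]) -> List[Snapshot]:
    """Return the snapshots that time travel may still reach.

    `history` is ordered oldest to newest; its last element is the current
    snapshot. `mark` is the stored cutoff timestamp, or None if the table
    has never been marked.
    """
    if not history:
        return []
    if mark is None:
        # No mark yet: nothing has been hidden.
        return list(history)
    current = history[-1]
    # Assumption: a snapshot older than the mark is invisible.
    older = [s for s in history[:-1] if s.timestamp >= mark]
    # The current snapshot is always visible, whatever the mark says.
    return older + [current]
```

With a mark far in the future (for example one written by a node whose clock runs ahead), only the current snapshot survives the filter.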

NOTE: The query nodes work on their local clocks, which are NOT perfectly synchronized.


Basic idea of the implementation:

  • Provide a system procedure, for example call system$retention_mark([database_name,] table_name), which will
    • fetch the metadata of the specified table
    • check whether the key LATEST_VISBLE_SNAPHOST of the given table exists, stored as LATEST_VISBLE_SNAPHOST/<tid> -> timestamp
      • if it exists and its value is less than (now() + table_retention_time), try to update it to (now() + table_retention_time)
      • if it does not exist, try to insert the KV pair
    • The mutations should, of course, be executed in a KV transaction
      • the most important invariant of this operation: the value of LATEST_VISBLE_SNAPHOST/<tid> must only ever increase
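
The steps above can be sketched as a compare-and-swap retry loop. This is only an illustration under stated assumptions: InMemoryKV is a hypothetical stand-in for the KV service (its get and compare_and_swap methods are not Databend's meta API), and the candidate value uses the expression exactly as written in the proposal.

```python
import threading
import time
from typing import Dict, Optional, Tuple


class InMemoryKV:
    """Hypothetical stand-in for the KV service, with versioned compare-and-swap."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[Tuple[int, float]]:
        with self._lock:
            return self._data.get(key)

    def compare_and_swap(self, key: str, expected_version: Optional[int], value: float) -> bool:
        """Write `value` only if the key is still at `expected_version` (None = absent)."""
        with self._lock:
            current = self._data.get(key)
            current_version = current[0] if current is not None else None
            if current_version != expected_version:
                return False  # someone else changed the key first
            new_version = 1 if current is None else current[0] + 1
            self._data[key] = (new_version, value)
            return True


def retention_mark(kv: InMemoryKV, table_id: int, retention_secs: float,
                   now: Optional[float] = None, max_retries: int = 8) -> float:
    """Insert or raise the mark for a table; never lower it. Returns the stored value."""
    key = f"LATEST_VISBLE_SNAPHOST/{table_id}"  # key name as spelled in the proposal
    now = time.time() if now is None else now
    candidate = now + retention_secs  # expression as written in the proposal
    for _ in range(max_retries):
        existing = kv.get(key)
        if existing is None:
            if kv.compare_and_swap(key, None, candidate):
                return candidate
        else:
            version, value = existing
            if value >= candidate:
                return value  # invariant: the stored value only increases
            if kv.compare_and_swap(key, version, candidate):
                return candidate
        # Lost a race with another marker: re-read and try again.
    raise RuntimeError("retention mark update kept conflicting; giving up")
```

A marker whose clock lags behind simply leaves the stored value untouched, which is what keeps the invariant intact when several query nodes run the procedure concurrently.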

Notes:

  • If database_name is not provided, use the context's current database name
  • A "hurried" marker, i.e. one whose clock runs far ahead of real time, may mark the LATEST_VISBLE_SNAPHOST "incorrectly"
    • Have to live with it, hoping it is not too crazy : )

      e.g. if the clock is two months ahead of time, the history of the table may not be accessible for the next two months.

      To mitigate this situation: the value of LATEST_VISBLE_SNAPHOST/<tid> could be set to the timestamp of a snapshot instead, by navigating to the snapshot S at (now() + table_retention_time). Thus, snapshots generated after S could still be accessible once clocks go back to normal.

    • The "current snapshot" referenced by the KV meta is always visible
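
A minimal sketch of the mitigation described above: store the timestamp of the snapshot S found by navigating to the target time, rather than the raw target time. It assumes "navigating to the snapshot at T" means picking the latest snapshot written at or before T; the helper name mark_from_snapshot is hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional


def mark_from_snapshot(snapshot_timestamps: List[float], target: float) -> Optional[float]:
    """Return the timestamp of the latest snapshot written at or before `target`.

    `snapshot_timestamps` must be sorted in ascending order. Returns None when
    every snapshot is newer than `target`, in which case there is nothing to mark.
    """
    idx = bisect_right(snapshot_timestamps, target)
    if idx == 0:
        return None
    return snapshot_timestamps[idx - 1]
```

If a node's clock is far ahead, `target` overshoots every existing snapshot and the function returns the timestamp of the newest one. The stored mark is then bounded by real snapshot history, so snapshots written later are not hidden by a value lying months in the future.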

dantengsky, May 23 '22 05:05