Update FATE transaction ids to be globally unique across multiple stores
While working on changes for #3559 to store FATE operations inside an Accumulo table, I realized that we will need some way to track FATE transactions globally. After we update FATE to store operations in Accumulo, we will still need the ZK store for FATE operations on the root and metadata tables, for example. We might also want multiple Accumulo stores depending on the design, such as storing user operations separately from system operations in different tables.
Regardless of how the exact design turns out, we are going to need multiple stores. Right now FATE transaction ids are only unique within a single store, so we need some way to make them unique across stores. At the same time, we could refactor the transaction id into a better id type than just a long.
This can help prevent bugs like creating a FATE operation in FATE instance A and then trying to use its id in FATE instance B. It could also be useful for debugging if, when a FATE id is logged, it's easy to see which FATE instance the id came from.
I would like to work on this
The end goal is to have the stronger type FateId replace the current representation of a transaction id (which is just a long). This was prompted by the addition of the AccumuloStore class: a transaction is now associated with one of two FATE instance types, META (for ZooStore) or USER (for AccumuloStore). FateId is a new class that combines the FateInstanceType and the transaction id.
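To make the idea concrete, here is a minimal sketch of a value type that pairs an instance type with the long transaction id. It is not the actual Accumulo FateId class. The `FateIdSketch` wrapper, the `Store` helper, the method names, and the `FATE:<type>:<hex>` log format are all assumptions made for illustration.

```java
import java.util.Objects;

public class FateIdSketch {

  // The two instance types named in this issue.
  enum FateInstanceType { META, USER }

  // Hypothetical value type: instance type plus the numeric transaction id.
  static final class FateId {
    private final FateInstanceType type;
    private final long txid;

    FateId(FateInstanceType type, long txid) {
      this.type = Objects.requireNonNull(type, "type");
      this.txid = txid;
    }

    FateInstanceType getType() {
      return type;
    }

    long getTxid() {
      return txid;
    }

    // Assumed log format; it makes the originating instance visible in logs.
    String canonical() {
      return "FATE:" + type + ":" + String.format("%016x", txid);
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof FateId)) {
        return false;
      }
      FateId other = (FateId) o;
      return type == other.type && txid == other.txid;
    }

    @Override
    public int hashCode() {
      return Objects.hash(type, txid);
    }

    @Override
    public String toString() {
      return canonical();
    }
  }

  // Hypothetical store wrapper that rejects ids created by another instance type.
  static final class Store {
    private final FateInstanceType type;

    Store(FateInstanceType type) {
      this.type = Objects.requireNonNull(type, "type");
    }

    void checkOwns(FateId id) {
      if (id.getType() != type) {
        throw new IllegalArgumentException(id + " does not belong to a " + type + " store");
      }
    }
  }

  public static void main(String[] args) {
    FateId metaId = new FateId(FateInstanceType.META, 42L);
    FateId userId = new FateId(FateInstanceType.USER, 42L);

    // Same long value, different instance types: the ids are not equal.
    System.out.println(metaId + " equals " + userId + "? " + metaId.equals(userId));

    Store userStore = new Store(FateInstanceType.USER);
    userStore.checkOwns(userId); // accepted
    try {
      userStore.checkOwns(metaId); // rejected: id came from the META instance
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With a type like this, two transactions that happen to share the same long value in different stores no longer compare as equal, and a store can fail fast when handed an id from the wrong instance.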
TODO list for this issue:
- [x] Create new class FateId to replace FateTxId
- [x] Change the stores to use FateId. Start with ReadOnlyFateStores. Resolve issues stemming from these changes.
- [x] Change Fate to use FateId. Resolve issues stemming from these changes.
(the above have been completed and merged in by PR#4191)
- [x] Change Repo to use FateId. Resolve issues stemming from these changes.
(the above has been completed and merged in by PR#4228)
- [x] CompactionConfigStorage, SelectedFiles, TabletUpdates, TabletMetadata, Ample need to be updated to use FateId. Deferred for now. All the places these classes are used have been marked "ELASTICITY_TODO DEFERRED - ISSUE 4044"
(the above has been completed and merged by PR#4247)
- [x] TabletOperationId, VolumeManager, TabletRefresher, TExternalCompactionJob, and Utils need to be updated to use FateId. Deferred for now. All the places these classes are used have been marked "ELASTICITY_TODO DEFERRED - ISSUE 4044"
- [x] A couple of deferred changes to Compactor and CompactionCoordinator (in PR#4258) (need PR#4247 merged first). Marked with "ELASTICITY_TODO DEFERRED - ISSUE 4044".
(the above has been completed and merged by PR#4258)
- [x] AdminUtil and Admin need to use FateId. Deferred for now. Marked with "ELASTICITY_TODO DEFERRED - ISSUE 4044". (issue#4168)
(the above has been completed and merged by PR#4350)
- [x] Delete FateTxId when all uses have been replaced with FateId (issue#4275)
(the above has been completed and merged by PR#4370)
During this change, it would be desirable for the FATE transaction ids to natively sort by creation timestamp. That would allow the order of operations to be determined solely by examining the FATE ids. This would apply to listing things in the fate store as well as looking at logs. If FATE_ID_1 < FATE_ID_2, then it can be immediately seen that FATE_ID_1 was created before FATE_ID_2 (for whatever timestamp precision is available).
One way to provide this could be to use UUIDs that conform to the emerging UUIDv7 standard. The gist of UUIDv7: they are 128-bit UUIDs that put the timestamp portion of the UUID first and random bits at the end, with other identifying info in the middle. There are UUIDv7-compatible variants that allow for sub-second timing information if that precision is wanted.
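As a rough illustration of the ordering property, the sketch below builds UUIDv7-style values by hand using `java.util.UUID`: a 48-bit millisecond timestamp in the top bits, then the version nibble, then random bits. The class name `UuidV7Sketch` and its helper methods are hypothetical, and this is not a proposal for how Accumulo should generate ids.

```java
import java.security.SecureRandom;
import java.util.UUID;

public class UuidV7Sketch {

  private static final SecureRandom RANDOM = new SecureRandom();

  // Build a UUIDv7-style value for the given Unix time in milliseconds.
  static UUID uuidV7(long epochMillis) {
    long randA = RANDOM.nextInt(1 << 12); // 12 random bits
    long msb = ((epochMillis & 0xFFFFFFFFFFFFL) << 16) // 48-bit timestamp first
        | (0x7L << 12) // version nibble = 7
        | randA;
    long randB = RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL; // 62 random bits
    long lsb = 0x8000000000000000L | randB; // variant bits "10"
    return new UUID(msb, lsb);
  }

  // Recover the millisecond timestamp stored in the top 48 bits.
  static long timestampOf(UUID id) {
    return id.getMostSignificantBits() >>> 16;
  }

  public static void main(String[] args) {
    UUID first = uuidV7(System.currentTimeMillis());
    UUID second = uuidV7(System.currentTimeMillis() + 1);

    System.out.println(first + " created at " + timestampOf(first));
    System.out.println(second + " created at " + timestampOf(second));

    // The hex string form sorts by timestamp first, so plain string
    // comparison orders ids created in different milliseconds.
    System.out.println("first < second: " + (first.toString().compareTo(second.toString()) < 0));
  }
}
```

Two ids created within the same millisecond are ordered only by their random bits in this sketch, which matches the "whatever timestamp precision is available" caveat above.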
All of the above TODOs have been completed. I believe this issue can be closed now.