accumulo
accumulo copied to clipboard
Fate reservations moved out of memory
closes #4131 Fate reservations were moved out of memory. This is a WIP PR* for one part that is needed for having Fate be distributed in the future.
- Reservations for MetaFateStore were moved out of memory into ZooKeeper
- Reservations for UserFateStore were moved out of memory into the Accumulo Fate table
- Is an additional column which just indicates whether the FateId of that row is reserved or not
- Added test MultipleStoresIT
- Tests the new functionality of how reservations are stored
- I also included a couple tests in MultipleStoresIT that are testing the reserve/unreserve functionality and that it still works as intended. This can be moved from this class if desired.
- Verified existing tests in the fate test package (FateIT, FateOpsCommandsIT, FateStoreIT, ZooMutatorIT, FateMutatorImplIT, UserFateStoreIT) still pass
*There are still a few things I have marked as TODO that I would like input/suggestions on. These are marked in the code as "TODO 4131".
Marking this as ready for review, since it may not be reviewed otherwise
I have made the following changes in https://github.com/apache/accumulo/pull/4524/commits/e6671fbb903da53014bd64e53ad125351add6a04:
- Combined the LockID and reservation attempt UUID into one new object: FateReservation
- Now cover edge case with tryReserve() where a write may make it to the server, but the server dies before a response is received
- New FATE Thread: DeadReservationCleaner which deletes reservations held by Managers that have since died
- Created new classes: ColumnValueMappingIterator and ReservationMappingIterator. ColumnValueMappingIterator abstracts out the common functionality of ReservationMappingIterator and StatusMappingIterator. ReservationMappingIterator is an iterator used for determining if the reservation column for a FateId has a FateReservation set or not
- Expanded/improved/simplified MultipleStoresIT tests
There are also some comments/questions labeled "TODO 4131" that I would like input on.
I also have to resolve the merge conflicts which I will do now. (edit: resolved)
@keith-turner Thanks for the review. I addressed the suggested changes. This PR is ready for re-review
@keith-turner, @DomGarguilo I have addressed all of the review changes, and this PR is ready for re-review. However, some tests are no longer passing (e.g., ComprehensiveIT). I believe this is an issue with my new implementation of createAndReserve. I believe the issue is with the UserFateStore implementation, but could be both. Posting this new commit now to get the other changes out, and to potentially receive feedback on what might be wrong with createAndReserve(). I will continue to look into what the issue might be in the meantime. Please take a look at my new review comments and my new commit message
I have found and addressed the bug in my most recent commit. This PR is now ready for full review. The tests verified to still pass: the sunny day tests, all fate tests (ZooMutatorIT, FateMutatorImplIT, UserFateStoreIT, FateInterleavingIT, FateIT, FateOpsCommandsIT, ComprehensiveFlakyFateIT, DeleteRowsFlakyFateIT, ManagerRepoIT, and MultipleStoresIT (new test added)) except FateStoreIT. FateStoreIT reflects a method that has been removed in this PR, and want to receive feedback about the removal of this method before changes are made to this test. Have a "TODO 4131" marking this so this is not forgotten.
@keith-turner - Addressed all review comments, this PR is ready for re-review.
See new commit bd1f17472c8d289878e19d7121f9fc9347e95596
I also fixed FateStoreIT to work with the new impl of createAndReserve(). A method was deleted in 6760035608387ae537af7050106c9645aac19aae that was no longer used which FateStoreIT relied on. Upon fixing this test, noticed that the original impl of createAndReserve expected an error to be thrown in a certain situation, but in the new implementation, an empty optional was returned instead. Fixed the new impl to match the previous impl for this case. This meant changing UFS.createAndReserve and MFS.createAndReserve.
Also verified existing Fate tests still pass: ZooMutatorIT,FateMutatorImplIT,UserFateStoreIT,FateInterleavingIT,FateIT,FateOpsCommandsIT,ComprehensiveFlakyFateIT,DeleteRowsFlakyFateIT,ManagerRepoIT,MultipleStoresIT,FateStoreIT and verified sunny day tests still pass.
I am finally looking through the test changes right now. Everything else looks good.
@keith-turner I have compiled and wrote all the follow on issues for this to be merged. This covers all the issues I marked in this PR as TODO 4131
and all the unresolved review comments.
Here is a synopsis of the follow on issues I will create (so far - still need some questions answered):
- Refactor MultipleStoresIT to function more similarly to other Fate tests like FateIT
- Add FateKey toString() method
- Move MetaFateStore to org.apache.accumulo.core.fate.zookeeper
- Deprecate AbstractFateStore.createDummyLockID(): this includes creating a path in ZK for utilities to get a ZK lock, changing
admin fate fail
andadmin fate delete
commands to get a LockID at this path and no longer require the Manager to be down, and only use a LockID for a store if write operations are expected: the store should fail on write operations if writes are not expected. - Replace AFS.verifyReserved with a condition on the RESERVATION_COLUMN to verify that it is reserved
- Make WorkFinder and the TransactionRunners critical to the Manager
- Refactor how the RESERVATION_COLUMN works for UserFateStore: create/delete column on reserve/unreserve
- Additional fate test case from https://github.com/apache/accumulo/pull/4524#discussion_r1702301996
Here are some questions I still have:
- This was an existing TODO that you had marked in your "distributed FATE" WIP PR:
// TODO 4131
// TODO make the max time a function of the number of concurrent callers, as the number of
// concurrent callers increases then increase the max wait time
// TODO could support signaling within this instance for known events
// TODO made the maxWait low so this would be responsive... that may put a lot of load in
// the case there are lots of things waiting...
I made the max wait = curr num callers in seconds. Does this address the first TODO? For the other two TODOs, I was not sure what was to be done, so I left them. Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.
-
Re one of your review comments: "Not a change to make in this PR. Thinking if we push all of the Fate data into a single node it would be nice to use JSON. That would make the data stored in ZK more human readable and it would make it easier to serialize and de-serialize." Could you elaborate on what you mean by "all of the Fate data into a single node". By "all" do you mean the Repos, TxInfo, Status, Reservation, and FateKey would be in one ZK node? Writing the issue, I want to be sure I'm understanding correctly.
-
Can https://github.com/apache/accumulo/pull/4524#discussion_r1663064467 be resolved? Or should an issue be made?
-
There was a TODO regarding whether the ZooUtil.LockID was the best type for the lock. Should I make a follow on issue regarding this or is ZooUtil.LockID okay?
-
Can this https://github.com/apache/accumulo/pull/4524#discussion_r1663064467 be resolved or should a follow on issue be made?
-
Can these be resolved: https://github.com/apache/accumulo/pull/4524#discussion_r1670934639, https://github.com/apache/accumulo/pull/4524#discussion_r1695958819
-
The DeadReservationCleaner runs every 30 seconds. Is this okay? Should this be longer? Shorter?
Once the above questions are answered, I can complete my full list of new follow-on issues. This will allow me to remove all the TODO 4131
in this PR and all the comments in this PR can be resolved as all of these will be addressed or have a follow on issue created for them.
I resolved a few of the comments.
- There was a TODO regarding whether the ZooUtil.LockID was the best type for the lock. Should I make a follow on issue regarding this or is ZooUtil.LockID okay?
That type seems like the best existing type to use, so I do not think an issue is needed.
The DeadReservationCleaner runs every 30 seconds. Is this okay? Should this be longer? Shorter?
30 seconds seems too short to me. Would be doing a lot of work and usually never finding anything, should probably be longer maybe 2 or 3 minutes.
Could you elaborate on what you mean by "all of the Fate data into a single node". By "all" do you mean the Repos, TxInfo, Status, Reservation, and FateKey would be in one ZK node? Writing the issue, I want to be sure I'm understanding correctly.
I was thinking of putting all data that is related to a single fate id into a single zk node. This will allow atomic updates. It will be slower (always have to rewrite all repos when adding/removing). We are only using it for the metadata table. So thinking it will make things more correct at the expense of speed but that tradeoff is good for metadata related fates.
Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.
Yeah you can delete those. I will look into it and open an issue if needed.
@kevinrr888 it seems like the only unresolved conversations are covered by your list of issues to open. I just left those open for now.
@keith-turner - The TODO 4131
comments have now been removed and I believe this PR is ready to be merged :tada:
I was planning on creating the follow-on issues after this is merged in as a batch of new issues.
I will post a list of all the outstanding changes to be made below. The trivial ones, I will just make a PR for without creating an issue, the more involved follow-ons will be tracked in an issue.
Here is the list of changes to be made:
To be tracked by issue:
- [x] Refactor MultipleStoresIT to function more similarly to other Fate tests like FateIT
- [x] Deprecate AbstractFateStore.createDummyLockID(): this includes creating a path in ZK for utilities to get a ZK lock, changing
admin fate fail
andadmin fate delete
commands to get a LockID at this path and no longer require the Manager to be down, and only use a LockID for a store if write operations are expected: the store should fail on write operations if writes are not expected. - [x] A single ZK node for all fate data for each fate id
- [x] Replace AFS.verifyReserved with a condition on the RESERVATION_COLUMN to verify that it is reserved
- [x] Make WorkFinder and the TransactionRunners critical to the Manager
- [x] Refactor how the RESERVATION_COLUMN works for UserFateStore: create/delete column on reserve/unreserve
Will do in one or a few small PRs:
- [x] Add a toString() to FateKey
- [x] Move MetaFateStore to org.apache.accumulo.core.fate.zookeeper
- [x] Periodic clean up of dead reservations should be increased from every 30 seconds to every few minutes
- [x] Add additional fate test case suggested in https://github.com/apache/accumulo/pull/4524#discussion_r1702301996
Resolved the merge conflicts with AbstractFateStore
using the new CountDownTimer
.
Also noticed a sync block that should have been removed so removed that.
AbstractFateStore
should probably be looked at again before this is merged.
@kevinrr888 -
This was an existing TODO that you had marked in your "distributed FATE" WIP PR: // TODO 4131 // TODO make the max time a function of the number of concurrent callers, as the number of // concurrent callers increases then increase the max wait time // TODO could support signaling within this instance for known events // TODO made the maxWait low so this would be responsive... that may put a lot of load in // the case there are lots of things waiting... I made the max wait = curr num callers in seconds. Does this address the first TODO? For the other two TODOs, I was not sure what was to be done, so I left them. Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.
@keith-turner -
Yeah you can delete those. I will look into it and open an issue if needed.
This may still need to be looked into just posting this as a reminder just in case