Modified FATE to allow transactions to run to completion without interruption

Open dlmarion opened this issue 2 years ago • 8 comments

FATE processes transactions in an interleaved piece-wise manner. In the case of a multi-step transaction, FATE could potentially execute a step of another transaction before executing the next step of the current transaction. This change enables the developer to indicate via a new method on the Repo interface that the transaction type should be run start to finish without being interrupted.
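
To make the proposal concrete, here is a minimal single-worker sketch in Java. It is not the code from this PR: `Step`, `runToCompletion()`, `NamedStep`, and `run` are hypothetical names, and `Step` is a simplified stand-in rather than Accumulo's actual `Repo` interface. It only illustrates the scheduling difference: an unflagged transaction goes back on the queue after each step, while a flagged one keeps the worker until it finishes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RunToCompletionSketch {

  /** Hypothetical, simplified stand-in for one step of a FATE transaction. */
  interface Step {
    /** Runs this step and returns the next step, or null when the transaction is done. */
    Step call(List<String> log);

    /** Hypothetical flag: true asks the worker not to interleave other transactions. */
    default boolean runToCompletion() {
      return false;
    }
  }

  /** A step that records "name#index" and hands off to the following step. */
  static final class NamedStep implements Step {
    final String name;
    final int index;
    final int total;
    final boolean uninterrupted;

    NamedStep(String name, int index, int total, boolean uninterrupted) {
      this.name = name;
      this.index = index;
      this.total = total;
      this.uninterrupted = uninterrupted;
    }

    @Override
    public Step call(List<String> log) {
      log.add(name + "#" + index);
      return index + 1 < total ? new NamedStep(name, index + 1, total, uninterrupted) : null;
    }

    @Override
    public boolean runToCompletion() {
      return uninterrupted;
    }
  }

  /** Simulates a single worker thread draining a queue of transactions. */
  static List<String> run(List<? extends Step> submitted) {
    Deque<Step> queue = new ArrayDeque<>(submitted);
    List<String> log = new ArrayList<>();
    while (!queue.isEmpty()) {
      Step next = queue.poll().call(log);
      // A flagged transaction keeps the worker until it has no steps left.
      while (next != null && next.runToCompletion()) {
        next = next.call(log);
      }
      // An unflagged transaction goes to the back of the queue, so others can interleave.
      if (next != null) {
        queue.add(next);
      }
    }
    return log;
  }

  public static void main(String[] args) {
    List<String> log = run(List.of(
        new NamedStep("A", 0, 2, false),
        new NamedStep("B", 0, 2, true)));
    System.out.println(log); // prints [A#0, B#0, B#1, A#1]
  }
}
```

With both transactions unflagged, the same input yields the interleaved order `A#0, B#0, A#1, B#1`.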

dlmarion avatar Oct 16 '23 19:10 dlmarion

This is draft because I want to get early feedback on approach. I'm going to add some tests for some failure conditions.

dlmarion avatar Oct 16 '23 19:10 dlmarion

This approach will tie up a FATE thread. Another possible approach is to have ZooStore.reserve() prefer transactions that are in an IN_PROGRESS state over those that are SUBMITTED (those have never run). However, it seems like scanning for in-progress work could do a lot more ZK reads.
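
A minimal sketch of the alternative described above, in Java. It does not use the real ZooStore API: `Status`, `pickNext`, and the in-memory map are hypothetical stand-ins for the transaction states that would actually be read from ZooKeeper. The loop also shows where the extra reads would come from, since every unreserved transaction's status may need to be inspected before falling back to a SUBMITTED one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

class PreferInProgressSketch {

  /** Hypothetical, reduced set of transaction states for this sketch. */
  enum Status { SUBMITTED, IN_PROGRESS }

  /**
   * Picks the first IN_PROGRESS transaction id if one exists, otherwise the
   * first SUBMITTED one. Each map lookup stands in for a status read.
   */
  static OptionalLong pickNext(Map<Long, Status> unreserved) {
    OptionalLong firstSubmitted = OptionalLong.empty();
    for (Map.Entry<Long, Status> entry : unreserved.entrySet()) {
      if (entry.getValue() == Status.IN_PROGRESS) {
        return OptionalLong.of(entry.getKey());
      }
      if (!firstSubmitted.isPresent()) {
        firstSubmitted = OptionalLong.of(entry.getKey());
      }
    }
    return firstSubmitted;
  }

  public static void main(String[] args) {
    Map<Long, Status> txs = new LinkedHashMap<>();
    txs.put(1L, Status.SUBMITTED);
    txs.put(2L, Status.IN_PROGRESS);
    txs.put(3L, Status.SUBMITTED);
    System.out.println(pickNext(txs).getAsLong()); // prints 2
  }
}
```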

keith-turner avatar Oct 16 '23 19:10 keith-turner

This approach will tie up a FATE thread.

It will, but only for those transactions that are marked as such.

Another possible approach is to have ZooStore.reserve() prefer transactions that are in an IN_PROGRESS state over those that are SUBMITTED (those have never run)

That would allow transactions to still interleave. When discussing this, I thought you had given me the example of a split task, that you wanted to finish as fast as possible.

dlmarion avatar Oct 16 '23 20:10 dlmarion

That would allow transactions to still interleave. When discussing this, I thought you had given me the example of a split task, that you wanted to finish as fast as possible.

That is not exactly what I was thinking. Was thinking of prioritizing FATEs that have started running over those that have never run when choosing what to run next.

I like the approach in this PR, except that it ties up a thread and could lead to all FATE threads sleeping in the worst case.

keith-turner avatar Oct 16 '23 20:10 keith-turner

That is not exactly what I was thinking. Was thinking of prioritizing FATEs that have started running over those that have never run when choosing what to run next.

Ah, ok, I didn't get that from our conversation.

I like the approach in this PR, except that it ties up a thread and could lead to all FATE threads sleeping in the worst case.

Yeah, I'm curious how many FATE ops are going to take advantage of this. I'm assuming SPLIT and MERGE as they require the tablet to be un-hosted, so you want them to run as fast as possible so that the tablets can be re-hosted. It would be trivial to put in an escape path, say if we have received a non-zero response from isReady N times in a row.
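
A rough sketch of such an escape path, in Java. Everything here is an assumption for illustration: `waitUntilReady` and `maxNotReady` are hypothetical names, and the `LongSupplier` stands in for a call to `isReady`, assumed to return 0 when the step can run and otherwise a delay in milliseconds. After `maxNotReady` consecutive non-zero responses the method gives up, so the caller could release the thread and let the transaction interleave again.

```java
import java.util.function.LongSupplier;

class EscapePathSketch {

  /**
   * Polls isReady until it returns 0 (ready) or until it has returned a
   * non-zero delay maxNotReady times in a row.
   *
   * @return true if the step became ready, false if the caller should stop
   *         holding the thread for this transaction
   */
  static boolean waitUntilReady(LongSupplier isReady, int maxNotReady) {
    int notReady = 0;
    while (notReady < maxNotReady) {
      long delayMillis = isReady.getAsLong();
      if (delayMillis == 0) {
        return true;
      }
      notReady++;
      try {
        Thread.sleep(delayMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and stop waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Not ready for the first two polls, ready on the third.
    boolean ready = waitUntilReady(() -> calls[0]++ < 2 ? 1L : 0L, 5);
    System.out.println(ready); // prints true
  }
}
```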

dlmarion avatar Oct 16 '23 20:10 dlmarion

@dlmarion You requested my review on this, but I feel like it's a bit outside my area of expertise. I'd be interested in a discussion sometime, to get a high level understanding of it, though, along with what's motivating it. Right now, I don't have any opinions on the approach one way or another.

ctubbsii avatar Oct 20 '23 09:10 ctubbsii

@dlmarion You requested my review on this, but I feel like it's a bit outside my area of expertise. I'd be interested in a discussion sometime, to get a high level understanding of it, though, along with what's motivating it. Right now, I don't have any opinions on the approach one way or another.

@ctubbsii - I wasn't sure if you were familiar with the FATE code or not. I figured you may also have some input on the behavior change.

dlmarion avatar Oct 23 '23 12:10 dlmarion

@dlmarion I might have some thoughts, but I think I'd need it explained to me at an elementary level first. Mostly, though, I'm just curious to learn a bit more about where you're headed with this. I'm thinking maybe we can post another video meet in Slack some time, and discuss it with anybody interested in joining.

ctubbsii avatar Nov 21 '23 22:11 ctubbsii

Superseded by #4589

dlmarion avatar May 24 '24 12:05 dlmarion