fix: mongodb multiple error rollback transaction
Description
fixes https://github.com/payloadcms/payload/issues/4133
When one transaction is used where multiple operations error, the result is that rollbackTransaction is called more than once. If this is happening async the second call to rollbackTransaction would hang on await sessions[id].abortTransaction().
This change makes it non-blocking and the function will not error when called multiple times.
PR also hs a type change to in Payload so the database interface allows the return type for rollbackTransaction to not be async.
- [x] I have read and understand the CONTRIBUTING.md document in this repository.
Type of change
- [x] Bug fix (non-breaking change which fixes an issue)
Checklist:
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] Existing test suite passes locally with my changes
- [ ] I have made corresponding changes to the documentation
When one transaction is used where multiple operations error, the result is that rollbackTransaction is called more than once.
This sounds very strange to me. Only one operation should be responsible for rolling back or committing the transaction, right? How is rollbackTransaction called more than once with the same transaction?
After cloning the reproduction repository from #4133 I saw that it was using Payload 2.1.1, which does not include #4164. In that reproduction I am seeing seeing an issue in commitTransaction, called from the docAccess operation (both when I reproduce it locally as well as in the stacktrace in the issue). That is exactly what I fixed in #4164 with this: https://github.com/payloadcms/payload/pull/4164/files#diff-9d83c8ca3813b690c6a9b61f1e4fd57959d364ee5d9530eb1013c9f10d5f85b3L37-R57, this makes sure that all operations nested within docAccess do not start their own transaction (creating a big race over req.transactionID) but instead just run in one big one.
When updating the reproduction repo to Payload 2.2.1, the reported issue goes away.
After all this: What exactly is this now fixing, if not #4133? :D
When one transaction is used where multiple operations error, the result is that rollbackTransaction is called more than once.
This sounds very strange to me. Only one operation should be responsible for rolling back or committing the transaction, right? How is
rollbackTransactioncalled more than once with the same transaction?
See this test case: https://github.com/payloadcms/payload/pull/4245/files#diff-2a01d79fb3010bfe668ca296a4b7e534f28386e38f8fa1fa379dd309ed0f67c1R173
I'm firing off two payload.create calls that will both fail. This will cause rollbackTransaction to be called more than once. This should help if projects have complex hooks or access controls that are failing and the transaction rollback bug makes it hard to see the actual errors.
After cloning the reproduction repository from #4133 I saw that it was using Payload 2.1.1, which does not include #4164. In that reproduction I am seeing seeing an issue in commitTransaction, called from the
docAccessoperation (both when I reproduce it locally as well as in the stacktrace in the issue). That is exactly what I fixed in #4164 with this: #4164 (files), this makes sure that all operations nested withindocAccessdo not start their own transaction (creating a big race overreq.transactionID) but instead just run in one big one.When updating the reproduction repo to Payload 2.2.1, the reported issue goes away.
After all this: What exactly is this now fixing, if not #4133? :D
This is all great. I am glad that your changes addressed #4133.
The other thing this PR does is prevent endSession() errors from causing another throw. I found that DocumentDB specifically does not support it.
https://docs.aws.amazon.com/documentdb/latest/developerguide/mongo-apis.html#mongo-apis-dababase-sessions
I'm firing off two payload.create calls that will both fail. This will cause rollbackTransaction to be called more than once.
I'm sorry to keep going about this, but I would rather we get this correct.
If I understand everything correctly, the test case you linked will never roll back a transaction and if this test passes (with transactions enabled) then we have a bug (or I have a fundamental misunderstanding). Assuming transactions are enabled:
await initTransaction(req)happens, nowreq.transactionIDis set.payload.createis called to createfirst. This sees the existingtransactionIDthus will never commit or roll back (itsinitTransactionwill return false).- The same goes for both
payload.createcalls that go intopromises. - Any errors from those two
createcalls are entirely ignored ("// catch errors and carry on"). At this point nothing should have committed or rolled back the transaction. expect(req.transactionID).toBeFalsy()should always fail, because the transaction is still aliveawait commitTransaction(req)- now we commit
If there are multiple operations, we must either start a transaction beforehand or have them use separate request objects (making them run in separate transactions)
Yup, this is exactly what the test is doing: await initTransaction(req) https://github.com/payloadcms/payload/pull/4245/files#diff-2a01d79fb3010bfe668ca296a4b7e534f28386e38f8fa1fa379dd309ed0f67c1R155
This is how synchronous operations should be handled and now there is a test case for it. Without the changes to rollbackTransaction this test was hanging indefinitely. How else would we handle this?
It would be great to have a useful error message if we can possibly detect this race condition, explaining what went wrong and telling users how to fix it.
I did have logger.warns in earlier releases. Unfortunately they were not helpful and only caused more confusion. If you have some fresh ideas I'd like to hear them.
@diesieben07 are you against merging this PR, do you see it causing other issues?
EDIT: I didn't see your reply above:
If I understand everything correctly, the test case you linked will never roll back a transaction and if this test passes (with transactions enabled) then we have a bug (or I have a fundamental misunderstanding).
This test case does rollback and the test passes. It is encapsulating multiple changes in one transaction. I think you maybe missed that.
Okay... I found both my misunderstanding and the bug. All operations unconditionally call killTransaction without checking shouldCommit. This is not correct. The operation must only touch the transaction if initTransaction returned true.
Consider the scenario from your test case, where multiple operations run concurrently within the same transaction. Neither of them is "in charge" of the transaction, it was started before (like in your test case). Now one of the operations encounters an error and calls killTransaction. However it has no idea if the other operations (that are running in parallel with it!) are done yet or if they are still using the transaction. You might have the situation that killTransaction is completed, but afterwards one of the other (slower) operations still writes things to the database. Those writes will now use a new transaction (because req.transactionID was removed by killTransaction). Now you have the situation where an operation is rolled back partially!
killTransaction must be handled just like commitTransaction. Only who started the transaction must commit or roll it back.
Edit: Sorry for just removing my earlier reply. I had hoped I was quick enough so it wouldn't be seen. I was entirely mistaken in that reply, so I deleted it.