Type: Improvement
Resolution: Unresolved
Priority: Major - P3
None
Component/s: Transactions
None
Needed
Summary
Enable transactions that attempt to commit just as transactionLifetimeLimitSeconds expires to be retried automatically and correctly.
Motivation
It is possible for the commitTransaction command to commit the storage transaction and then be interrupted with a TransactionExceededLifetimeLimitSeconds error (code 290). The commitTransaction command tends to be very quick, so this is unlikely to occur in practice, but it has been observed at least once. The sequence of events would probably be along the following lines:
- Client starts transaction
- Client runs inserts/updates/finds/etc. for 59 seconds
- Client runs commitTransaction
- Server commits the storage transaction
- Server interrupts the commitTransaction operation before it responds back to the client
The TransactionExceededLifetimeLimitSeconds error response would bubble up as an application error. It would be preferable, and completely safe, for the client to retry the commitTransaction command to learn the definitive result of the transaction. Drivers should therefore label the error with UnknownTransactionCommitResult so that withTransaction() automatically retries the commitTransaction command, and possibly retries the entire transaction if the retried commitTransaction command returns an error with the TransientTransactionError label (e.g. NoSuchTransaction).
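The proposed behavior can be sketched roughly as follows. This is a minimal, driver-agnostic illustration in Python, not actual driver code. CommandError, label_commit_error, commit_with_retry, with_transaction, and FakeServer are hypothetical names invented for the sketch, and the fixed attempt counts are a simplification (a real withTransaction() implementation would more likely use a time budget). Only the error code 290 and the two label names come from this ticket.

```python
TRANSACTION_EXCEEDED_LIFETIME_LIMIT_SECONDS = 290  # error code quoted in this ticket


class CommandError(Exception):
    """Hypothetical stand-in for a driver's server error type."""

    def __init__(self, code, labels=()):
        super().__init__(f"command failed with code {code}")
        self.code = code
        self.labels = set(labels)

    def has_error_label(self, label):
        return label in self.labels


def label_commit_error(exc):
    """Proposed change: a 290 on commitTransaction means the result is unknown."""
    if exc.code == TRANSACTION_EXCEEDED_LIFETIME_LIMIT_SECONDS:
        exc.labels.add("UnknownTransactionCommitResult")
    return exc


def commit_with_retry(send_commit, max_attempts=3):
    """Resend commitTransaction while the outcome is unknown (assumed attempt cap)."""
    for attempt in range(max_attempts):
        try:
            return send_commit()
        except CommandError as exc:
            label_commit_error(exc)
            unknown = exc.has_error_label("UnknownTransactionCommitResult")
            if unknown and attempt + 1 < max_attempts:
                continue  # only the commit is resent, never the operations
            raise


def with_transaction(callback, send_commit, max_transaction_attempts=3):
    """Rerun the whole transaction only on a TransientTransactionError."""
    for _ in range(max_transaction_attempts):
        callback()
        try:
            return commit_with_retry(send_commit)
        except CommandError as exc:
            if exc.has_error_label("TransientTransactionError"):
                continue  # e.g. NoSuchTransaction: the commit did not happen
            raise
    raise RuntimeError("transaction did not commit within the attempt budget")


class FakeServer:
    """Simulates the race: the first commit succeeds but is reported as 290."""

    def __init__(self):
        self.committed = False
        self.calls = 0

    def commit(self):
        self.calls += 1
        if not self.committed:
            self.committed = True
            raise CommandError(TRANSACTION_EXCEEDED_LIFETIME_LIMIT_SECONDS)
        return {"ok": 1}
```

Against the simulated server, the transaction body runs once and commitTransaction is sent twice, so the application learns that the commit succeeded without executing its operations a second time.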
Who is the affected end user?
Users of transactions.
How does this affect the end user?
The server's error response is propagated as an application error.
How likely is it that this problem or use case will occur?
Unlikely to occur.
If the problem does occur, what are the consequences and how severe are they?
An application developer may attempt to handle the resulting application error on their own. Retry loops for safely retrying transactions are challenging to get right. An application developer could create "double spending" scenarios in which the entire transaction is incorrectly executed a second time, because the first attempt had already succeeded and the application did not attempt to learn its result.
Is this issue urgent?
Not urgent, because the problem has existed since MongoDB 4.0.
Is this ticket required by a downstream team?
No.
Is this ticket only for tests?
No.
- is related to: SERVER-42821 Improve error message when transaction is killed due to exceeding timeout (Closed)