- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 4.1.1
- Component/s: Replication
- Fully Compatible
- ALL
- Repl 2018-08-13, Repl 2018-08-27
- 28
When we run commitTransaction, we call the Session::commitUnpreparedTransaction method (or the analogous method for prepared transactions). Inside this method, we first transition to state "kCommittingWithoutPrepare" and then trigger the onTransactionCommit OpObserver. That OpObserver call is where we do the write that updates the config.transactions table. If the onTransactionCommit method throws an exception, the commitTransaction command fails, and the transaction is left in state "kCommittingWithoutPrepare".
When a command running inside a transaction throws an exception, we trigger a cleanup block that aborts the transaction if necessary. In the case described, we would call Session::abortActiveTransaction while the transaction is still in state "kCommittingWithoutPrepare". Since that is not one of the expected states passed to _abortActiveTransaction, we do not execute the _abortTransactionOnSession method, which is what actually updates the transaction's metadata to indicate that it is aborted. We do, however, clean up the transaction resources that live on the OperationContext. So, even though we called abortActiveTransaction, we never actually transitioned to the "kAborted" state.
The issue can then persist, because the transaction has been left in the "kCommittingWithoutPrepare" state. For example, if we try to run another commit command, we get an error because the transaction is no longer marked as in-progress. The same error is thrown if we try to run abort. One way to get the transaction out of this "limbo" state is to start a new transaction with a higher transaction number on the same session. This clears out the old transaction state, but it still does not trigger an actual call to _abortTransactionOnSession for the previous transaction. When we start a new transaction while an older one exists on the session, we only abort the old transaction if it is in progress, which is not the case here. This means we would start the new transaction without ever explicitly calling the abort method internally.
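The sequence described above can be reproduced with a small toy model. This is only a sketch: ToySession, TxnState, and reproduceLimbo are hypothetical names invented for illustration, not the server's real classes, and the cleanup of OperationContext resources is left out.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical, simplified stand-ins for the transaction states named in the ticket.
enum class TxnState { kNone, kInProgress, kCommittingWithoutPrepare, kCommitted, kAborted };

// Toy session used only for illustration; not the real Session class.
class ToySession {
public:
    void beginTransaction(long long txnNumber) {
        // An older transaction is explicitly aborted only if it is in progress.
        if (_state == TxnState::kInProgress) {
            abortTransactionOnSession();
        }
        _txnNumber = txnNumber;
        _state = TxnState::kInProgress;
    }

    // Mirrors the described order: change state first, then run the observer.
    void commitUnpreparedTransaction(const std::function<void()>& onTransactionCommit) {
        if (_state != TxnState::kInProgress) {
            throw std::logic_error("transaction is not in progress");
        }
        _state = TxnState::kCommittingWithoutPrepare;
        onTransactionCommit();  // if this throws, the state stays as set above
        _state = TxnState::kCommitted;
    }

    // Aborts only from the expected state; otherwise it is a silent no-op.
    void abortActiveTransaction() {
        if (_state == TxnState::kInProgress) {
            abortTransactionOnSession();
        }
    }

    TxnState state() const { return _state; }
    long long txnNumber() const { return _txnNumber; }
    int abortCount() const { return _abortCount; }

private:
    // Stand-in for the step that records the transaction as aborted.
    void abortTransactionOnSession() {
        _state = TxnState::kAborted;
        ++_abortCount;
    }

    TxnState _state = TxnState::kNone;
    long long _txnNumber = -1;
    int _abortCount = 0;
};

// Drives the failure sequence and returns the session for inspection.
ToySession reproduceLimbo() {
    ToySession session;
    session.beginTransaction(1);
    assert(session.state() == TxnState::kInProgress);
    try {
        session.commitUnpreparedTransaction([] { throw std::runtime_error("write failed"); });
    } catch (const std::runtime_error&) {
        session.abortActiveTransaction();  // the command-failure cleanup path
    }
    return session;
}
```

In this model, the session returned by reproduceLimbo() is still in kCommittingWithoutPrepare with an abort count of zero, a second commit attempt throws, and calling beginTransaction(2) moves it back to kInProgress without the abort step ever running.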
To fix this, we should probably make sure that we explicitly abort the transaction right away if an exception is thrown inside the OpObserver.
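One possible shape for such a fix, sketched on a toy model rather than the real code: catch any exception from the observer inside the commit path, abort explicitly, and rethrow. FixedToySession and its members are hypothetical names, and the actual server change may look different.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Toy session used only for illustration; not the real Session class.
class FixedToySession {
public:
    // Hypothetical, simplified stand-ins for the transaction states.
    enum class State { kInProgress, kCommittingWithoutPrepare, kCommitted, kAborted };

    void commitUnpreparedTransaction(const std::function<void()>& onTransactionCommit) {
        if (_state != State::kInProgress) {
            throw std::logic_error("transaction is not in progress");
        }
        _state = State::kCommittingWithoutPrepare;
        try {
            onTransactionCommit();
        } catch (...) {
            // Abort right away so the state never lingers in kCommittingWithoutPrepare.
            abortTransactionOnSession();
            throw;  // the caller still sees the original failure
        }
        _state = State::kCommitted;
    }

    State state() const { return _state; }
    int abortCount() const { return _abortCount; }

private:
    // Stand-in for the step that records the transaction as aborted.
    void abortTransactionOnSession() {
        _state = State::kAborted;
        ++_abortCount;
    }

    State _state = State::kInProgress;
    int _abortCount = 0;
};

// Runs a commit whose observer fails and reports the resulting state.
FixedToySession::State commitWithFailingObserver() {
    FixedToySession session;
    try {
        session.commitUnpreparedTransaction([] { throw std::runtime_error("write failed"); });
    } catch (const std::runtime_error&) {
        // The command fails, but the session has already been aborted.
    }
    assert(session.abortCount() == 1);
    return session.state();
}
```

With this arrangement the failed commit ends in kAborted and the abort step runs exactly once, while a commit whose observer succeeds still ends in kCommitted.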
- is depended on by
- SERVER-36295 Transaction metrics not updated on TransientTransactionError - Closed