-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Replication
-
ALL
-
147
In the vectored insert code path,
1) We first attempt to insert the documents as a batch.
2) If that fails, for example due to a WriteConflictException (WCE), we insert the documents in the batch one at a time.
3) If a single-document insert also fails due to a WCE, we retry that insertion using the writeConflictRetry loop.
If an FCV upgrade happens between these write attempts (between 1 and 2, or between 2 and 3), it can result in a fatal error: committing a write with a timestamp older than the stable timestamp.
[j1:s0:prim] | 2024-03-27T12:21:55.232+01:00 E WT 22435 [S] [conn440] "WiredTiger error message","attr":{"error":22,"message":{"ts_sec":1711538515,"ts_usec":231061,"thread":"5756:140729299129264","session_name":"WT_SESSION.timestamp_transaction_uint","category":"WT_VERB_DEFAULT","category_id":12,"verbose_level":"ERROR","verbose_level_id":-3,"msg":"int __cdecl __wt_txn_validate_commit_timestamp(struct __wt_session_impl *,unsigned __int64 *):566:commit timestamp (1711538514, 10) must be after the stable timestamp (1711538514, 70)","error_str":"Invalid argument","error_code":22}}
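The three-step fallback described above can be sketched roughly as follows. This is a simplified Python model, not the server's C++ code: `insert_vectored`, `write_conflict_retry`, `insert_batch`, `insert_one`, and the retry cap are hypothetical names and choices made for illustration only.

```python
class WriteConflictException(Exception):
    """Stand-in for the server's WCE (hypothetical, for this sketch only)."""


def write_conflict_retry(op, max_attempts=10):
    """Re-run op until it stops raising WriteConflictException.

    The attempt cap is an assumption that keeps the sketch finite.
    """
    for _ in range(max_attempts):
        try:
            return op()
        except WriteConflictException:
            continue
    raise RuntimeError("gave up after repeated write conflicts")


def insert_vectored(docs, insert_batch, insert_one):
    """Model of the fallback: batch, then one at a time, then a retry loop."""
    try:
        insert_batch(docs)  # step 1: try the whole batch
        return "batch"
    except WriteConflictException:
        pass
    for doc in docs:
        try:
            insert_one(doc)  # step 2: one document at a time
        except WriteConflictException:
            # step 3: retry only the document that conflicted
            write_conflict_retry(lambda d=doc: insert_one(d))
    return "one-at-a-time"
```

Each transition between steps is a separate write attempt, so any state carried on the statements from an earlier attempt (such as a previously assigned timestamp) is reused unless it is explicitly reset.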
The replicateVectoredInsertsTransactionally feature flag is enabled in 8.0 (SERVER-77881). Now consider the following scenario:
1) The node is running the 8.0 binary with FCV 7.0, meaning the replicateVectoredInsertsTransactionally feature flag is disabled.
2) A user attempts a bulk insert of 3 documents [{_id:1}, {_id:2}, {_id:3}].
3) Initially, the server tries to write them as a batch. Since replicateVectoredInsertsTransactionally is disabled, we allocate oplog slots and update each statement in the InsertStatement vector to include its oplog slot's timestamp. In this case, 3 oplog slots are allocated and the InsertStatements look like [{doc:{_id:1}, oplogSlot:TS(10)}, {doc:{_id:2}, oplogSlot:TS(20)}, {doc:{_id:3}, oplogSlot:TS(30)}].
4) However, the batched insert fails with a WriteConflictException (WCE). The server then attempts to insert the batch one document at a time, using the timestamps of the now-closed oplog slots. (Note: when step 3 fails, the associated WUOW is aborted, which causes the oplog slots TS(10), TS(20) and TS(30) to close.)
5) Meanwhile, the FCV is upgraded to 8.0 at TS(40), meaning the replicateVectoredInsertsTransactionally feature flag is now enabled.
6) Then the stable timestamp advances to TS(40).
7) Now, with replicateVectoredInsertsTransactionally enabled, oplog slots are not reallocated, so the server tries to insert the InsertStatement {doc:{_id:1}, oplogSlot:TS(10)}. This violates the WiredTiger rule that the commit timestamp must be after the stable timestamp, since commit timestamp TS(10) < stable timestamp TS(40).
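The scenario above can be reproduced in a toy model. The sketch below is illustrative Python, not server code: `ToyNode`, `InsertStatement`, `run_scenario`, and the integer timestamps are all assumptions. The `clear_slots_on_failure` switch models the idea in the title of SERVER-88690 (clearing assigned optimes when the batch fails); the actual fix may be implemented differently.

```python
from dataclasses import dataclass
from typing import Optional


class StableTimestampViolation(Exception):
    """Models WiredTiger rejecting a commit ts at or before the stable ts."""


@dataclass
class InsertStatement:
    doc: dict
    oplog_slot: Optional[int] = None  # timestamps modelled as plain ints


class ToyNode:
    def __init__(self):
        self.flag_enabled = False  # FCV 7.0: feature flag disabled
        self.clock = 0
        self.stable_ts = 0

    def next_ts(self):
        self.clock += 10
        return self.clock

    def prepare(self, stmts):
        # Flag disabled: stamp every statement with a fresh oplog slot.
        # Flag enabled: statements are left untouched (no reallocation).
        if not self.flag_enabled:
            for s in stmts:
                s.oplog_slot = self.next_ts()

    def commit(self, stmt):
        # Assumption: a statement without a slot commits at a fresh timestamp.
        ts = stmt.oplog_slot if stmt.oplog_slot is not None else self.next_ts()
        if ts <= self.stable_ts:
            raise StableTimestampViolation(
                f"commit timestamp {ts} must be after stable timestamp {self.stable_ts}"
            )
        return ts


def run_scenario(clear_slots_on_failure):
    node = ToyNode()
    stmts = [InsertStatement({"_id": i}) for i in (1, 2, 3)]
    node.prepare(stmts)                # step 3: slots TS(10), TS(20), TS(30)
    if clear_slots_on_failure:         # step 4: the batch fails with a WCE
        for s in stmts:
            s.oplog_slot = None        # drop the stale slots
    fcv_ts = node.next_ts()            # step 5: FCV upgrade at TS(40)
    node.flag_enabled = True
    node.stable_ts = fcv_ts            # step 6: stable ts advances to TS(40)
    node.prepare(stmts[:1])            # step 7: flag enabled, nothing restamped
    return node.commit(stmts[0])
```

In this model, `run_scenario(False)` raises `StableTimestampViolation` because the first statement still carries TS(10), while `run_scenario(True)` commits at a timestamp later than the stable timestamp.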
- duplicates
-
SERVER-88690 Clear assigned optimes when an insert batch fails
- Closed