- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Index Maintenance, Storage
- Fully Compatible
- Storage NYC 2019-03-11
Original text:
Test setup:
- Multi-shard cluster
- The collection is sharded on _id: hashed
The Evergreen validateCollection hook fails semi-frequently after running the background_index_multikey.js test. Initial investigation suggests that one of the indexes on the secondary is missing a key entry, while the primary appears to pass validation. Note: the test is currently blacklisted.
Because transactions yield their locks on secondaries (SERVER-37199), a concurrent hybrid background index build can conflict in a way that leads to lost writes into building indexes (i.e. corruption) on secondaries.
In this example, a background index build on {a: 1} runs concurrently with a transactional insert of a document {a: 0}, and both are being applied on a secondary.
- The background index build converts its X lock to an IX lock while collection scanning. It creates a temporary side-writes table to accept all index key insertions during the build.
- A document {a: 0} is inserted in a transaction and prepared. The key for a: 0 is inserted into the side-writes table as part of the same transaction. When applied on a secondary, the prepared transaction yields its IX locks.
- The background index build takes an X lock, uncontested, and drains the side-writes table. Because the insert into the side-writes table was part of a prepared but uncommitted transaction, it is invisible to the index builder. The table is then dropped on completion.
- The transaction is finally committed, but its side-write is committed to a now-deleted table. The inserted key is now lost forever and the resulting index is corrupted.
On a primary, our locks prevent this from happening. On a secondary, however, an index build can complete while a prepared transaction is still active, so we can lose writes into building indexes.
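The interleaving described above can be sketched as a toy simulation. This is only an illustration of the ordering problem, not server code: `SideWritesTable`, `PreparedTxn`, `drain_and_drop`, and `run` are hypothetical names, locks are modeled purely by the order in which the steps execute, and the index is a plain list of keys.

```python
from dataclasses import dataclass, field


@dataclass
class SideWritesTable:
    """Toy stand-in for the temporary side-writes table of an index build."""
    committed: list = field(default_factory=list)
    dropped: bool = False


@dataclass
class PreparedTxn:
    """Toy prepared transaction whose side-writes are invisible until commit."""
    pending: list = field(default_factory=list)

    def commit(self, table):
        # Returns the keys that are lost because the table no longer exists.
        lost = []
        for key in self.pending:
            if table.dropped:
                lost.append(key)
            else:
                table.committed.append(key)
        self.pending = []
        return lost


def drain_and_drop(table, index):
    # The index builder sees only committed side-writes, then drops the table.
    index.extend(table.committed)
    table.committed = []
    table.dropped = True


def run(secondary_yields_locks):
    index = []        # keys present in the finished index on {a: 1}
    collection = []   # committed documents
    table = SideWritesTable()
    txn = PreparedTxn()

    # Insert {a: 0} in a transaction and prepare it.
    doc = {"a": 0}
    txn.pending.append(doc["a"])

    if secondary_yields_locks:
        # The builder's X lock is uncontested: it drains and drops first.
        drain_and_drop(table, index)
        lost = txn.commit(table)
        collection.append(doc)
    else:
        # Primary-like ordering: the builder waits behind the transaction.
        lost = txn.commit(table)
        collection.append(doc)
        drain_and_drop(table, index)

    # Keys that a validation pass would report as missing from the index.
    missing = [d["a"] for d in collection if d["a"] not in index]
    return index, lost, missing
```

Calling `run(True)` models the secondary, where the build finishes before the commit and the key for a: 0 never reaches the index. Calling `run(False)` models the primary-like ordering, where the commit lands in the side-writes table before the drain.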
edit: louis.williams
- is caused by
  - SERVER-37199 Yield locks of transactions in secondary application (Closed)
- is related to
  - SERVER-38550 Mobile storage engine should support dupsAllowed mode with bulk builders (Closed)
- related to
  - SERVER-37336 Test that background index build do not block on prepared transactions on secondaries (Closed)
  - SERVER-39372 Make secondary lock acquisition for DDL operations consistent with behavior on primary (Closed)
  - SERVER-40723 Deadlock between S lock acquisition on secondary and prepare conflict (Closed)
  - SERVER-38540 Unblacklist multi index tests in multi_shard_multi_stmt passthrough suite (Closed)
  - SERVER-40041 block prepared transactions behind index builds during initial sync (Closed)
  - SERVER-43638 Do not block prepared transactions on two-phase index builds on secondaries (Closed)