Renaming collections across databases is not a simple rename, but rather a process of:
- Create a temp collection on the destination database, along with its _id index.
- Build secondary indexes on the temp collection.
- Insert the documents into the temp collection.
- Rename the temp collection to the desired destination.
Copying the index definitions over uses a single MultiIndexBlock. All of the indexes are created with `ready: false` writes in one WUOW, using a single timestamp from a noop oplog entry. However, committing the `ready: true` writes follows this sequence (for illustration, suppose two secondary indexes, A and B):
- Begin WT transaction.
- Set index A to ready.
- Set index B to ready.
- Write oplog entry creating A.
- Set timestamp 1.
- Write oplog entry creating B.
- Set timestamp 2.
- Commit WT transaction.
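In this case, both `ready: true` writes are given timestamp 2. Rolling back to a point in between timestamps 1 and 2 will see both indexes as `ready: false`, but replication recovery will only rebuild index B, since only the oplog entry creating B comes after that point. Index A is left unbuilt.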
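The sequence above can be modeled with a small toy simulation. This is only a sketch, not server code: `ToyTransaction`, `commitTwoIndexes` and `lostIndexes` are hypothetical names, and the model assumes (matching the outcome described in this ticket) that writes made before any timestamp is set end up stamped with the last timestamp set before commit.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

using Timestamp = std::uint64_t;

struct TimestampedIndex {
    std::string index;
    Timestamp ts;
};

// Durable result of the toy transaction.
struct CommittedState {
    std::vector<TimestampedIndex> readyWrites;  // catalog `ready: true` writes
    std::vector<TimestampedIndex> oplog;        // "create index" oplog entries
};

// Hypothetical stand-in for a storage transaction. ASSUMPTION: catalog writes
// made before any timestamp is set are stamped with the last timestamp that
// was set before commit.
class ToyTransaction {
public:
    void setIndexReady(const std::string& index) { _pendingReady.push_back(index); }
    void writeOplogEntry(const std::string& index, Timestamp ts) { _oplog.push_back({index, ts}); }
    void setTimestamp(Timestamp ts) { _lastTs = ts; }

    CommittedState commit() const {
        CommittedState out;
        for (const auto& index : _pendingReady) {
            out.readyWrites.push_back({index, _lastTs});
        }
        out.oplog = _oplog;
        return out;
    }

private:
    std::vector<std::string> _pendingReady;
    std::vector<TimestampedIndex> _oplog;
    Timestamp _lastTs = 0;
};

// Replays the commit sequence from this ticket for indexes A and B.
CommittedState commitTwoIndexes() {
    ToyTransaction txn;
    txn.setIndexReady("A");
    txn.setIndexReady("B");
    txn.writeOplogEntry("A", 1);
    txn.setTimestamp(1);
    txn.writeOplogEntry("B", 2);
    txn.setTimestamp(2);
    return txn.commit();
}

// Indexes that are `ready: false` at `stableTs` and are NOT rebuilt by
// replaying oplog entries newer than `stableTs`.
std::set<std::string> lostIndexes(const CommittedState& state, Timestamp stableTs) {
    std::set<std::string> lost;
    for (const auto& write : state.readyWrites) {
        if (write.ts > stableTs) {
            lost.insert(write.index);  // the ready write is rolled back
        }
    }
    for (const auto& entry : state.oplog) {
        if (entry.ts > stableTs) {
            lost.erase(entry.index);  // recovery replays this entry
        }
    }
    return lost;
}
```

Under these assumptions, `lostIndexes(commitTwoIndexes(), 1)` contains only `A`, while rolling back to timestamp 0 or 2 loses nothing. The gap exists only between the two oplog timestamps.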
This is an analogous bug to SERVER-35070.
This ticket should consider adding the following invariant right before here:
invariant(_indexes.size() == 1 || onCreateFn);
Related to:
- SERVER-38745 MigrationDestinationManager assigns incorrect timestamps while building multiple indexes (Closed)