- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Fully Compatible
- ALL
- Execution Team 2020-01-13, Execution Team 2020-01-27
- 17
The sequence is as follows:
On the primary node:
- Start an index build which performs the ready: false write for an index
- The user aborts the index build with killOp on the command thread
- This signals the index build to abort, but does not abort it synchronously
- The node steps down (i.e. cannot accept writes)
- The index build thread cleans up and removes the index entry from the catalog. Because it cannot generate an "abortIndexBuild" oplog entry, it tears down the index build with a ghost timestamp.
- The new primary succeeds in building the index. It commits the index build, replicating the "commitIndexBuild" oplog entry.
- The old primary (now a secondary) applies the oplog entry. Because it already aborted its index build, it fails to look up the build UUID while committing. Unfortunately, we suppress the NoSuchKey error, so the oplog batch fails silently.
- This leaves a state where the old primary is missing an index that is present on the new primary.
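The sequence above can be sketched as a toy simulation. This is not MongoDB server code: the `Node` class, its methods, the `run_scenario` helper, and the `suppress_no_such_key` flag are all hypothetical names invented for illustration. The sketch also assumes the new primary had already started the same build before the stepdown. It only aims to show how swallowing the lookup failure leaves the two catalogs different.

```python
# Toy model only: all names here are hypothetical, not MongoDB server code.

class NoSuchKey(Exception):
    """Stands in for the NoSuchKey error mentioned in the ticket."""


class Node:
    def __init__(self, name):
        self.name = name
        self.catalog = {}        # index name -> {"ready": bool}
        self.active_builds = {}  # build UUID -> index name

    def start_index_build(self, build_uuid, index_name):
        # Models the "ready: false" catalog write.
        self.catalog[index_name] = {"ready": False}
        self.active_builds[build_uuid] = index_name

    def abort_index_build(self, build_uuid):
        # Models local cleanup after stepdown; nothing is replicated.
        index_name = self.active_builds.pop(build_uuid)
        del self.catalog[index_name]

    def commit_index_build(self, build_uuid):
        if build_uuid not in self.active_builds:
            raise NoSuchKey(build_uuid)
        index_name = self.active_builds.pop(build_uuid)
        self.catalog[index_name]["ready"] = True

    def apply_commit_entry(self, build_uuid, suppress_no_such_key):
        try:
            self.commit_index_build(build_uuid)
        except NoSuchKey:
            if not suppress_no_such_key:
                raise
            # Error swallowed: the entry looks applied, but nothing changed.


def run_scenario(suppress_no_such_key=True):
    old_primary, new_primary = Node("old"), Node("new")
    build_uuid, index_name = "build-1", "a_1"

    # Assumption: both nodes had started the build before the stepdown.
    old_primary.start_index_build(build_uuid, index_name)
    new_primary.start_index_build(build_uuid, index_name)

    # killOp followed by stepdown: the old primary cleans up locally only.
    old_primary.abort_index_build(build_uuid)

    # The new primary commits; the old primary applies "commitIndexBuild".
    new_primary.commit_index_build(build_uuid)
    old_primary.apply_commit_entry(build_uuid, suppress_no_such_key)

    return old_primary.catalog, new_primary.catalog
```

With `suppress_no_such_key=True`, the old primary's catalog ends up empty while the new primary's catalog holds the ready index, which mirrors the divergence described. With `suppress_no_such_key=False`, the model raises instead of diverging silently.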
See this patch build.
is related to:
- SERVER-45347 2-phase index build on empty collection(implicit collection creation) can skip rebuilding the index on rollback/startup recovery. (Closed)
- SERVER-45382 indexbg_restart_secondary.js can race processing a commitIndexBuild oplog entry and interrupting an index build due to shutdown (Closed)
- SERVER-45905 abortIndexBuild oplog entry should ignore NamespaceNotFound errors (Closed)
- SERVER-45921 Index builder invariants on this check (indexSpecs.size() > 1) while trying to start building index. (Closed)
- SERVER-45933 2 phase index build running with maxTimeMS can lead to undesirable behavior like server crash. (Closed)