Type: Bug
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Storage
Labels:
None

Operating System:
ALL
Steps To Reproduce:

Base commit: c5cc18dd7484867d82959fc221eeb42efae94255

diff --git a/jstests/noPassthrough/index_killop_after_stepdown.js b/jstests/noPassthrough/index_killop_after_stepdown.js
index 67841202c0..df2a2fcbb0 100644
--- a/jstests/noPassthrough/index_killop_after_stepdown.js
+++ b/jstests/noPassthrough/index_killop_after_stepdown.js
@@ -31,7 +31,7 @@ let res = assert.commandWorked(primary.adminCommand(
 const hangAfterInitFailpointTimesEntered = res.count;
 res = assert.commandWorked(primary.adminCommand(
-    {configureFailPoint: 'hangBeforeIndexBuildAbortOnInterrupt', mode: 'alwaysOn'}));
+    {configureFailPoint: 'hangAfterIndexBuildAbortOnInterrupt', mode: 'alwaysOn'}));
 const hangBeforeAbortFailpointTimesEntered = res.count;
 const createIdx = IndexBuildTest.startIndexBuild(primary, coll.getFullName(), {a: 1});
@@ -57,7 +57,7 @@ try {
     // Wait for the command thread to abort the index build.
     assert.commandWorked(primary.adminCommand({
-        waitForFailPoint: "hangBeforeIndexBuildAbortOnInterrupt",
+        waitForFailPoint: "hangAfterIndexBuildAbortOnInterrupt",
         timesEntered: hangBeforeAbortFailpointTimesEntered + 1,
         maxTimeMS: kDefaultWaitForFailPointTimeout
     }));
diff --git a/src/mongo/db/commands/create_indexes.cpp b/src/mongo/db/commands/create_indexes.cpp
index 6c69b9ddc3..76cadea82d 100644
--- a/src/mongo/db/commands/create_indexes.cpp
+++ b/src/mongo/db/commands/create_indexes.cpp
@@ -78,6 +78,7 @@ MONGO_FAIL_POINT_DEFINE(createIndexesWriteConflict);
 // collection is created.
 MONGO_FAIL_POINT_DEFINE(hangBeforeCreateIndexesCollectionCreate);
 MONGO_FAIL_POINT_DEFINE(hangBeforeIndexBuildAbortOnInterrupt);
+MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildAbortOnInterrupt);
 constexpr auto kIndexesFieldName = "indexes"_sd;
 constexpr auto kCommandName = "createIndexes"_sd;
@@ -1021,6 +1022,7 @@ public:
         }
         return runCreateIndexesWithCoordinator(opCtx, dbname, cmdObj, errmsg, result);
     } catch (const DBException& ex) {
+        hangAfterIndexBuildAbortOnInterrupt.pauseWhileSet();
         // We can only wait for an existing index build to finish if we are able to release
         // our locks, in order to allow the existing index build to proceed. We cannot
         // release locks in transactions, so we bypass the below logic in transactions.
Sprint:
Execution Team 2020-02-10, Execution Team 2020-03-23, Execution Team 2020-04-06, Execution Team 2020-04-20
Linked BF Score:
40
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

When the createIndex thread marks the index build as aborted, it sets the abortTimestamp to a null timestamp. Assume a step down also happens: when the indexBuildCoordinatorThread sees the aborted flag, the stepped-down primary goes into this code block. The index build is then torn down (the index build is unregistered), but the catalog entry is not removed, i.e., the index catalog entry for the aborted index build remains in the catalog table with ready:false. Now assume a secondary had already started the index build before the abort on the primary. If that secondary is elected as the new primary, it can go ahead and commit the index build. On receiving the commitIndexBuild oplog entry, the old primary (after SERVER-44953) will restart the index build and try to initialize it. Because the catalog still has the index entry with ready:false (representing an in-progress/unfinished index build), this invariant check fails, leading to a crash.
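The sequence above can be sketched with a toy model. This is not MongoDB code: the class ToyNode and its methods (startIndexBuild, abortOnStepDown, onCommitIndexBuild) are hypothetical names invented for illustration, and the thrown error merely stands in for the server invariant mentioned in the description.

```javascript
// Toy model of the failure sequence; every name here is hypothetical.
class ToyNode {
    constructor() {
        this.activeBuilds = new Set();  // builds registered with the coordinator
        this.catalog = new Map();       // index name -> {ready: boolean}
    }

    startIndexBuild(name) {
        // Stand-in for the invariant: initializing a build must not find
        // an unfinished (ready:false) catalog entry for the same index.
        const entry = this.catalog.get(name);
        if (entry && !entry.ready) {
            throw new Error(`invariant failure: unfinished index '${name}' already in catalog`);
        }
        this.catalog.set(name, {ready: false});
        this.activeBuilds.add(name);
    }

    abortOnStepDown(name) {
        // The build is unregistered, but the ready:false entry is left behind.
        this.activeBuilds.delete(name);
    }

    onCommitIndexBuild(name) {
        // Modeled on the SERVER-44953 behavior: restart the build if none is active.
        if (!this.activeBuilds.has(name)) {
            this.startIndexBuild(name);
        }
        this.catalog.get(name).ready = true;
        this.activeBuilds.delete(name);
    }
}

function simulate() {
    const oldPrimary = new ToyNode();
    oldPrimary.startIndexBuild("a_1");   // build starts while still primary
    oldPrimary.abortOnStepDown("a_1");   // abort plus step down: torn down, entry kept
    try {
        oldPrimary.onCommitIndexBuild("a_1");  // new primary committed the build
        return "committed";
    } catch (e) {
        return e.message;
    }
}
```

In this model, removing the catalog entry inside abortOnStepDown would let the later commit succeed, which is one way to read what a deterministic abort path has to guarantee.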

duplicates

SERVER-46560 Make Abort index build logic deterministic.

Closed

is related to

SERVER-45933 2 phase index build running with maxTimeMS can lead to undesirable behavior like server crash.

Closed

related to

SERVER-44953 Secondaries should restart index builds when a commitIndexBuild oplog entry is processed but no index build is active

Closed

SERVER-45916 On primary, 2-phase index build cleanup writes an abortIndexBuild oplog entry under a stronger mode user collection lock X which can lead to 3 way deadlock with prepared transactions, step down and index build

Closed

SERVER-46560 Make Abort index build logic deterministic.

Closed

SERVER-46012 Aborting index builders through the IndexBuildsCoordinator does not always abort the index builders

Closed

(1 related to)

Assignee: Gregory Wlodarek
Reporter: Suganthi Mani
Participants: Gregory Wlodarek, Louis Williams, Suganthi Mani
Votes: 0
Watchers: 5

Created: Feb 01 2020 12:16:09 AM UTC
Updated: Apr 10 2020 02:21:48 PM UTC
Resolved: Apr 10 2020 02:21:48 PM UTC
Confidence Status Last Update: 19/Feb/20 2:51 PM
