Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-45933

2 phase index build running with maxTimeMS can lead to undesirable behavior like server crash.

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 4.4.0-rc2, 4.7.0
    • Affects Version/s: None
    • Component/s: Storage
    • None
    • Fully Compatible
    • ALL
    • v4.4
    • Hide

      Base commit: c5cc18dd7484867d82959fc221eeb42efae94255

      diff --git a/jstests/replsets/two_phase_index_build_maxtimems.js b/jstests/replsets/two_phase_index_build_maxtimems.js
      new file mode 100644
      index 0000000000..40382f7d54
      --- /dev/null
      +++ b/jstests/replsets/two_phase_index_build_maxtimems.js
      @@ -0,0 +1,55 @@
      +/*
      + * Test shows that 2 phase index build with maxTimeMs can cause server crash.
      + */
      +load('jstests/libs/parallelTester.js');
      +load("jstests/libs/curop_helpers.js");  // for waitForCurOpByFailPoint().
      +load("jstests/replsets/rslib.js");
      +load('jstests/noPassthrough/libs/index_build.js');
      +
      +(function() {
      +
      +"use strict";
      +
      +const testName = "readOpsDuringStepDown";
      +const dbName = "test";
      +const collName = "coll";
      +
      +var rst = new ReplSetTest({name: testName, nodes: [{}, {rsConfig: {priority: 0}}]});
      +rst.startSet();
      +rst.initiate();
      +
      +const primary = rst.getPrimary();
      +const primaryDB = primary.getDB(dbName);
      +const primaryAdmin = primary.getDB("admin");
      +const primaryColl = primaryDB[collName];
      +const collNss = primaryColl.getFullName();
      +
      +TestData.dbName = dbName;
      +TestData.collName = collName;
      +
      +jsTestLog("Do a document write");
      +assert.commandWorked(
      +        primaryColl.insert({_id: 0, x: 0}, {"writeConcern": {"w": "majority"}}));
      +rst.awaitReplication();
      +
      +// This fail point will delay 4 sec delay in index build. But, we are going to
      +// create index with maxTimeMS set as 1 sec.
      +jsTestLog("Make the  index builder to hang before building index");
      +const hangBeforeBuildingIndexFailpointTimesEntered = assert.commandWorked(primary.adminCommand(
      +    {configureFailPoint: 'hangBeforeBuildingIndex', mode: 'alwaysOn'})).count;
      +
      +
      +const joinFindThread = startParallelShell(() => {
      +    jsTestLog("Create index with maxTimeMS set");
      +    assert.commandWorked(db.getSiblingDB(TestData.dbName).runCommand(
      +        {createIndexes: TestData.collName, indexes: [{name: "x_1", key: {x: 1}}], maxTimeMS: 1 * 1000}));
      +}, primary.port);
      +
      +// Wait for index build to reach the fail point.
      +checkLog.contains(primary, "Bulk builder hung before building index");
      +
      +// Induce state transition.
      +assert.commandWorked(primary.adminCommand({"replSetStepDown": 100, "force": true}));
      +
      +rst.stopSet();
      +})();
      diff --git a/src/mongo/db/index_builds_coordinator.cpp b/src/mongo/db/index_builds_coordinator.cpp
      index 77c667a42d..2024267339 100644
      --- a/src/mongo/db/index_builds_coordinator.cpp
      +++ b/src/mongo/db/index_builds_coordinator.cpp
      @@ -68,6 +68,7 @@ using namespace indexbuildentryhelpers;
       MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildFirstDrain);
       MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildSecondDrain);
       MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildDumpsInsertsFromBulk);
      +MONGO_FAIL_POINT_DEFINE(hangBeforeBuildingIndex);
      
       namespace {
      
      @@ -1531,6 +1532,11 @@ void IndexBuildsCoordinator::_runIndexBuildInner(OperationContext* opCtx,
               // so do it in the try-catch block
               AutoGetDb autoDb(opCtx, replState->dbName, MODE_IX);
      
      +        if (MONGO_unlikely(hangBeforeBuildingIndex.shouldFail())) {
      +            log() << "Bulk builder hung before building index";
      +            mongo::sleepmillis(3 * 1000);
      +        }
      +
               // Do not use AutoGetCollection since the lock will be reacquired in various modes
               // throughout the index build. Lock by UUID to protect against concurrent collection rename.
               boost::optional<Lock::CollectionLock> collLock;
      
      Show
      Base commit: c5cc18dd7484867d82959fc221eeb42efae94255 diff --git a/jstests/replsets/two_phase_index_build_maxtimems.js b/jstests/replsets/two_phase_index_build_maxtimems.js new file mode 100644 index 0000000000..40382f7d54 --- /dev/null +++ b/jstests/replsets/two_phase_index_build_maxtimems.js @@ -0,0 +1,55 @@ +/* + * Test shows that 2 phase index build with maxTimeMs can cause server crash. + */ +load('jstests/libs/parallelTester.js'); +load("jstests/libs/curop_helpers.js"); // for waitForCurOpByFailPoint(). +load("jstests/replsets/rslib.js"); +load('jstests/noPassthrough/libs/index_build.js'); + +(function() { + +"use strict"; + +const testName = "readOpsDuringStepDown"; +const dbName = "test"; +const collName = "coll"; + +var rst = new ReplSetTest({name: testName, nodes: [{}, {rsConfig: {priority: 0}}]}); +rst.startSet(); +rst.initiate(); + +const primary = rst.getPrimary(); +const primaryDB = primary.getDB(dbName); +const primaryAdmin = primary.getDB("admin"); +const primaryColl = primaryDB[collName]; +const collNss = primaryColl.getFullName(); + +TestData.dbName = dbName; +TestData.collName = collName; + +jsTestLog("Do a document write"); +assert.commandWorked( +        primaryColl.insert({_id: 0, x: 0}, {"writeConcern": {"w": "majority"}})); +rst.awaitReplication(); + +// This fail point will delay 4 sec delay in index build. But, we are going to +// create index with maxTimeMS set as 1 sec. +jsTestLog("Make the index builder to hang before building index"); +const hangBeforeBuildingIndexFailpointTimesEntered = assert.commandWorked(primary.adminCommand( + {configureFailPoint: 'hangBeforeBuildingIndex', mode: 'alwaysOn'})).count; + + +const joinFindThread = startParallelShell(() => { + jsTestLog("Create index with maxTimeMS set"); + assert.commandWorked(db.getSiblingDB(TestData.dbName).runCommand( + {createIndexes: TestData.collName, indexes: [{name: "x_1", key: {x: 1}}], maxTimeMS: 1 * 1000})); +}, primary.port); + +// Wait for index build to reach the fail point. +checkLog.contains(primary, "Bulk builder hung before building index"); + +// Induce state transition. +assert.commandWorked(primary.adminCommand({"replSetStepDown": 100, "force": true})); + +rst.stopSet(); +})(); diff --git a/src/mongo/db/index_builds_coordinator.cpp b/src/mongo/db/index_builds_coordinator.cpp index 77c667a42d..2024267339 100644 --- a/src/mongo/db/index_builds_coordinator.cpp +++ b/src/mongo/db/index_builds_coordinator.cpp @@ -68,6 +68,7 @@ using namespace indexbuildentryhelpers; MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildFirstDrain); MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildSecondDrain); MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildDumpsInsertsFromBulk); +MONGO_FAIL_POINT_DEFINE(hangBeforeBuildingIndex); namespace { @@ -1531,6 +1532,11 @@ void IndexBuildsCoordinator::_runIndexBuildInner(OperationContext* opCtx, // so do it in the try-catch block AutoGetDb autoDb(opCtx, replState->dbName, MODE_IX); + if (MONGO_unlikely(hangBeforeBuildingIndex.shouldFail())) { + log() << "Bulk builder hung before building index"; + mongo::sleepmillis(3 * 1000); + } + // Do not use AutoGetCollection since the lock will be reacquired in various modes // throughout the index build. Lock by UUID to protect against concurrent collection rename. boost::optional<Lock::CollectionLock> collLock;
    • Execution Team 2020-04-06, Execution Team 2020-04-20

      Currently, When we run 2 phase index build with maxTimeMS, it can lead to below undesirable behaviors.
      Server crash due to fassert:
      When primary gets createIndex cmd with maxTimeMS set, we start building index with maxTimeMS/deadline set on that indexBuildCoordinator thread opCtx.

      Now, assume, index builder thread tries to acquire a lock whose opCtx deadline got expired. This would result in throwing with ErrorCodes::MaxTimeMSExpired. On error, index builder would try to clean up the index build. Now, assume, the primary stepped down. This means, the clean up code path for secondary will be executed and hitting this fassert. Because, on secondaries it's illegal for index builder thread to throw some error unless it's asked to get aborted by abortIndexBuild oplog entry (exceptional is shutdown).

      Note: repro patch attached for this scenario.

      Server crash due to invariant:
      There is also ways it can lead to issue like SERVER-45921. Think of a case, where the createIndex parent thread opCtx got interrupted due to ErrorCodes::MaxTimeMSExpired. This makes the parent thread to signal index builder thread to abort. Assume, step down happened. This means, we get into clean up code path for secondary for ErrorCodes::IndexBuildAborted which will tear down the index build. As a result, new fail-over (previous secondary) node will take the responsibility of committing the index build that got aborted on previous primary. So, when the old primary get the commitIndexBuildOplog entry, we would try to restart the index build and lead to an invariant failure.

            Assignee:
            benety.goh@mongodb.com Benety Goh
            Reporter:
            suganthi.mani@mongodb.com Suganthi Mani
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: