Core Server / SERVER-38855

Noop write for afterClusterTime/atClusterTime does not occur on config servers

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.1.9
    • Affects Version/s: 3.6.9, 4.0.5, 4.1.6
    • Component/s: Sharding
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.0, v3.6
    • Steps To Reproduce:

      Apply the patch below. It turns off all the background threads that would otherwise write to the config server (the periodic noop writer already defaults to off for mongods started through resmoke.py in the noPassthrough suite) and modifies a jstest that verifies noop writes are performed for snapshot readConcern against a shard so that it tests against a config server instead:

      diff --git a/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js b/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js
      index b74832e..1ba4ec4 100644
      --- a/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js
      +++ b/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js
      @@ -44,14 +44,14 @@
       
           // Propagate 'clusterTime' to shard 1. This ensures that its next write will be at time >=
           // 'clusterTime'.
      -    testDB1.coll1.find().itcount();
      +    st.s.getDB("config").coll1.find().itcount();
       
           // Attempt a snapshot read at 'clusterTime' on shard 1. Test that it performs a noop write to
           // advance its lastApplied optime past 'clusterTime'. The snapshot read itself may fail if the
           // noop write advances the node's majority commit point past 'clusterTime' and it releases that
           // snapshot.  Test reading from the primary.
           const shard1Session =
      -        st.rs1.getPrimary().getDB("test1").getMongo().startSession({causalConsistency: false});
      +        st.configRS.getPrimary().getDB("test1").getMongo().startSession({causalConsistency: false});
           shard1Session.startTransaction({readConcern: {level: "snapshot", atClusterTime: clusterTime}});
           res = shard1Session.getDatabase("test1").runCommand({find: "coll1"});
           if (res.ok === 0) {
      diff --git a/src/mongo/db/read_concern_mongod.cpp b/src/mongo/db/read_concern_mongod.cpp
      index 0967968..f45a835 100644
      --- a/src/mongo/db/read_concern_mongod.cpp
      +++ b/src/mongo/db/read_concern_mongod.cpp
      @@ -134,6 +134,8 @@ Status makeNoopWriteIfNeeded(OperationContext* opCtx, LogicalTime clusterTime) {
               auto shardingState = ShardingState::get(opCtx);
               // standalone replica set, so there is no need to advance the OpLog on the primary.
               if (!shardingState->enabled()) {
      +            log() << "xxx not executing noop write to advance last applied op time because "
      +                     "sharding state not enabled";
                   return Status::OK();
               }
       
      diff --git a/src/mongo/db/repl/replication_coordinator_impl.cpp b/src/mongo/db/repl/replication_coordinator_impl.cpp
      index c66445f..f14fe0b 100644
      --- a/src/mongo/db/repl/replication_coordinator_impl.cpp
      +++ b/src/mongo/db/repl/replication_coordinator_impl.cpp
      @@ -1313,6 +1313,7 @@ Status ReplicationCoordinatorImpl::waitUntilOpTimeForReadUntil(OperationContext*
           }
       
           if (readConcern.getArgsAfterClusterTime() || readConcern.getArgsAtClusterTime()) {
      +        log() << "xxx waiting until cluster time for read";
               return _waitUntilClusterTimeForRead(opCtx, readConcern, deadline);
           } else {
               return _waitUntilOpTimeForReadDeprecated(opCtx, readConcern);
      diff --git a/src/mongo/s/catalog/replset_dist_lock_manager.cpp b/src/mongo/s/catalog/replset_dist_lock_manager.cpp
      index 3a960b6..b708b94 100644
      --- a/src/mongo/s/catalog/replset_dist_lock_manager.cpp
      +++ b/src/mongo/s/catalog/replset_dist_lock_manager.cpp
      @@ -87,9 +87,9 @@ ReplSetDistLockManager::ReplSetDistLockManager(ServiceContext* globalContext,
       ReplSetDistLockManager::~ReplSetDistLockManager() = default;
       
       void ReplSetDistLockManager::startUp() {
      -    if (!_execThread) {
      -        _execThread = stdx::make_unique<stdx::thread>(&ReplSetDistLockManager::doTask, this);
      -    }
      +    //    if (!_execThread) {
      +    //        _execThread = stdx::make_unique<stdx::thread>(&ReplSetDistLockManager::doTask, this);
      +    //    }
       }
       
       void ReplSetDistLockManager::shutDown(OperationContext* opCtx) {
      diff --git a/src/mongo/s/server.cpp b/src/mongo/s/server.cpp
      index 5558bed..3789aa1 100644
      --- a/src/mongo/s/server.cpp
      +++ b/src/mongo/s/server.cpp
      @@ -420,8 +420,8 @@ ExitCode runMongosServer(ServiceContext* serviceContext) {
       
           // Construct the sharding uptime reporter after the startup parameters have been parsed in order
           // to ensure that it picks up the server port instead of reporting the default value.
      -    shardingUptimeReporter.emplace();
      -    shardingUptimeReporter->startPeriodicThread();
      +    // shardingUptimeReporter.emplace();
      +    // shardingUptimeReporter->startPeriodicThread();
       
           clusterCursorCleanupJob.go();
       
      

      You will see the test hang after logging "xxx not executing noop write to advance last applied op time because sharding state not enabled", e.g.:

      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.697-0500 c15525| 2019-01-04T17:31:13.696-0500 I SHARDING [conn73] Marking collection config.coll1 as collection version: <unsharded>
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.698-0500 s15528| 2019-01-04T17:31:13.698-0500 I COMMAND  [conn8] command config.coll1 appName: "MongoDB Shell" command: find { find: "coll1", filter: {}, lsid: { id: UUID("f05234e5-e178-4706-81bf-2d6bc810cd6a") }, $clusterTime: { clusterTime: Timestamp(1546641073, 12), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "config" } nShards:1 cursorExhausted:1 numYields:0 nreturned:0 reslen:226 protocol:op_msg 7ms
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [jsTest] ----
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [jsTest] New session started with sessionID: { "id" : UUID("b9e7aa51-9d3d-4f98-94fa-8a333228a54d") }
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [jsTest] ----
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.795-0500 c15525| 2019-01-04T17:31:13.795-0500 I COMMAND  [conn1] xxx not executing noop write to advance last applied op time because sharding state not enabled
      [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.795-0500 c15525| 2019-01-04T17:31:13.795-0500 I REPL     [conn1] xxx waiting until cluster time for read
      
    • Sprint: Sharding 2019-02-11, Sharding 2019-02-25
    • 18

    • Description:

      This issue affects:

      but it isn't as bad as it sounds, because there are several other background processes that will periodically cause writes against the CSRS anyway, such as

      It just means that afterClusterTime or atClusterTime requests against the config server may have high latency (several seconds).

      It can be fixed pretty easily by just checking serverGlobalParams.clusterRole instead of ShardingState::enabled.
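      A minimal sketch of that approach, for illustration only (this is not necessarily the committed fix, and code further down in the function may need additional changes), applied to the early return in makeNoopWriteIfNeeded() shown in the repro patch above. serverGlobalParams and ClusterRole come from src/mongo/db/server_options.h:

      diff --git a/src/mongo/db/read_concern_mongod.cpp b/src/mongo/db/read_concern_mongod.cpp
      --- a/src/mongo/db/read_concern_mongod.cpp
      +++ b/src/mongo/db/read_concern_mongod.cpp
      @@ Status makeNoopWriteIfNeeded(OperationContext* opCtx, LogicalTime clusterTime) {
               auto shardingState = ShardingState::get(opCtx);
      -        // standalone replica set, so there is no need to advance the OpLog on the primary.
      -        if (!shardingState->enabled()) {
      +        // A plain (non-sharded) replica set has no need to advance the oplog on the
      +        // primary. Key this decision off the configured cluster role rather than
      +        // ShardingState, which is not enabled on config server primaries.
      +        if (serverGlobalParams.clusterRole == ClusterRole::None) {
                   return Status::OK();
               }

      The code later in makeNoopWriteIfNeeded() that actually performs the noop write may also need a config-server-aware path if it relies on ShardingState to identify the local shard.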

            Assignee: Blake Oler (blake.oler@mongodb.com)
            Reporter: Esha Maharishi (esha.maharishi@mongodb.com)
            Votes: 0
            Watchers: 4

              Created:
              Updated:
              Resolved: