Core Server / SERVER-15375

initShardVersion triggers inline RS refresh if no primary is available, creating additional latency for user queries

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.7.8
    • Affects Version/s: 2.6.4
    • Component/s: Replication, Sharding
    • Labels: None

      In sharded clusters with a read-only shard (imagine the main DC is offline), new read queries with read preference "secondary" first try to contact the primary in order to set the shard version. If the primary is down, a replica set refresh is triggered in the function getHostOrRefresh. The problem is that the refresh runs inline with the user query, so if the primary is in a TCP black hole, user queries experience additional latency.
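      For illustration, here is a minimal standalone C++ sketch of the pattern (toy types and hypothetical names, not the actual mongos source): when the read preference requires a primary and none is cached, the refresh runs synchronously on the query thread, so the connect timeout of the unreachable primary lands directly on the user query.

      #include <chrono>
      #include <iostream>
      #include <optional>
      #include <string>
      #include <thread>

      using namespace std::chrono_literals;

      struct SetState {
          std::optional<std::string> cachedPrimary;  // empty once the primary is gone
      };

      // Stand-in for the blocking probe of an unreachable node; on Windows a
      // connect() to a local port with no listener takes ~1s (simulated here).
      static void probeHost(const std::string& host) {
          std::this_thread::sleep_for(1s);
          std::cout << "  probe of " << host << " timed out\n";
      }

      // Toy counterpart of ReplicaSetMonitor::getHostOrRefresh: a primary-only
      // request with no cached primary refreshes the set *inline*.
      std::optional<std::string> getHostOrRefresh(SetState& s, bool primaryOnly) {
          if (!primaryOnly)
              return std::string("secondary-host:27019");
          if (s.cachedPrimary)
              return s.cachedPrimary;
          std::cout << "no cached primary, refreshing inline\n";
          probeHost("downed-primary:27018");  // the user query waits here
          return std::nullopt;                // still no primary (uassert 10009)
      }

      int main() {
          SetState s;
          auto t0 = std::chrono::steady_clock::now();
          getHostOrRefresh(s, true);   // shard-version handshake wants the primary
          getHostOrRefresh(s, false);  // the actual secondary read is instant
          auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                        std::chrono::steady_clock::now() - t0).count();
          std::cout << "query latency: " << ms << " ms\n";  // ~1000 ms of pure waiting
      }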

      Here is the mongoS log (logLevel 5) with additional logging in getHostOrRefresh and a stack trace for getHostOrRefresh(primary_only):

      2014-09-24T14:14:40.735-0400 [conn1] Request::process begin ns: test.c msg id: 95 op: 2004 attempt: 0
      2014-09-24T14:14:40.736-0400 [conn1] query: test.c { query: {}, $readPreference: { mode: "secondary" } } ntoreturn: 0 options: 4
      2014-09-24T14:14:40.736-0400 [conn1] pcursor creating pcursor over QSpec { ns: "test.c", n2skip: 0, n2return: 0, options: 4, query: { query: {}, $readPreference: { mode: "secondary" } }, fields: {} } and CInfo { v_ns: "", filter: {} }
      2014-09-24T14:14:40.736-0400 [conn1] pcursor initializing over 1 shards required by [unsharded @ shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019]
      2014-09-24T14:14:40.736-0400 [conn1] pcursor initializing on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: {}, retryNext: false, init: false, finish: false, errored: false }
      2014-09-24T14:14:40.736-0400 [conn1] ReplicaSetMonitor::createIfNeeded shard01
      2014-09-24T14:14:40.736-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.737-0400 [conn1] getHostOrRefresh: 4
      2014-09-24T14:14:40.737-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.737-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.737-0400 [conn1] getHostOrRefresh: 0
      2014-09-24T14:14:40.742-0400 [conn1] 0x100484bdb 0x100072c7c 0x100073000 0x100048e37 0x1003ca0d8 0x1003cce71 0x10039cb46 0x100026cd2 0x100028cca 0x1003a5c3e 0x1003a3522 0x1003a3738 0x1000552d8 0x10005a1db 0x10006471d 0x1003b2352 0x10039974a 0x100002c75 0x100448561 0x1004b9c45
       0   mongos                              0x0000000100484bdb _ZN5mongo15printStackTraceERSo + 43
       1   mongos                              0x0000000100072c7c _ZN5mongo17ReplicaSetMonitor16getHostOrRefreshERKNS_21ReadPreferenceSettingE + 244
       2   mongos                              0x0000000100073000 _ZN5mongo17ReplicaSetMonitor18getMasterOrUassertEv + 170
       3   mongos                              0x0000000100048e37 _ZN5mongo18DBClientReplicaSet11checkMasterEv + 93
       4   mongos                              0x00000001003ca0d8 _ZN5mongo14getVersionableEPNS_12DBClientBaseE + 1480
       5   mongos                              0x00000001003cce71 _ZN5mongo14VersionManager18initShardVersionCBEPNS_12DBClientBaseERNS_7BSONObjE + 49
       6   mongos                              0x000000010039cb46 _ZN5mongo22ShardingConnectionHook8onCreateEPNS_12DBClientBaseE + 758
       7   mongos                              0x0000000100026cd2 _ZN5mongo16DBConnectionPool13_finishCreateERKSsdPNS_12DBClientBaseE + 226
       8   mongos                              0x0000000100028cca _ZN5mongo16DBConnectionPool3getERKSsd + 998
       9   mongos                              0x00000001003a5c3e _ZN5mongo17ClientConnections3getERKSsS2_ + 100
       10  mongos                              0x00000001003a3522 _ZN5mongo15ShardConnection5_initEv + 126
       11  mongos                              0x00000001003a3738 _ZN5mongo15ShardConnectionC2ERKNS_5ShardERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEE + 178
       12  mongos                              0x00000001000552d8 _ZN5mongo27ParallelSortClusteredCursor28setupVersionAndHandleSlaveOkEN5boost10shared_ptrINS_23ParallelConnectionStateEEERKNS_5ShardENS2_IS5_EERKNS_15NamespaceStringERKSsNS2_IKNS_12ChunkManagerEEE + 294
       13  mongos                              0x000000010005a1db _ZN5mongo27ParallelSortClusteredCursor9startInitEv + 6337
       14  mongos                              0x000000010006471d _ZN5mongo27ParallelSortClusteredCursor4initEv + 63
       15  mongos                              0x00000001003b2352 _ZN5mongo8Strategy7queryOpERNS_7RequestE + 2364
       16  mongos                              0x000000010039974a _ZN5mongo7Request7processEi + 1250
       17  mongos                              0x0000000100002c75 _ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 147
       18  mongos                              0x0000000100448561 _ZN5mongo17PortMessageServer17handleIncomingMsgEPv + 1681
       19  mongos                              0x00000001004b9c45 thread_proxy + 229
      2014-09-24T14:14:40.742-0400 [conn1] Starting new refresh of replica set shard01
      2014-09-24T14:14:40.742-0400 [conn1] creating new connection to:AD-MAC10G.local:27018
      2014-09-24T14:14:40.743-0400 [ConnectBG] BackgroundJob starting: ConnectBG
      2014-09-24T14:14:40.743-0400 [conn1] warning: Failed to connect to 10.4.100.44:27018, reason: errno:61 Connection refused
      2014-09-24T14:14:40.743-0400 [conn1] User Assertion: 13328:connection pool: connect failed AD-MAC10G.local:27018 : couldn't connect to server AD-MAC10G.local:27018 (10.4.100.44), connection attempt failed
      2014-09-24T14:14:40.743-0400 [conn1] polling for status of connection to 127.0.0.1:27019, no events
      2014-09-24T14:14:40.744-0400 [conn1] warning: No primary detected for set shard01
      2014-09-24T14:14:40.744-0400 [conn1] Updating host AD-MAC10G.local:27019 based on ismaster reply: { setName: "shard01", setVersion: 1, ismaster: false, secondary: true, hosts: [ "AD-MAC10G.local:27019", "AD-MAC10G.local:27018" ], me: "AD-MAC10G.local:27019", maxBsonObjectSize: 16777216, maxMessageSizeBytes: 48000000, maxWriteBatchSize: 1000, localTime: new Date(1411582480744), maxWireVersion: 2, minWireVersion: 0, ok: 1.0 }
      2014-09-24T14:14:40.744-0400 [conn1] User Assertion: 10009:ReplicaSetMonitor no master found for set: shard01
      2014-09-24T14:14:40.744-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.744-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.745-0400 [conn1] dbclient_rs say using secondary or tagged node selection in shard01, read pref is { pref: "secondary only", tags: [ {} ] } (primary : [not cached], lastTagged : [not cached])
      2014-09-24T14:14:40.745-0400 [conn1] selectNodeUsingTags: before _getMonitor(). Last used node: :27017
      2014-09-24T14:14:40.745-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.745-0400 [conn1] getHostOrRefresh: 2
      2014-09-24T14:14:40.745-0400 [conn1] creating new connection to:AD-MAC10G.local:27019
      2014-09-24T14:14:40.746-0400 [ConnectBG] BackgroundJob starting: ConnectBG
      2014-09-24T14:14:40.746-0400 [conn1] connected to server AD-MAC10G.local:27019 (10.4.100.44)
      2014-09-24T14:14:40.746-0400 [conn1] connected connection!
      2014-09-24T14:14:40.746-0400 [conn1] dbclient_rs selecting node AD-MAC10G.local:27019
      2014-09-24T14:14:40.746-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.746-0400 [conn1] pcursor initialized query (lazily) on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
      2014-09-24T14:14:40.747-0400 [conn1] pcursor finishing over 1 shards
      2014-09-24T14:14:40.747-0400 [conn1] ReplicaSetMonitor::get shard01
      2014-09-24T14:14:40.747-0400 [conn1] pcursor finishing on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
      2014-09-24T14:14:40.747-0400 [conn1] pcursor finished on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "(done)", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: { _id: ObjectId('5422075ce9c51acb5562df36') }, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
      2014-09-24T14:14:40.747-0400 [conn1] Request::process end ns: test.c msg id: 95 op: 2004 attempt: 0 11ms
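      Demangled, the frames above trace the whole inline path: Request::process → Strategy::queryOp → ParallelSortClusteredCursor::startInit → setupVersionAndHandleSlaveOk → ShardConnection → DBConnectionPool::get → ShardingConnectionHook::onCreate → VersionManager::initShardVersionCB → DBClientReplicaSet::checkMaster → ReplicaSetMonitor::getMasterOrUassert → getHostOrRefresh. In other words, merely checking a fresh connection out of the pool runs a shard-version handshake that insists on locating the primary. A hedged C++ sketch of that hook shape (hypothetical names, not the real ShardingConnectionHook):

      #include <iostream>
      #include <memory>
      #include <stdexcept>
      #include <string>

      struct Connection { std::string host; };

      // Models the onCreate -> initShardVersionCB -> checkMaster chain: the
      // version handshake needs the primary, which is where the inline
      // getHostOrRefresh refresh (and its connect timeout) is reached.
      void onCreateHook(Connection& c) {
          std::cout << "setting shard version via " << c.host << "\n";
          // ...inline refresh happens here, blocking for the connect timeout...
          throw std::runtime_error("ReplicaSetMonitor no master found for set");  // 10009
      }

      // Models DBConnectionPool::get: every *new* pooled connection runs the
      // hook synchronously before the caller's query can proceed.
      std::unique_ptr<Connection> poolGet(const std::string& host) {
          auto c = std::make_unique<Connection>();
          c->host = host;
          onCreateHook(*c);
          return c;
      }

      int main() {
          try {
              poolGet("shard01/AD-MAC10G.local:27018");
          } catch (const std::exception& e) {
              std::cout << "query thread sees: " << e.what() << "\n";
          }
      }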
      

      The easiest way to observe the issue is on Windows (I used Win2008R2), because there a connection attempt to a local port with no listener takes about 1s:

      1. Create a sharded cluster
        • 1 mongos
        • 1 shard - a 2-node RS
        • 1 config server
      2. Insert some test data through mongoS and verify it is readable:
        db.c.insert({})
        rs.slaveOk()
        db.c.find()
        
      3. Shut down one node in the shard to make it read-only (the surviving node of the 2-node set cannot elect itself primary)
      4. Try to fetch the data again from mongoS:
        set verbose true
        rs.slaveOk()
        db.c.find()
        

        Observe that the read takes 2s on the first attempt and 1s on every subsequent attempt.
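        One plausible reading of those numbers, assuming each inline refresh costs exactly one ~1s Windows connect timeout to the dead node (an inference from the log above, not from the code): the first query pays for both the monitor's bootstrap scan and the primary lookup's refresh, while later queries pay for one refresh each. A back-of-the-envelope sketch:

        #include <iostream>

        int main() {
            const double connectTimeoutSec = 1.0;  // Windows loopback, no listener
            // First find(): monitor bootstrap scan + primary-only lookup refresh.
            std::cout << "first attempt  ~" << 2 * connectTimeoutSec << " s\n";
            // Later find()s: the monitor is cached, but every shard-version check
            // still wants the primary and triggers one inline refresh.
            std::cout << "later attempts ~" << 1 * connectTimeoutSec << " s\n";
        }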

            Assignee: Greg Studer
            Reporter: Alexander Komyagin (Inactive)
            Votes: 2
            Watchers: 12
