Core Server / SERVER-27547

Adjusting the system time backward leaves replica set members stuck in RECOVERING

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 3.2.10
    • Component/s: Replication
    • Labels: None
    • Operating System: ALL

      Steps to reproduce:
      0. Set up a replica set with 3 nodes.
      1. Shut down all 3 nodes.
      2. Adjust the system time from 2017-03-14 back to 2016-12-17 (date -s).
      3. Start all 3 nodes.
      4. One node becomes PRIMARY; the other two remain stuck in RECOVERING.
      (A rough shell sketch of these steps follows below.)
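
      A minimal shell sketch of the reproduction, assuming one mongod per host with data in /data/db, the default port 27017, and the replica set name "fsp" seen in the logs below; the paths and hosts are placeholders, not the reporter's exact setup:

      #!/bin/bash
      # 1. Cleanly shut down the local replica set member.
      mongod --dbpath /data/db --shutdown

      # 2. Turn the clock back on every host (2017-03-14 -> 2016-12-17).
      date -s "2016-12-17 12:50:00"

      # 3. Restart the member with its replica set name.
      mongod --dbpath /data/db --replSet fsp --fork --logpath /data/db/mongod.log

      # 4. Watch member states; after the rollback, two members stay in RECOVERING.
      mongo --host 172.28.1.142:27017 --eval 'rs.status().members.forEach(function(m) { print(m.name + " " + m.stateStr); })'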


      Logs below were filtered with: grep -v "[initandlisten]|[conn" log
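
      As written, grep treats the bracketed text as a character class; a form closer to the presumed intent (extended regex, escaped brackets; the log filename is a placeholder) would be:

      # Drop noisy [initandlisten] and [conn...] lines from the mongod log.
      grep -Ev '\[initandlisten\]|\[conn' mongod.log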

      --------------------------------node 1-------------------------------------
      2017-03-14T14:41:57.607+0800 I STORAGE [signalProcessingThread] shutdown: removing fs lock...
      2017-03-14T14:41:57.608+0800 I CONTROL [signalProcessingThread] dbexit: rc: 0
      2016-12-17T12:50:25.846+0800 I CONTROL [main] ***** SERVER RESTARTED *****
      2016-12-17T12:50:26.921+0800 I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
      2016-12-17T12:50:26.965+0800 I REPL [ReplicationExecutor] New replica set config in use: { _id: "fsp", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "172.28.1.140:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "172.28.1.141:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "172.28.1.142:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('58c552b12d4be615ca28e5b1') } }
      2016-12-17T12:50:26.965+0800 I REPL [ReplicationExecutor] This node is 172.28.1.140:27017 in the config
      2016-12-17T12:50:26.965+0800 I REPL [ReplicationExecutor] transition to STARTUP2
      2016-12-17T12:50:26.965+0800 I REPL [ReplicationExecutor] Starting replication applier threads
      2016-12-17T12:50:26.965+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.141:27017
      2016-12-17T12:50:26.965+0800 I REPL [ReplicationExecutor] transition to RECOVERING
      2016-12-17T12:50:26.965+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.142:27017
      2016-12-17T12:50:26.966+0800 I REPL [ReplicationExecutor] transition to SECONDARY
      2016-12-17T12:50:27.021+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.141:27017
      2016-12-17T12:50:27.021+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.142:27017
      2016-12-17T12:50:27.021+0800 I REPL [ReplicationExecutor] Member 172.28.1.141:27017 is now in state SECONDARY
      2016-12-17T12:50:27.022+0800 I REPL [ReplicationExecutor] Member 172.28.1.142:27017 is now in state PRIMARY
      2016-12-17T12:51:52.989+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.142:27017
      2016-12-17T12:51:53.013+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.142:27017
      2016-12-17T12:51:53.013+0800 I REPL [SyncSourceFeedback] setting syncSourceFeedback to 172.28.1.142:27017
      2016-12-17T12:51:53.045+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.142:27017
      2016-12-17T12:51:53.048+0800 I REPL [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: (term: 11, timestamp: Mar 14 14:47:57:3c9d). source's GTE: (term: 12, timestamp: Mar 14 14:47:57:3c9d) hashes: (-1985836079535698320/4988593413792691282)
      2016-12-17T12:51:53.050+0800 I REPL [rsBackgroundSync] beginning rollback
      2016-12-17T12:51:53.050+0800 I REPL [rsBackgroundSync] rollback 0
      2016-12-17T12:51:53.050+0800 I REPL [ReplicationExecutor] transition to ROLLBACK
      2016-12-17T12:51:53.052+0800 I REPL [rsBackgroundSync] rollback 1
      2016-12-17T12:51:53.079+0800 I REPL [rsBackgroundSync] rollback 2 FindCommonPoint
      2016-12-17T12:51:53.080+0800 I REPL [rsBackgroundSync] rollback our last optime: Mar 14 14:47:57:3c9d
      2016-12-17T12:51:53.080+0800 I REPL [rsBackgroundSync] rollback their last optime: Mar 14 14:47:57:3d65
      2016-12-17T12:51:53.080+0800 I REPL [rsBackgroundSync] rollback diff in end of log times: 0 seconds
      2016-12-17T12:51:53.343+0800 I REPL [rsBackgroundSync] rollback 3 fixup
      2016-12-17T12:51:53.362+0800 I REPL [rsBackgroundSync] rollback 3.5
      2016-12-17T12:51:53.363+0800 I REPL [rsBackgroundSync] rollback 4 n:2
      2016-12-17T12:51:53.363+0800 I REPL [rsBackgroundSync] minvalid=(term: 12, timestamp: Mar 14 14:47:57:3d70)
      2016-12-17T12:51:53.364+0800 I REPL [rsBackgroundSync] rollback 4.6
      2016-12-17T12:51:53.364+0800 I REPL [rsBackgroundSync] rollback 4.7
      2016-12-17T12:51:53.375+0800 I REPL [rsBackgroundSync] rollback 5 d:12 u:11
      2016-12-17T12:51:53.376+0800 I REPL [rsBackgroundSync] rollback 6
      2016-12-17T12:51:53.376+0800 I REPL [rsBackgroundSync] rollback done
      2016-12-17T12:51:53.376+0800 I REPL [rsBackgroundSync] rollback finished
      2016-12-17T12:51:53.376+0800 I NETWORK [SyncSourceFeedback] Socket say send() errno:9 Bad file descriptor 172.28.1.142:27017
      2016-12-17T12:51:53.376+0800 I REPL [SyncSourceFeedback] SyncSourceFeedback error sending update: socket exception [SEND_ERROR] for 172.28.1.142:27017
      2016-12-17T12:51:53.376+0800 I REPL [SyncSourceFeedback] updateUpstream failed: Location9001: socket exception [SEND_ERROR] for 172.28.1.142:27017, will retry
      2016-12-17T12:51:53.387+0800 I REPL [ReplicationExecutor] transition to RECOVERING
      2016-12-17T12:51:53.388+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.141:27017
      2016-12-17T12:51:53.416+0800 I REPL [SyncSourceFeedback] setting syncSourceFeedback to 172.28.1.141:27017
      2016-12-17T12:51:53.416+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.141:27017
      2016-12-17T12:51:53.446+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.141:27017
      2016-12-17T12:51:57.039+0800 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 172.28.1.141:27017
      2016-12-17T12:51:57.039+0800 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
      2016-12-17T12:51:57.040+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.141:27017
      2016-12-17T12:51:57.076+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.141:27017
      2016-12-17T12:51:57.077+0800 I REPL [ReplicationExecutor] Member 172.28.1.141:27017 is now in state RECOVERING
      2016-12-17T15:13:45.705+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.141:27017

      --------------------------------node 2-------------------------------------
      2017-03-14T14:41:50.897+0800 I CONTROL [signalProcessingThread] now exiting
      2017-03-14T14:41:50.897+0800 I NETWORK [signalProcessingThread] shutdown: going to close listening sockets...
      2017-03-14T14:41:50.897+0800 I NETWORK [signalProcessingThread] closing listening socket: 7
      2017-03-14T14:41:50.897+0800 I NETWORK [signalProcessingThread] shutdown: going to flush diaglog...
      2017-03-14T14:41:50.897+0800 I NETWORK [signalProcessingThread] shutdown: going to close sockets...
      2017-03-14T14:41:50.897+0800 I STORAGE [signalProcessingThread] WiredTigerKVEngine shutting down
      2017-03-14T14:41:50.913+0800 I STORAGE [signalProcessingThread] shutdown: removing fs lock...
      2017-03-14T14:41:50.913+0800 I CONTROL [signalProcessingThread] dbexit: rc: 0
      2016-12-17T12:50:12.767+0800 I CONTROL [main] ***** SERVER RESTARTED *****
      2016-12-17T12:50:13.573+0800 I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
      2016-12-17T12:50:16.580+0800 W NETWORK [ReplicationExecutor] Failed to connect to 172.28.1.140:27017, reason: errno:113 No route to host
      2016-12-17T12:50:16.599+0800 I REPL [ReplicationExecutor] New replica set config in use: { _id: "fsp", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "172.28.1.140:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "172.28.1.141:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "172.28.1.142:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('58c552b12d4be615ca28e5b1') } }
      2016-12-17T12:50:16.599+0800 I REPL [ReplicationExecutor] This node is 172.28.1.141:27017 in the config
      2016-12-17T12:50:16.599+0800 I REPL [ReplicationExecutor] transition to STARTUP2
      2016-12-17T12:50:16.599+0800 I REPL [ReplicationExecutor] Starting replication applier threads
      2016-12-17T12:50:16.600+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:16.600+0800 I REPL [ReplicationExecutor] transition to RECOVERING
      2016-12-17T12:50:16.600+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.142:27017
      2016-12-17T12:50:16.601+0800 I COMMAND [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after connections: 0, after extra_info: 0, after globalLock: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after repl: 2580, after storageEngine: 2580, after tcmalloc: 2580, after wiredTiger: 2590, at end: 2590 }
      2016-12-17T12:50:16.602+0800 I REPL [ReplicationExecutor] transition to SECONDARY
      2016-12-17T12:50:16.994+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.142:27017
      2016-12-17T12:50:16.998+0800 I REPL [ReplicationExecutor] Member 172.28.1.142:27017 is now in state SECONDARY
      2016-12-17T12:50:19.586+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: No route to host
      2016-12-17T12:50:19.587+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: No route to host
      2016-12-17T12:50:19.587+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:22.592+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: No route to host
      2016-12-17T12:50:22.593+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: No route to host
      2016-12-17T12:50:22.593+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:25.599+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: No route to host
      2016-12-17T12:50:25.599+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: No route to host
      2016-12-17T12:50:30.599+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:30.631+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:50:30.632+0800 I REPL [ReplicationExecutor] Member 172.28.1.140:27017 is now in state SECONDARY
      2016-12-17T12:50:31.614+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.140:27017
      2016-12-17T12:50:31.648+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:31.648+0800 I REPL [SyncSourceFeedback] setting syncSourceFeedback to 172.28.1.140:27017
      2016-12-17T12:50:31.677+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:50:32.001+0800 I REPL [ReplicationExecutor] Member 172.28.1.142:27017 is now in state PRIMARY
      2016-12-17T12:50:36.684+0800 I REPL [ReplicationExecutor] could not find member to sync from
      2016-12-17T12:51:52.704+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.142:27017
      2016-12-17T12:51:52.729+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.142:27017
      2016-12-17T12:51:52.730+0800 I REPL [SyncSourceFeedback] setting syncSourceFeedback to 172.28.1.142:27017
      2016-12-17T12:51:52.752+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.142:27017
      2016-12-17T12:51:52.754+0800 I REPL [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: (term: 11, timestamp: Mar 14 14:47:57:3c9d). source's GTE: (term: 12, timestamp: Mar 14 14:47:57:3c9d) hashes: (-1985836079535698320/4988593413792691282)
      2016-12-17T12:51:52.754+0800 I REPL [rsBackgroundSync] beginning rollback
      2016-12-17T12:51:52.754+0800 I REPL [rsBackgroundSync] rollback 0
      2016-12-17T12:51:52.755+0800 I REPL [ReplicationExecutor] transition to ROLLBACK
      2016-12-17T12:51:52.757+0800 I REPL [rsBackgroundSync] rollback 1
      2016-12-17T12:51:52.784+0800 I REPL [rsBackgroundSync] rollback 2 FindCommonPoint
      2016-12-17T12:51:52.785+0800 I REPL [rsBackgroundSync] rollback our last optime: Mar 14 14:47:57:3c9d
      2016-12-17T12:51:52.785+0800 I REPL [rsBackgroundSync] rollback their last optime: Mar 14 14:47:57:3d5e
      2016-12-17T12:51:52.785+0800 I REPL [rsBackgroundSync] rollback diff in end of log times: 0 seconds
      2016-12-17T12:51:53.111+0800 I REPL [rsBackgroundSync] rollback 3 fixup
      2016-12-17T12:51:53.124+0800 I REPL [rsBackgroundSync] rollback 3.5
      2016-12-17T12:51:53.125+0800 I REPL [rsBackgroundSync] rollback 4 n:2
      2016-12-17T12:51:53.125+0800 I REPL [rsBackgroundSync] minvalid=(term: 12, timestamp: Mar 14 14:47:57:3d68)
      2016-12-17T12:51:53.125+0800 I REPL [rsBackgroundSync] rollback 4.6
      2016-12-17T12:51:53.125+0800 I REPL [rsBackgroundSync] rollback 4.7
      2016-12-17T12:51:53.129+0800 I REPL [rsBackgroundSync] rollback 5 d:12 u:11
      2016-12-17T12:51:53.129+0800 I REPL [rsBackgroundSync] rollback 6
      2016-12-17T12:51:53.129+0800 I REPL [rsBackgroundSync] rollback done
      2016-12-17T12:51:53.129+0800 I NETWORK [SyncSourceFeedback] Socket say send() errno:9 Bad file descriptor 172.28.1.142:27017
      2016-12-17T12:51:53.129+0800 I REPL [rsBackgroundSync] rollback finished
      2016-12-17T12:51:53.129+0800 I REPL [SyncSourceFeedback] SyncSourceFeedback error sending update: socket exception [SEND_ERROR] for 172.28.1.142:27017
      2016-12-17T12:51:53.130+0800 I REPL [SyncSourceFeedback] updateUpstream failed: Location9001: socket exception [SEND_ERROR] for 172.28.1.142:27017, will retry
      2016-12-17T12:51:53.139+0800 I REPL [ReplicationExecutor] transition to RECOVERING
      2016-12-17T12:51:53.139+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.140:27017
      2016-12-17T12:51:53.164+0800 E REPL [rsBackgroundSync] sync producer problem: 13436 cannot read from oplog collection while in rollback
      2016-12-17T12:51:53.265+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.140:27017
      2016-12-17T12:51:53.286+0800 E REPL [rsBackgroundSync] sync producer problem: 13436 cannot read from oplog collection while in rollback
      2016-12-17T12:51:53.386+0800 I REPL [ReplicationExecutor] syncing from: 172.28.1.140:27017
      2016-12-17T12:51:53.414+0800 I REPL [SyncSourceFeedback] setting syncSourceFeedback to 172.28.1.140:27017
      2016-12-17T12:51:53.414+0800 I ASIO [rsBackgroundSync] dropping unhealthy pooled connection to 172.28.1.140:27017
      2016-12-17T12:51:53.414+0800 I ASIO [rsBackgroundSync] after drop, pool was empty, going to spawn some connections
      2016-12-17T12:51:53.414+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:51:53.448+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:51:56.702+0800 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 172.28.1.140:27017
      2016-12-17T12:51:56.702+0800 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
      2016-12-17T12:51:56.703+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:51:56.723+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:51:56.723+0800 I REPL [ReplicationExecutor] Member 172.28.1.140:27017 is now in state RECOVERING
      2016-12-17T13:38:09.036+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.140:27017
      2016-12-17T13:38:09.059+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T13:42:19.087+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.140:27017
      2016-12-17T13:42:19.108+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T16:52:06.754+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Connecting to 172.28.1.140:27017
      2016-12-17T16:52:06.781+0800 I ASIO [NetworkInterfaceASIO-BGSync-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T19:38:02.677+0800 I REPL [SyncSourceFeedback] SyncSourceFeedback error sending update: network error while attempting to run command 'replSetUpdatePosition' on host '172.28.1.140:27017'
      2016-12-17T19:38:02.677+0800 I REPL [SyncSourceFeedback] updateUpstream failed: HostUnreachable: network error while attempting to run command 'replSetUpdatePosition' on host '172.28.1.140:27017' , will retry
      2016-12-17T19:38:02.919+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: End of file
      2016-12-17T19:38:02.919+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017

      --------------------------------node 3-------------------------------------
      2017-03-14T14:41:50.894+0800 W EXECUTOR [rsBackgroundSync] killCursors command task failed: CallbackCanceled: Callback canceled
      2017-03-14T14:41:50.895+0800 I CONTROL [signalProcessingThread] now exiting
      2017-03-14T14:41:50.895+0800 I NETWORK [signalProcessingThread] shutdown: going to close listening sockets...
      2017-03-14T14:41:50.895+0800 I NETWORK [signalProcessingThread] closing listening socket: 7
      2017-03-14T14:41:50.895+0800 I NETWORK [signalProcessingThread] shutdown: going to flush diaglog...
      2017-03-14T14:41:50.895+0800 I NETWORK [signalProcessingThread] shutdown: going to close sockets...
      2017-03-14T14:41:50.895+0800 I STORAGE [signalProcessingThread] WiredTigerKVEngine shutting down
      2017-03-14T14:41:51.034+0800 I STORAGE [signalProcessingThread] shutdown: removing fs lock...
      2017-03-14T14:41:51.034+0800 I CONTROL [signalProcessingThread] dbexit: rc: 0
      2016-12-17T12:50:13.156+0800 I CONTROL [main] ***** SERVER RESTARTED *****
      2016-12-17T12:50:13.945+0800 I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
      2016-12-17T12:50:16.951+0800 W NETWORK [ReplicationExecutor] Failed to connect to 172.28.1.140:27017, reason: errno:113 No route to host
      2016-12-17T12:50:16.974+0800 I REPL [ReplicationExecutor] New replica set config in use: { _id: "fsp", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "172.28.1.140:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "172.28.1.141:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "172.28.1.142:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('58c552b12d4be615ca28e5b1') } }
      2016-12-17T12:50:16.974+0800 I REPL [ReplicationExecutor] This node is 172.28.1.142:27017 in the config
      2016-12-17T12:50:16.974+0800 I REPL [ReplicationExecutor] transition to STARTUP2
      2016-12-17T12:50:16.974+0800 I REPL [ReplicationExecutor] Starting replication applier threads
      2016-12-17T12:50:16.974+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:16.975+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.141:27017
      2016-12-17T12:50:16.975+0800 I REPL [ReplicationExecutor] transition to RECOVERING
      2016-12-17T12:50:16.976+0800 I REPL [ReplicationExecutor] transition to SECONDARY
      2016-12-17T12:50:16.977+0800 I COMMAND [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after connections: 0, after extra_info: 0, after globalLock: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after repl: 2958, after storageEngine: 2958, after tcmalloc: 2958, after wiredTiger: 2958, at end: 2958 }
      2016-12-17T12:50:17.003+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.141:27017
      2016-12-17T12:50:17.003+0800 I REPL [ReplicationExecutor] Member 172.28.1.141:27017 is now in state SECONDARY
      2016-12-17T12:50:19.957+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: No route to host
      2016-12-17T12:50:19.957+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: No route to host
      2016-12-17T12:50:19.957+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:22.963+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: No route to host
      2016-12-17T12:50:22.963+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: No route to host
      2016-12-17T12:50:22.963+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:25.969+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: No route to host
      2016-12-17T12:50:25.970+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: No route to host
      2016-12-17T12:50:27.019+0800 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
      2016-12-17T12:50:27.019+0800 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
      2016-12-17T12:50:27.019+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:27.020+0800 I REPL [ReplicationExecutor] dry election run succeeded, running for election
      2016-12-17T12:50:27.021+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:27.021+0800 I REPL [ReplicationExecutor] election succeeded, assuming primary role in term 12
      2016-12-17T12:50:27.021+0800 I REPL [ReplicationExecutor] transition to PRIMARY
      2016-12-17T12:50:27.021+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:50:27.070+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:50:27.070+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:50:27.071+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:50:27.075+0800 I REPL [ReplicationExecutor] Member 172.28.1.140:27017 is now in state SECONDARY
      2016-12-17T12:50:27.980+0800 I REPL [rsSync] transition to primary complete; database writes are now permitted
      2016-12-17T12:51:53.053+0800 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 172.28.1.141:27017
      2016-12-17T12:51:53.053+0800 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
      2016-12-17T12:51:53.053+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.141:27017
      2016-12-17T12:51:53.081+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.141:27017
      2016-12-17T12:51:53.082+0800 I REPL [ReplicationExecutor] Member 172.28.1.141:27017 is now in state ROLLBACK
      2016-12-17T12:51:53.114+0800 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 172.28.1.140:27017
      2016-12-17T12:51:53.114+0800 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
      2016-12-17T12:51:53.114+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T12:51:53.138+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 172.28.1.140:27017
      2016-12-17T12:51:53.142+0800 I REPL [ReplicationExecutor] Member 172.28.1.140:27017 is now in state ROLLBACK
      2016-12-17T12:51:55.083+0800 I REPL [ReplicationExecutor] Member 172.28.1.141:27017 is now in state RECOVERING
      2016-12-17T12:51:55.143+0800 I REPL [ReplicationExecutor] Member 172.28.1.140:27017 is now in state RECOVERING
      2016-12-17T19:38:02.677+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: End of file
      2016-12-17T19:38:02.678+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 172.28.1.140:27017
      2016-12-17T19:38:02.679+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.28.1.140:27017 - HostUnreachable: End of file
      2016-12-17T19:38:02.679+0800 I ASIO [NetworkInterfaceASIO-Replication-0] failed to close stream: Transport endpoint is not connected
      2016-12-17T19:38:02.679+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 172.28.1.140:27017; HostUnreachable: End of file
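
      For reference, a hedged way to confirm what a stuck member is waiting on: its own state from replSetGetStatus plus the minValid optime recorded by the rollback (local.replset.minvalid is the collection 3.2 uses for this); the host and port below are placeholders.

      # Run against each member that stays in RECOVERING, e.g. 172.28.1.140.
      mongo --host 172.28.1.140:27017 --eval '
          var s = db.adminCommand({replSetGetStatus: 1});
          var self = s.members.filter(function(m) { return m.self; })[0];
          print("state: " + s.myState + " (" + self.stateStr + ")");
          printjson(db.getSiblingDB("local").replset.minvalid.findOne());
      '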

            Assignee: Spencer Brody (Inactive)
            Reporter: eveninger
            Votes: 0
            Watchers: 7

              Created:
              Updated:
              Resolved: