- Type: Question
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.4.4
- Component/s: Networking, Replication, Sharding
- Environment: Ubuntu, sharded replica set
Over the last few months I've been getting this error, and upgrading to newer versions didn't help. Last night it happened twice, so I thought it was about time I posted something. Here is the log from the primary; during this window all queries throw an error.
Wed Sep 18 05:16:44.428 [conn701870] command admin.$cmd command:
{ writebacklisten: ObjectId('52302fdfc47aee5088985eb0') } ntoreturn:1 keyUpdates:0 reslen:44 300000ms
Wed Sep 18 05:17:21.627 [rsHealthPoll] DBClientCursor::init call() failed
Wed Sep 18 05:17:21.685 [rsHealthPoll] replSet info db5 is down (or slow to respond):
Wed Sep 18 05:17:21.686 [rsHealthPoll] replSet member db5 is now in state DOWN
Wed Sep 18 05:17:22.103 [rsHealthPoll] DBClientCursor::init call() failed
Wed Sep 18 05:17:22.103 [rsHealthPoll] replset info db9 heartbeat failed, retrying
Wed Sep 18 05:17:23.975 [ReplicaSetMonitorWatcher] Socket recv() timeout ip:port
Wed Sep 18 05:17:23.975 [ReplicaSetMonitorWatcher] SocketException: remote: ip:port error: 9001 socket exception [3] server [ip:port]
Wed Sep 18 05:17:23.976 [ReplicaSetMonitorWatcher] DBClientCursor::init call() failed
Wed Sep 18 05:17:25.193 [conn702234] command admin.$cmd command:
ntoreturn:1 keyUpdates:0 reslen:44 300000ms
Wed Sep 18 05:17:27.208 [ReplicaSetMonitorWatcher] trying reconnect to db8
Wed Sep 18 05:17:27.208 [rsHealthPoll] replset info db9 thinks that we are down
Wed Sep 18 05:17:27.208 [rsHealthPoll] replset info db5 thinks that we are down
Wed Sep 18 05:17:27.210 [rsHealthPoll] replSet member db5 is up
Wed Sep 18 05:17:27.211 [rsHealthPoll] replSet member db5 is now in state SECONDARY
Wed Sep 18 05:17:27.214 [ReplicaSetMonitorWatcher] reconnect db8 ok
Wed Sep 18 05:17:28.051 [conn702172] command admin.$cmd command:
ntoreturn:1 keyUpdates:0 reslen:44 300000ms
Wed Sep 18 05:17:29.212 [rsHealthPoll] replset info db5 thinks that we are down
Wed Sep 18 05:17:29.212 [rsHealthPoll] replset info db9 thinks that we are down
Wed Sep 18 05:17:29.212 [rsHealthPoll] replSet member db9 is now in state PRIMARY
Wed Sep 18 05:17:31.213 [rsHealthPoll] replSet member db9 is now in state SECONDARY
Wed Sep 18 05:17:31.893 [conn697777] command admin.$cmd command:
ntoreturn:1 keyUpdates:0 reslen:44 300000ms
Wed Sep 18 05:17:45.111 [conn619965] command admin.$cmd command:
ntoreturn:1 keyUpdates:0 reslen:44 300000ms
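To make the flapping easier to follow, the replica set state changes can be pulled out of a log like the one above. This is a minimal sketch in Python, assuming the mongod 2.4 line layout shown here. The `state_changes` helper and its regular expression are illustrative names of my own, not part of MongoDB or its tooling.

```python
import re

# Matches 2.4-style lines such as:
#   Wed Sep 18 05:17:21.686 [rsHealthPoll] replSet member db5 is now in state DOWN
# The line layout is an assumption based on the log excerpt above.
STATE_RE = re.compile(
    r"^\w{3} (\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{3}) \[rsHealthPoll\] "
    r"replSet member (\S+) is now in state (\w+)$"
)


def state_changes(lines):
    """Return (timestamp, member, state) for each state-change line."""
    changes = []
    for line in lines:
        m = STATE_RE.match(line.strip())
        if m:
            changes.append(m.groups())
    return changes


# A few lines copied from the log above; the "is up" line is skipped
# because it does not report a state name.
sample = [
    "Wed Sep 18 05:17:21.686 [rsHealthPoll] replSet member db5 is now in state DOWN",
    "Wed Sep 18 05:17:27.210 [rsHealthPoll] replSet member db5 is up",
    "Wed Sep 18 05:17:27.211 [rsHealthPoll] replSet member db5 is now in state SECONDARY",
    "Wed Sep 18 05:17:29.212 [rsHealthPoll] replSet member db9 is now in state PRIMARY",
    "Wed Sep 18 05:17:31.213 [rsHealthPoll] replSet member db9 is now in state SECONDARY",
]

for ts, member, state in state_changes(sample):
    print(ts, member, state)
```

Run against the full log file (for example `state_changes(open(path))`, where `path` is wherever the mongod log lives), this gives a compact timeline of which member the primary saw go DOWN, PRIMARY, or SECONDARY and when.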