- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 2.6.4
- Component/s: Sharding
- Fully Compatible
- ALL
We are running 7 shards, each consisting of 3 replica set members, and we pre-split our chunks. One of the shards no longer accepts new chunks, even when the chunk is empty.
The logs say the migration is waiting for replication, but all members are perfectly in sync. We read that this can be caused by the local.slaves collection in versions 2.2 and 2.4, but we are already running v2.6.4. We dropped the local.slaves collection nevertheless, but it did not help. We also stepped down the primary, with no success. Finally, we stopped one replica set member, removed its data, brought it back up, waited for it to finish syncing, and elected it primary, but the chunk move still never succeeded.
What can we do to get this shard accepting new chunks again?
Here are the logs from the primary of the destination shard, grepped for "migrateThread":
2014-12-09T16:53:02.345+0100 [migrateThread] warning: migrate commit waiting for 2 slaves for 'offerStore.offer' { _id: 3739440290 } -> { _id: 3739940290 } waiting for: 54870f52:a9
2014-12-09T16:53:03.345+0100 [migrateThread] Waiting for replication to catch up before entering critical section
2014-12-09T16:53:04.345+0100 [migrateThread] Waiting for replication to catch up before entering critical section
2014-12-09T16:53:05.345+0100 [migrateThread] Waiting for replication to catch up before entering critical section
2014-12-09T16:53:06.345+0100 [migrateThread] Waiting for replication to catch up before entering critical section
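The "waiting for: 54870f52:a9" token in the warning is an optime printed in hexadecimal <seconds>:<increment> form, where <seconds> is a Unix timestamp. As a minimal sketch (not part of the original report), it can be decoded like this:

```javascript
// Decode the optime token from the migrateThread warning above.
// Assumption: MongoDB's <hex seconds>:<hex increment> optime string format.
const raw = "54870f52:a9";              // copied from the log line above
const [secHex, incHex] = raw.split(":");
const seconds = parseInt(secHex, 16);   // Unix timestamp the migration waits on
const increment = parseInt(incHex, 16); // per-second operation counter
console.log(new Date(seconds * 1000).toISOString(), increment);
// -> 2014-12-09T15:03:46.000Z 169
```

Decoding the token this way may help correlate the stuck wait with the election time shown in the replica set status.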
This is the replication status of the replSet:
offerStoreDE2:SECONDARY> rs.status()
{
    "set" : "offerStoreDE2",
    "date" : ISODate("2014-12-09T15:58:26Z"),
    "myState" : 2,
    "syncingTo" : "s131:27017",
    "members" : [
        {
            "_id" : 3,
            "name" : "s136:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 6458100,
            "optime" : Timestamp(1418140706, 503),
            "optimeDate" : ISODate("2014-12-09T15:58:26Z"),
            "self" : true
        },
        {
            "_id" : 4,
            "name" : "s131:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1919333,
            "optime" : Timestamp(1418140706, 437),
            "optimeDate" : ISODate("2014-12-09T15:58:26Z"),
            "lastHeartbeat" : ISODate("2014-12-09T15:58:26Z"),
            "lastHeartbeatRecv" : ISODate("2014-12-09T15:58:25Z"),
            "pingMs" : 0,
            "syncingTo" : "s568:27017"
        },
        {
            "_id" : 6,
            "name" : "s568:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 8893,
            "optime" : Timestamp(1418140706, 51),
            "optimeDate" : ISODate("2014-12-09T15:58:26Z"),
            "lastHeartbeat" : ISODate("2014-12-09T15:58:26Z"),
            "lastHeartbeatRecv" : ISODate("2014-12-09T15:58:26Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1418137258, 1),
            "electionDate" : ISODate("2014-12-09T15:00:58Z")
        }
    ],
    "ok" : 1
}
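To back up the claim that the members are in sync, the optimes above can be compared programmatically. A minimal sketch (not from the original report), using the values shown in the rs.status() output:

```javascript
// Member optimes copied from the rs.status() output above.
// Timestamp(t, i) is seconds since the epoch plus a per-second counter,
// so equal t values mean the members are within the same second.
const optimes = {
  "s136:27017": { t: 1418140706, i: 503 }, // SECONDARY
  "s131:27017": { t: 1418140706, i: 437 }, // SECONDARY
  "s568:27017": { t: 1418140706, i: 51 },  // PRIMARY
};
const seconds = Object.values(optimes).map(o => o.t);
const lagSeconds = Math.max(...seconds) - Math.min(...seconds);
console.log(lagSeconds); // -> 0: no member is behind by even a second
```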
- duplicates SERVER-15849 "Secondaries should not forward replication information for removed chained nodes" (Closed)