-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.2.4
-
Component/s: None
-
None
-
ALL
Hi,
We recently upgraded our sharded cluster from MongoDB 3.0.9 to 3.2.4.
After upgrade we began to receive the following logs:
tail -10000 /var/log/mongodb/mongos.log | grep -v connection | grep -v Successfully | grep 'could not' 2016-04-04T16:02:42.780+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16486||5667717d46b7ddcd61ef5459, wanted : 2|16492||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T16:02:54.490+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16486||5667717d46b7ddcd61ef5459, wanted : 2|16492||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T16:04:52.905+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16486||5667717d46b7ddcd61ef5459, wanted : 2|16495||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T16:05:18.003+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6217||5320f6b56789f4d11460c4ac, wanted : 11296|6223||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:05:20.129+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6217||5320f6b56789f4d11460c4ac, wanted : 11296|6226||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:06:19.624+0200 W SHARDING [conn127] splitChunk cmd { splitChunk: "stats.ad_bidder_zone_domain_daily", keyPattern: { _id: 1.0 }, min: { _id: { d: 20160404, bri: 62009, z: 116741, do: "jadethestone.deviantart.com" } }, max: { _id: { d: 20160404, bri: 70329, z: 302009, do: "www.flashscore.com" } }, from: "sfx4", splitKeys: [ { _id: { d: 20160404, bri: 63377, z: 9631, do: "guclassic.forumfree.it" } }, { _id: { d: 20160404, bri: 70329, z: 27653, do: "plateforme.francebillet.com" } } ], configdb: "csReplSet/172.16.18.2:27025,172.16.18.3:27025,172.16.18.4:27025", shardVersion: [ Timestamp 11296000|6217, ObjectId('5320f6b56789f4d11460c4ac') ], epoch: ObjectId('5320f6b56789f4d11460c4ac') } failed :: caused by :: UnknownError: could not acquire collection lock for stats.ad_bidder_zone_domain_daily to split chunk [{ _id: { d: 20160404, bri: 62009, z: 116741, do: "jadethestone.deviantart.com" } },{ _id: { d: 20160404, bri: 70329, z: 302009, do: "www.flashscore.com" } }) :: caused by :: LockBusy: timed out waiting for stats.ad_bidder_zone_domain_daily 2016-04-04T16:06:19.655+0200 W SHARDING [conn140] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6217||5320f6b56789f4d11460c4ac, wanted : 11296|6226||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:11:55.890+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6229||5320f6b56789f4d11460c4ac, wanted : 11296|6232||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:12:19.175+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6229||5320f6b56789f4d11460c4ac, wanted : 11296|6238||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:12:19.335+0200 W SHARDING [conn140] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6229||5320f6b56789f4d11460c4ac, wanted : 11296|6238||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:12:20.508+0200 W SHARDING [conn127] splitChunk cmd { splitChunk: "stats.ad_bidder_zone_domain_daily", keyPattern: { _id: 1.0 }, min: { _id: { d: 20160404, bri: 28569, z: 502409, do: "www.ligadasnerdices.com" } }, max: { _id: { d: 20160404, bri: 28569, z: 592273, do: "factpile.wikia.com" } }, from: "sfx4", splitKeys: [ { _id: { d: 20160404, bri: 28569, z: 512393, do: "osdir.com" } }, { _id: { d: 20160404, bri: 28569, z: 592113, do: "themagisterium.wikia.com" } } ], configdb: "csReplSet/172.16.18.2:27025,172.16.18.3:27025,172.16.18.4:27025", shardVersion: [ Timestamp 11296000|6229, ObjectId('5320f6b56789f4d11460c4ac') ], epoch: ObjectId('5320f6b56789f4d11460c4ac') } failed :: caused by :: UnknownError: could not acquire collection lock for stats.ad_bidder_zone_domain_daily to split chunk [{ _id: { d: 20160404, bri: 28569, z: 502409, do: "www.ligadasnerdices.com" } },{ _id: { d: 20160404, bri: 28569, z: 592273, do: "factpile.wikia.com" } }) :: caused by :: LockBusy: timed out waiting for stats.ad_bidder_zone_domain_daily 2016-04-04T16:12:20.531+0200 W SHARDING [conn140] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6229||5320f6b56789f4d11460c4ac, wanted : 11296|6238||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:21:23.233+0200 W SHARDING [conn2132] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6265||5320f6b56789f4d11460c4ac, wanted : 11296|6271||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:21:30.407+0200 W SHARDING [conn2132] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6265||5320f6b56789f4d11460c4ac, wanted : 11296|6271||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:21:30.508+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6265||5320f6b56789f4d11460c4ac, wanted : 11296|6271||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:21:44.999+0200 W SHARDING [conn112] could not autosplit collection stats.ad_bidder_zone_geo_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_geo_daily, received : 11647|11907||5320f5616789f4d11460c48f, wanted : 11647|11910||5320f5616789f4d11460c48f, recv ) 2016-04-04T16:22:04.153+0200 W SHARDING [conn2132] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6265||5320f6b56789f4d11460c4ac, wanted : 11296|6274||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:22:04.194+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6265||5320f6b56789f4d11460c4ac, wanted : 11296|6274||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:25:34.504+0200 W SHARDING [conn127] splitChunk cmd { splitChunk: "stats.ad_bidder_zone_domain_daily", keyPattern: { _id: 1.0 }, min: { _id: { d: 20160404, b: 62170713, bri: 13557, z: 521009, do: "pestleanalysis.com" } }, max: { _id: { d: 20160404, b: 64154633, bri: 56761, z: 547913, do: "sportgamesarena.com" } }, from: "sfx4", splitKeys: [ { _id: { d: 20160404, b: 62439569, bri: 19945, z: 502401, do: "blogdopaulinho.com.br" } }, { _id: { d: 20160404, b: 64146721, bri: 12461, z: 596561, do: "courierpostonline.com" } } ], configdb: "csReplSet/172.16.18.2:27025,172.16.18.3:27025,172.16.18.4:27025", shardVersion: [ Timestamp 11296000|6277, ObjectId('5320f6b56789f4d11460c4ac') ], epoch: ObjectId('5320f6b56789f4d11460c4ac') } failed :: caused by :: UnknownError: could not acquire collection lock for stats.ad_bidder_zone_domain_daily to split chunk [{ _id: { d: 20160404, b: 62170713, bri: 13557, z: 521009, do: "pestleanalysis.com" } },{ _id: { d: 20160404, b: 64154633, bri: 56761, z: 547913, do: "sportgamesarena.com" } }) :: caused by :: LockBusy: timed out waiting for stats.ad_bidder_zone_domain_daily 2016-04-04T16:29:48.031+0200 W SHARDING [conn157] could not autosplit collection stats.ad_bidder_zone_hourly :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_hourly, received : 9742|4708||5320f5e96789f4d11460c4a0, wanted : 9742|4711||5320f5e96789f4d11460c4a0, recv ) 2016-04-04T16:30:19.298+0200 W SHARDING [conn157] could not autosplit collection stats.ad_bidder_zone_hourly :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_hourly, received : 9742|4708||5320f5e96789f4d11460c4a0, wanted : 9742|4711||5320f5e96789f4d11460c4a0, recv ) 2016-04-04T16:44:54.682+0200 W SHARDING [conn112] could not autosplit collection stats.ad_bidder_zone_geo_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_geo_daily, received : 11647|11910||5320f5616789f4d11460c48f, wanted : 11647|11913||5320f5616789f4d11460c48f, recv ) 2016-04-04T16:46:24.526+0200 W SHARDING [conn127] could not autosplit collection stats.ad_bidder_zone_domain_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_domain_daily, received : 11296|6280||5320f6b56789f4d11460c4ac, wanted : 11296|6283||5320f6b56789f4d11460c4ac, recv ) 2016-04-04T16:50:42.542+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16498||5667717d46b7ddcd61ef5459, wanted : 2|16504||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T16:51:09.584+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16498||5667717d46b7ddcd61ef5459, wanted : 2|16504||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T16:59:26.821+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16498||5667717d46b7ddcd61ef5459, wanted : 2|16504||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T17:06:56.056+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16498||5667717d46b7ddcd61ef5459, wanted : 2|16504||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T17:14:16.222+0200 W SHARDING [conn110] could not autosplit collection stats.bidder_zone_bidprice_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.bidder_zone_bidprice_daily, received : 2|16498||5667717d46b7ddcd61ef5459, wanted : 2|16504||5667717d46b7ddcd61ef5459, recv ) 2016-04-04T17:21:00.166+0200 W SHARDING [conn2888] could not autosplit collection stats.ad_bidder_zone_geo_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_geo_daily, received : 11647|11913||5320f5616789f4d11460c48f, wanted : 11647|11916||5320f5616789f4d11460c48f, recv ) 2016-04-04T17:52:00.622+0200 W SHARDING [conn3351] could not autosplit collection stats.ad_bidder_zone_os_device_browser_daily :: caused by :: 9996 stale config in runCommand ( ns : stats.ad_bidder_zone_os_device_browser_daily, received : 2|10805||568b09697d1dfc98b73bb0a9, wanted : 2|10808||568b09697d1dfc98b73bb0a9, recv )
Apparently the cluster is still splitting chunks on those collections:
2016-04-04T16:25:35.050+0200 I SHARDING [conn140] autosplitted stats.ad_bidder_zone_domain_daily shard: ns: stats.ad_bidder_zone_domain_daily, shard: sfx4, lastmod: 11296|5961||5320f6b56789f4d11460c4ac, min: { _id: { d: 20160404, b: 62170713, bri: 13557, z: 521009, do: "pestleanalysis.com" } }, max: { _id: { d: 20160404, b: 64154633, bri: 56761, z: 547913, do: "sportgamesarena.com" } } into 3 (splitThreshold 67108864) 2016-04-04T16:30:40.460+0200 I SHARDING [conn115] autosplitted stats.ad_bidder_zone_os_device_browser_daily shard: ns: stats.ad_bidder_zone_os_device_browser_daily, shard: sfx3, lastmod: 2|10776||568b09697d1dfc98b73bb0a9, min: { _id: { d: 20160404, b: 78626265, bri: 1725, z: 455201, os: 11, dv: 3, bw: 7 } }, max: { _id: { d: 20160404, bri: 1, z: 441993, os: 1, dv: 5, bw: 6 } } into 3 (splitThreshold 67108864) 2016-04-04T16:56:31.615+0200 I SHARDING [conn115] autosplitted stats.ad_bidder_zone_os_device_browser_daily shard: ns: stats.ad_bidder_zone_os_device_browser_daily, shard: sfx3, lastmod: 2|10753||568b09697d1dfc98b73bb0a9, min: { _id: { d: 20160404, b: 64563537, bri: 1725, z: 259297, os: 11, dv: 3, bw: 2 } }, max: { _id: { d: 20160404, b: 68127697, bri: 1725, z: 232689, os: 11, dv: 3, bw: 2 } } into 3 (splitThreshold 67108864) 2016-04-04T16:57:09.535+0200 I SHARDING [conn157] autosplitted stats.ad_bidder_zone_hourly shard: ns: stats.ad_bidder_zone_hourly, shard: sfx3, lastmod: 9742|4709||5320f5e96789f4d11460c4a0, min: { _id: { d: 20160404, h: 15, bri: 70321, z: 162301 } }, max: { _id: { d: 20160404, h: 16, bri: 1727, z: 158997 } } into 3 (splitThreshold 67108864) 2016-04-04T17:02:25.257+0200 I SHARDING [conn115] autosplitted stats.ad_bidder_zone_os_device_browser_daily shard: ns: stats.ad_bidder_zone_os_device_browser_daily, shard: sfx3, lastmod: 2|10756||568b09697d1dfc98b73bb0a9, min: { _id: { d: 20160404, b: 71619265, bri: 19945, z: 592265, os: 1, dv: 6, bw: 1 } }, max: { _id: { d: 20160404, b: 72544481, bri: 14002, z: 223261, os: 5, dv: 3, bw: 2 } } into 3 (splitThreshold 67108864) 2016-04-04T17:14:59.445+0200 I SHARDING [conn110] autosplitted stats.bidder_zone_bidprice_daily shard: ns: stats.bidder_zone_bidprice_daily, shard: sfx3, lastmod: 2|16380||5667717d46b7ddcd61ef5459, min: { _id: { d: 20160331, bri: 12461, z: 599041, bp: 5978, bpa: 5978 } }, max: { _id: { d: 20160331, bri: 12461, z: 630385, bp: 3571, bpa: 3571 } } into 3 (splitThreshold 67108864) 2016-04-04T17:36:51.700+0200 I SHARDING [conn157] autosplitted stats.ad_bidder_zone_hourly shard: ns: stats.ad_bidder_zone_hourly, shard: sfx3, lastmod: 9742|4711||5320f5e96789f4d11460c4a0, min: { _id: { d: 20160404, h: 16, z: 524049 } }, max: { _id: { d: 20160404, h: 18, z: 571601 } } into 3 (splitThreshold 67108864) 2016-04-04T17:52:00.632+0200 I SHARDING [conn115] autosplitted stats.ad_bidder_zone_os_device_browser_daily shard: ns: stats.ad_bidder_zone_os_device_browser_daily, shard: sfx3, lastmod: 2|10758||568b09697d1dfc98b73bb0a9, min: { _id: { d: 20160404, b: 72573769, bri: 65457, z: 629049, os: 11, dv: 3, bw: 7 } }, max: { _id: { d: 20160404, b: 74390217, bri: 1725, z: 302025, os: 3, dv: 5, bw: 6 } } into 3 (splitThreshold 67108864) 2016-04-04T17:55:53.908+0200 I SHARDING [conn115] autosplitted stats.ad_bidder_zone_os_device_browser_daily shard: ns: stats.ad_bidder_zone_os_device_browser_daily, shard: sfx3, lastmod: 2|10767||568b09697d1dfc98b73bb0a9, min: { _id: { d: 20160404, b: 77959857, bri: 12461, z: 577777, os: 11, dv: 3, bw: 1 } }, max: { _id: { d: 20160404, b: 78267233, bri: 13882, z: 247809, os: 1, dv: 6, bw: 6 } } into 3 (splitThreshold 67108864)
Here are the dbhash of our 3 config servers:
csReplSet:SECONDARY> db.runCommand("dbhash") { "host" : "memc2.stickyadstv.com:27025", "collections" : { "changelog" : "3cbbaf9bb750e02cb757d04ac69c5240", "chunks" : "cf8efea3dcebb9cff0c6546bd60cfcb7", "collections" : "1a2f565119de536df8279924a76f90b5", "databases" : "50c593254f13a5e2038f99f7507fb2b0", "lockpings" : "cdc6f656a39b035831ae5313c1af22fb", "locks" : "d6a8ff43a5231157b72f2e56f831b3a2", "mongos" : "8a166896fed3a2b03900ba338d1edc33", "settings" : "78c94b9070ea57e631d947d88487dd72", "shards" : "f53143e7cf8d1bc6e8fccf532542c4eb", "tags" : "d41d8cd98f00b204e9800998ecf8427e", "version" : "e33c5713912909f8ace2ac7679e7fc5b" }, "md5" : "8f29423d646cd78402747a535f97b102", "timeMillis" : 376, "fromCache" : [ "config.collections", "config.databases", "config.settings", "config.shards", "config.tags", "config.version" ], "ok" : 1 } csReplSet:SECONDARY> csReplSet:SECONDARY> db.runCommand("dbhash") { "host" : "memc3.stickyadstv.com:27025", "collections" : { "changelog" : "3cbbaf9bb750e02cb757d04ac69c5240", "chunks" : "cf8efea3dcebb9cff0c6546bd60cfcb7", "collections" : "1a2f565119de536df8279924a76f90b5", "databases" : "50c593254f13a5e2038f99f7507fb2b0", "lockpings" : "cdc6f656a39b035831ae5313c1af22fb", "locks" : "d6a8ff43a5231157b72f2e56f831b3a2", "mongos" : "8a166896fed3a2b03900ba338d1edc33", "settings" : "78c94b9070ea57e631d947d88487dd72", "shards" : "f53143e7cf8d1bc6e8fccf532542c4eb", "tags" : "d41d8cd98f00b204e9800998ecf8427e", "version" : "e33c5713912909f8ace2ac7679e7fc5b" }, "md5" : "8f29423d646cd78402747a535f97b102", "timeMillis" : 469, "fromCache" : [ "config.collections", "config.databases", "config.settings", "config.shards", "config.tags", "config.version" ], "ok" : 1 } csReplSet:SECONDARY> csReplSet:PRIMARY> db.runCommand("dbhash") { "host" : "memc4.stickyadstv.com:27025", "collections" : { "changelog" : "3cbbaf9bb750e02cb757d04ac69c5240", "chunks" : "cf8efea3dcebb9cff0c6546bd60cfcb7", "collections" : "1a2f565119de536df8279924a76f90b5", "databases" : "50c593254f13a5e2038f99f7507fb2b0", "lockpings" : "cdc6f656a39b035831ae5313c1af22fb", "locks" : "d6a8ff43a5231157b72f2e56f831b3a2", "mongos" : "1b4899b626179bf4ded4f63ae270e710", "settings" : "78c94b9070ea57e631d947d88487dd72", "shards" : "f53143e7cf8d1bc6e8fccf532542c4eb", "tags" : "d41d8cd98f00b204e9800998ecf8427e", "version" : "e33c5713912909f8ace2ac7679e7fc5b" }, "md5" : "815a5adb064991f9781aea414e691a4b", "timeMillis" : 463, "fromCache" : [ "config.collections", "config.databases", "config.settings", "config.shards", "config.tags", "config.version" ], "ok" : 1 } csReplSet:PRIMARY>
MongoS Checksum differs on the Primary, but i don't think the issue is coming from this difference.
- We tried to run flushRouterconfig on the 2 mongoS instances we use.
- Then we stopped all write on the cluster and restarted every single component of the architecture (RSM, ConfigServs, MongoS servers)
After those steps we're still facing the same issue.
Do you know what's happening and how could we solve this please ?
Regards.
Anthony.
- duplicates
-
SERVER-22462 Autosplitting failure caused by stale config in runCommand
- Closed