- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 3.6.7
- Component/s: Sharding
- ALL
- Sharding 2020-04-20
It seems the dataSize and collStats commands rely on the setShardVersion mechanism, triggered by CRUD traffic coming through the same mongos node, to refresh the routing metadata. Their results will be incorrect until such traffic occurs or a flushRouterConfig command is run.
This caught me out recently: a mongos node used exclusively by the DBAs didn't reflect changes made by the app team through the app servers' mongos nodes. All the normal queries and updates went through those app mongos nodes, whereas I was only doing diagnosis (no CRUD ops). For half a day I investigated an imaginary data imbalance that the dataSize and collStats commands were showing me, only to find the issue was gone as soon as I ran flushRouterConfig. I think doing a single CRUD op on the collection in question also resolves it.
The production event preceding the issue was dropping a big sharded collection and recreating it with a new shard key, but presumably mongos nodes will also be stale after other metadata changes such as chunk moves.
The issue was encountered in 3.6.7, but as far as I can see the 4.0 code for these commands (and maybe all non-CRUD commands?) uses the catalog cache class in the same way, so I suspect it still affects current release versions too.
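For anyone hitting the same thing on a diagnosis-only mongos, the workaround described above can be sketched roughly as follows. This is only an illustration, not part of the server code: the helper name `coll_stats_after_flush` is made up here, and `client` is assumed to be a PyMongo-style `MongoClient` connected to the mongos (anything exposing `client.admin.command(...)` and `client[db].command(...)` would do). The reading of the `size` and `shards` fields assumes the usual collStats output for a sharded collection.

```python
from typing import Any, Dict


def coll_stats_after_flush(client: Any, db_name: str, coll_name: str) -> Dict[str, Any]:
    """Flush the mongos routing cache, then read collStats and dataSize.

    `client` is assumed to behave like pymongo.MongoClient connected to a mongos.
    """
    # flushRouterConfig is an admin command; it makes this mongos reload its
    # cached routing metadata instead of waiting for CRUD traffic to do so.
    client.admin.command("flushRouterConfig")

    db = client[db_name]
    stats = db.command("collStats", coll_name)
    # dataSize takes the full namespace, "<db>.<collection>".
    data_size = db.command("dataSize", f"{db_name}.{coll_name}")

    return {
        "collStats_size": stats.get("size"),
        "dataSize_size": data_size.get("size"),
        # Per-shard sizes are what make an apparent imbalance visible.
        "per_shard_size": {
            shard: s.get("size") for shard, s in stats.get("shards", {}).items()
        },
    }
```

Running the same two commands before the flush and comparing the per-shard sizes is a quick way to tell whether an apparent imbalance is real or just stale metadata on that mongos.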
- duplicates: SERVER-47436 Make shards validate shardKey in dataSize command (Closed)