- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- ALL
- v5.0
- Sharding EMEA 2021-06-14
- 23
Requests that DDL coordinators send to any shard (including the coordinator's own shard) via sendAuthenticatedCommandToShards end up in the AsyncRequestsSender (ARS), where ShardRegistry::getShardNoReload is called. That lookup gives no guarantee of returning up-to-date information from the registry.
The objective of this ticket is to review all usages of sendAuthenticatedCommandToShards in DDL coordinators and ensure that the shard registry is always initialized before any such call.
Broadcasts are not affected: before a broadcast reaches the ARS, getAllShardIds is always called, and it internally triggers a reload if needed.
The problem is definitely present in dropCollection and dropDatabase, because the coordinator tries to contact the primary shard without any guarantee that the ShardRegistry is initialized.
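The difference between the two lookup paths can be illustrated with a toy model. This is only a sketch: `ToyShardRegistry` and its members are hypothetical stand-ins, not the real `ShardRegistry` class, and the only behavior taken from this ticket is that the "NoReload" lookup reads whatever is cached while `getAllShardIds` reloads if needed.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy registry; NOT the real MongoDB ShardRegistry.
class ToyShardRegistry {
public:
    // `configData` stands in for the shard list a reload would fetch.
    explicit ToyShardRegistry(std::map<std::string, std::string> configData)
        : _source(std::move(configData)) {}

    // "NoReload" style lookup: consults only the cache, never refreshes it.
    std::optional<std::string> getShardNoReload(const std::string& shardId) const {
        auto it = _cache.find(shardId);
        if (it == _cache.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Broadcast style lookup: reloads first if the cache was never populated.
    std::vector<std::string> getAllShardIds() {
        if (!_initialized) {
            reload();
        }
        assert(_initialized);
        std::vector<std::string> ids;
        for (const auto& entry : _cache) {
            ids.push_back(entry.first);
        }
        return ids;
    }

private:
    void reload() {
        _cache = _source;
        _initialized = true;
    }

    std::map<std::string, std::string> _source;  // authoritative data
    std::map<std::string, std::string> _cache;   // what lookups can see
    bool _initialized = false;
};
```

In this model, a targeted lookup on a freshly constructed registry finds nothing, while the same lookup succeeds once `getAllShardIds` has run, which matches why broadcasts are unaffected.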
Some possible solutions:
- Move the getAllShardIds calls before contacting the primary shard.
- Reload the shard registry on DDL coordinator construction (maybe just when resuming a DDL from disk?).
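Both options can be sketched on a minimal model. Everything below is hypothetical (`SketchRegistry`, `sendToShard`, `dropCollectionSketch`, `CoordinatorSketch`); it does not reproduce the real coordinator code and only shows where the reload would sit relative to the targeted request.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical minimal registry: lookups never reload on their own.
struct SketchRegistry {
    std::set<std::string> configShards;  // what a reload would fetch
    std::set<std::string> cached;        // what lookups can see
    bool loaded = false;

    void reloadIfNeeded() {
        if (!loaded) {
            cached = configShards;
            loaded = true;
        }
    }

    bool knowsShardNoReload(const std::string& id) const {
        return cached.count(id) > 0;
    }
};

// Hypothetical targeted send: throws if the shard is missing from the cache.
void sendToShard(const SketchRegistry& reg, const std::string& shardId) {
    if (!reg.knowsShardNoReload(shardId)) {
        throw std::runtime_error("ShardNotFound: " + shardId);
    }
}

// Option 1: run the reloading lookup before contacting the primary shard.
void dropCollectionSketch(SketchRegistry& reg, const std::string& primaryShard) {
    reg.reloadIfNeeded();  // stands in for moving getAllShardIds earlier
    assert(reg.loaded);
    sendToShard(reg, primaryShard);
}

// Option 2: reload at construction, here only when resuming from disk.
class CoordinatorSketch {
public:
    CoordinatorSketch(SketchRegistry& reg, bool resumedFromDisk) : _reg(reg) {
        if (resumedFromDisk) {
            _reg.reloadIfNeeded();
        }
    }

    void contactPrimary(const std::string& primaryShard) {
        sendToShard(_reg, primaryShard);
    }

private:
    SketchRegistry& _reg;
};
```

Option 1 keeps the fix local to each affected operation, while option 2 covers every request a resumed coordinator sends, at the cost of a reload that some coordinators may not need.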
- causes: SERVER-66658 Shard registry might be accessed before initialization (Closed)
- is related to: SERVER-50206 Remove "NoReload" ShardRegistry lookup functions (Blocked)
- related to: SERVER-60916 CPS Restores failed with a snapshot with documents in reshardingOperation (Closed)
- related to: SERVER-61003 ReadConcernMajorityNotAvailableYet errors from ShardRegistry must be retried (Closed)