- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 4.4.6
- Component/s: None
- Fully Compatible
- ALL
- Sharding 2021-08-23
- (copied to CRM)
We have an actual production case in which the bootstrap DB was corrupted under obscure conditions, possibly including manual intervention. As a result, the "shards" DB had the correct shard name, but its list of servers pointed to a completely different database's shard rather than merely to non-existent servers.
So the RSM started contacting a completely unrelated cluster that had an actual shard running at the configured address. In theory the RSM should check that the shard name matches, but it does not. Our code has logic to compare the incoming shard name with the expected one, but this check is not enabled during initial bootstrap. This is the bug to fix.
How does this happen? In v4.4, the StreamableReplicaSetMonitor constructor creates the _sdamConfig with the seed nodes but no shard name.
Next, this config is used by the state machine: TopologyStateMachine::updateRSWithoutPrimary() reaches the set-name check while currentSetName is none, so nothing is compared and it proceeds with updating the topology description.
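To make the failure mode concrete, here is a minimal sketch of the set-name check described above. It is not the actual MongoDB code: the TopologyDescription struct, its fields, and the free function below are simplified, hypothetical stand-ins that only mirror the behavior described in this ticket (an unset expected name means the reported name is accepted without comparison).

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified stand-in for the topology state; not the real SDAM types.
struct TopologyDescription {
    std::optional<std::string> setName;  // expected replica set (shard) name, if known
    bool updated = false;                // set once a server response has been accepted
};

// Hypothetical model of the check in updateRSWithoutPrimary():
// a response is rejected only when an expected name is already known
// and differs from the name the server reports.
bool updateRSWithoutPrimary(TopologyDescription& topology,
                            const std::string& reportedSetName) {
    if (!topology.setName) {
        // No expected name was configured: adopt whatever the server reports.
        topology.setName = reportedSetName;
    } else if (*topology.setName != reportedSetName) {
        // Expected name is known and does not match: ignore this server.
        return false;
    }
    topology.updated = true;
    return true;
}
```

In this model, a topology created without a set name accepts a response from an unrelated shard, while a topology seeded with the expected shard name rejects it. That suggests one possible direction for the fix, assuming the constructor has the shard name available: pass it into the config alongside the seed nodes.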
- related to SERVER-59462 Investigate and fix in head if the RSM bootstrap from config server ignores the initial shard name (Closed)