- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Catalog and Routing
- Fully Compatible
- ALL
- CAR Team 2024-04-01
SERVER-85441 added a new policy to the balancer to move unsharded collections using moveCollection. Internally, moveCollection uses the resharding infrastructure to perform an online movement of data. This means that all suites that use the balancer now call resharding at random, including suites that automatically run the CheckRoutingTableConsistency hook, which checks that every chunk has a matching collection in config.collections.
Resharding and the hook are incompatible because the commit phase of resharding might temporarily leave chunks without a collection, so the following interleaving might happen:
- The balancer issues a moveCollection
- The test finishes, starting the CheckRoutingTableConsistency hook
- CheckRoutingTableConsistency might check the sharding catalog before the commit phase of resharding finishes
This causes a false-positive metadata inconsistency failure. There is an initiative to use CheckMetadataConsistency instead (SERVER-76646), which serializes with DDL operations so that the check runs in a steady state. However, that will require some work, and until it is done, this false positive is going to cost developers time investigating failures in their patches. We should add a temporary workaround: wait for all resharding operations to finish before running the CheckRoutingTableConsistency checks.
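The proposed workaround could be sketched roughly as follows. This is a minimal illustration, not the actual fix: `waitForReshardingToFinish`, `countInProgress`, and `sleepFn` are hypothetical names, and how in-progress resharding operations are counted (for example via `$currentOp` or the resharding coordinator state on the config server) is left as an assumption behind the injected callback.

```javascript
// Hypothetical helper: poll until no resharding operation is in progress,
// then return the time spent waiting. Both callbacks are injected so the
// sketch does not depend on any particular shell or driver API.
//   countInProgress: () => number of resharding operations still running
//   sleepFn:         (ms) => blocks for roughly `ms` milliseconds
function waitForReshardingToFinish(countInProgress, sleepFn, opts = {}) {
    const {timeoutMs = 60000, intervalMs = 100} = opts;
    let waitedMs = 0;
    while (true) {
        const remaining = countInProgress();
        if (remaining === 0) {
            // Steady state reached: it is now safe to run the consistency checks.
            return waitedMs;
        }
        if (waitedMs >= timeoutMs) {
            throw new Error(
                `Timed out after ${waitedMs} ms with ${remaining} resharding operation(s) still running`);
        }
        sleepFn(intervalMs);
        waitedMs += intervalMs;
    }
}

// Simulated usage: the fake counter reports one running operation for the
// first two polls, mimicking a moveCollection that is still committing.
let polls = 0;
const waited = waitForReshardingToFinish(
    () => (++polls <= 2 ? 1 : 0),
    (ms) => {},  // no-op sleep, since nothing real is being waited on here
    {timeoutMs: 1000, intervalMs: 100});
```

In a real hook, the routing table checks would run only after this call returns. A bounded timeout keeps a stuck resharding operation from hanging the suite indefinitely.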
- is caused by
  - SERVER-65035 Implement jstests hook checking chunks consistency (Closed)
- related to
  - SERVER-88620 [sharding_auto_bootstrap] Investigate why $currentOp causes a segfault on specific resharding js tests (Closed)
  - SERVER-85441 Extend `balancerShouldReturnRandomMigrations` failpoint to additionally move random unsharded collections (Closed)