- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Catalog and Routing
- Fully Compatible
- ALL
- 5
The block_chunk_migrations_without_hashed_shard_key_index test migrates chunks between shards and finishes with the following sequence:
- wait for the balancer round to finish
- wait until a chunk shows up on the recipient shard
- check that the shardVersion on the CSRS and on the donor shard are equal
The check might fail due to a race condition if a CSRS stepdown happens during the balancer round.
In that case the following happens:
- CSRS stepdown -> balancer round ends
- donor commands the recipient to update the local catalog (new chunks show up)
- donor commits the change on CSRS
- CSRS updates shardVersion
- check the shardVersion equivalence on CSRS and the donor shard
- CSRS responds to the update -> donor has the updated shardVersion
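In this scenario the check runs at exactly the wrong moment: the CSRS has already updated its shardVersion, but the donor has not yet received the response, so the two versions differ and the test fails.
Waiting for the balancer round to finish was intended to eliminate this race condition, but a CSRS stepdown ends the round early while the migration is still in progress, so the race can still occur.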
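The interleaving above can be illustrated with a toy model. This is only a sketch: the versions are plain integers rather than real shardVersion values, and `runTimeline` and the step names are hypothetical, written for illustration and not taken from the test.

```javascript
// Toy model of the race; not the real test code.
// Versions are plain integers standing in for shardVersion.
function runTimeline(checkAfterStep) {
    const state = {csrs: 1, donor: 1};
    const steps = [
        ["CSRS stepdown, balancer round ends", () => {}],
        ["recipient updates its local catalog", () => {}],
        ["donor sends the commit to the CSRS", () => {}],
        ["CSRS updates shardVersion", () => { state.csrs += 1; }],
        ["donor receives the commit response", () => { state.donor = state.csrs; }],
    ];
    let checkResult = null;
    steps.forEach(([name, apply], i) => {
        apply();
        // The equality check runs right after the step with this index.
        if (i === checkAfterStep) {
            checkResult = state.csrs === state.donor;
        }
    });
    return checkResult;
}

console.log(runTimeline(3)); // false: check lands between the CSRS update and the donor update
console.log(runTimeline(4)); // true: check lands after the donor has caught up
```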
Recommended solution:
After the balancer is stopped and before the shardVersion check, wait for the migrations to finish with the internal _shardsvrJoinMigrations command.
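A minimal sketch of that ordering follows. It assumes the command document is `{_shardsvrJoinMigrations: 1}` and that it is sent to the donor shard's primary through an `adminCommand()` method, as a mongo shell connection provides. The helper `joinMigrationsThenCheck` and the `checkShardVersions` callback are hypothetical names used only for illustration.

```javascript
// Sketch only: `donorPrimary` is any object exposing adminCommand(),
// for example a mongo shell connection to the donor shard's primary.
// Assumption: the command document is {_shardsvrJoinMigrations: 1}.
function joinMigrationsThenCheck(donorPrimary, checkShardVersions) {
    // Wait for the migrations to finish before comparing versions.
    const res = donorPrimary.adminCommand({_shardsvrJoinMigrations: 1});
    if (!res || res.ok !== 1) {
        throw new Error("_shardsvrJoinMigrations failed: " + JSON.stringify(res));
    }
    // Only now compare the shardVersion on the CSRS and on the donor.
    return checkShardVersions();
}
```

In the test this would be called after the balancer has been stopped, with the existing shardVersion comparison passed in as the callback.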