Deletion of a migrated chunk may be delayed if there are cursors open for that range. The following race condition may ensue:
1) migrate chunk C1 from the FROM shard to the TO shard
2) FROM shard delays deletion of C1 because of open cursor X
3) start migrating chunk C2 from the FROM shard to the TO shard
4) cursor X from step 2 is closed
5) FROM shard's cleanOldData thread finally kicks in for C1
6) FROM shard sees the deletions on C1 as 'mods', while cloning C2
7) TO shard applies deletes thinking they belong to C2
8) C1 is gone (but data is recoverable using the moveChunk/ data directory)
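The sequence above can be replayed with a toy model. This is only a sketch under stated assumptions: shards are plain dicts keyed by an integer shard key, C1 and C2 are the made-up ranges [0, 10) and [10, 20), and `mods_log` is a hypothetical stand-in for whatever the FROM shard uses to track modifications during cloning. It is not the server's code.

```python
# Toy replay of the race. All names and ranges are hypothetical.
C1 = range(0, 10)    # chunk C1 covers shard keys [0, 10)
C2 = range(10, 20)   # chunk C2 covers shard keys [10, 20)

from_shard = {k: f"doc{k}" for k in range(0, 20)}
to_shard = {}

# Step 1: C1 is copied to TO. Step 2: FROM keeps its copy of C1
# for now because cursor X is still open on that range.
for k in C1:
    to_shard[k] = from_shard[k]

# Step 3: migration of C2 starts; FROM records mods made while cloning.
mods_log = []
for k in C2:
    to_shard[k] = from_shard[k]

# Steps 4-6: cursor X closes, the delayed cleanup of C1 runs on FROM,
# and its deletions are recorded as mods of the in-flight migration.
for k in C1:
    del from_shard[k]
    mods_log.append(("delete", k))

# Step 7: TO applies every delete it receives, without a range check.
for op, k in mods_log:
    if op == "delete":
        to_shard.pop(k, None)

# Step 8: C1 documents now exist on neither shard.
lost = [k for k in C1 if k not in from_shard and k not in to_shard]
print(lost)
```

In this model every key of C1 ends up in `lost`, while C2 arrives on the TO shard intact.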
We'll fix both sides: the FROM side, so that it does not propagate the deletions in step 6, and the TO side, so that it does not apply out-of-range deletions in step 7.
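A minimal sketch of the two guards described here, assuming integer shard keys and half-open chunk ranges [min, max). The function names `in_chunk`, `from_side_log_delete` and `to_side_apply_deletes` are hypothetical and chosen for illustration only; the actual patch may be structured differently.

```python
def in_chunk(key, chunk_min, chunk_max):
    """True if key falls in the half-open range [chunk_min, chunk_max)."""
    return chunk_min <= key < chunk_max


def from_side_log_delete(mods_log, key, chunk_min, chunk_max):
    """FROM side: record a delete only if it hits the chunk being migrated."""
    if in_chunk(key, chunk_min, chunk_max):
        mods_log.append(("delete", key))
        return True
    return False


def to_side_apply_deletes(store, mods_log, chunk_min, chunk_max):
    """TO side: apply in-range deletes, return the keys that were skipped."""
    skipped = []
    for op, key in mods_log:
        if op != "delete":
            continue
        if in_chunk(key, chunk_min, chunk_max):
            store.pop(key, None)
        else:
            skipped.append(key)
    return skipped


# C2 = [10, 20) is being migrated; the TO shard already owns C1 = [0, 10).
to_shard = {k: f"doc{k}" for k in range(0, 20)}
mods = []
from_side_log_delete(mods, 3, 10, 20)    # cleanup of C1: not propagated
from_side_log_delete(mods, 12, 10, 20)   # genuine delete inside C2
# Even if an out-of-range delete slips through, TO refuses to apply it.
skipped = to_side_apply_deletes(to_shard, mods + [("delete", 5)], 10, 20)
print(mods, skipped)
```

Either guard alone would stop the loss of C1 in this model; having both means a stray delete is neither sent nor applied.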
- is related to SERVER-13462 Ignore orphan documents in sharding_migrate_cursor1.js (Closed)