If we change a document's shard key such that the document will have to change shards, we could end up with a duplicate key error on _id due to an orphaned version of that document existing on that shard. Other legitimate DuplicateKeyErrors could occur (for example, if there's a unique index on the shard key), in which case we'll throw an ordinary DuplicateKeyError. This ticket only addresses _id conflicts.
Consider the following scenario:
1) A document x is migrated from shard A to shard B. Suppose the RangeDeleter has not yet run, so the orphaned document x remains on shard A.
2) An update is issued to document x (residing on shard B) such that it requires moving that document back to shard A. The update operation is converted into a delete from shard B and an insert into shard A.
3) The insert operation into shard A fails with a duplicate key error on _id, because the orphaned version of x still exists on shard A.
We should make sure this case leads to an error message that's more meaningful to the user than DuplicateKeyError (something indicating that it's related to orphaned documents), and perhaps with a link to documentation.
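The scenario can be reproduced with a toy model. The sketch below is not server code: Shard, migrate, update_shard_key, OrphanConflictError, and the per-shard "owned" set (a stand-in for chunk ownership) are all hypothetical names introduced for illustration. It only shows how an _id conflict with an unowned document could be told apart from an ordinary DuplicateKeyError.

```python
class DuplicateKeyError(Exception):
    """Stand-in for the server's DuplicateKeyError."""


class OrphanConflictError(Exception):
    """Hypothetical, more descriptive error for _id conflicts with orphans."""


class Shard:
    def __init__(self, name):
        self.name = name
        self.docs = {}      # _id -> document physically stored on this shard
        self.owned = set()  # _ids this shard currently owns (assumed model)

    def insert(self, doc):
        # Mimics the unique _id index: any stored copy, orphan or not, conflicts.
        if doc["_id"] in self.docs:
            raise DuplicateKeyError(f"duplicate _id {doc['_id']} on shard {self.name}")
        self.docs[doc["_id"]] = doc
        self.owned.add(doc["_id"])


def migrate(doc_id, src, dst, run_range_deleter=False):
    """Copy a document to dst; the source copy becomes an orphan until deleted."""
    dst.docs[doc_id] = dict(src.docs[doc_id])
    dst.owned.add(doc_id)
    src.owned.discard(doc_id)
    if run_range_deleter:
        del src.docs[doc_id]


def update_shard_key(doc_id, new_key, src, dst):
    """Move a document from src to dst because its shard key changed."""
    updated = dict(src.docs[doc_id], shard_key=new_key)
    try:
        # Insert before delete so a failure leaves src untouched; this stands in
        # for the rollback a real transactional delete + insert would perform.
        dst.insert(updated)
    except DuplicateKeyError as exc:
        if doc_id not in dst.owned:
            raise OrphanConflictError(
                f"_id {doc_id} conflicts with an orphaned document on shard "
                f"{dst.name}; the orphan must be cleaned up before retrying"
            ) from exc
        raise  # a legitimate duplicate: surface the ordinary error
    del src.docs[doc_id]
    src.owned.discard(doc_id)


def reproduce_scenario():
    """Run steps 1-3 and return the resulting error message."""
    a, b = Shard("A"), Shard("B")
    a.insert({"_id": 1, "shard_key": 10})
    migrate(1, a, b)                    # step 1: the orphan stays on A
    try:
        update_shard_key(1, 5, b, a)    # step 2: the update moves x back to A
    except OrphanConflictError as exc:  # step 3: the insert into A fails
        return str(exc)
    return None
```

Calling reproduce_scenario() returns the orphan-specific message rather than a bare duplicate key error. If migrate is called with run_range_deleter=True, the same update succeeds.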
Is related to: SERVER-40815 Updating the shard key can conflict with in-progress migrations (Backlog)