Replacement-style updates on a sharded cluster are targeted by mongoS on the basis of the replacement document's shard key values rather than the query component, with some relaxed constraints for updates whose queries contain an exact match on _id. Consider a collection sharded on _id, or _id plus any number of additional fields. Then, replacement-style updates of the form shown below are legal, despite the fact that the replacement document does not contain the entire shard key:
db.collection.update({_id: x}, {a: y, b: z})
This is because mongoD always automatically propagates the _id field of the existing document into the replacement document. The operation above will therefore succeed, assuming that the replacement document contains all other fields in the shard key and their values match those in the existing document.
Similarly, it should also be legal to perform the above operation with upsert:true, since mongoD will extract the _id from the query component when generating the new document to upsert.
However, there are a few shortcomings with the current targeting logic:
- The non-upsert replacement will succeed, but because the replacement document does not contain the entire shard key, the operation will be scattered to all shards that own chunks for the collection. This is unnecessary; the update should target a single shard.
- Because updates which target multiple endpoints are dispatched with ChunkVersion::IGNORED(), scattering will update any orphaned documents present in the cluster, as well as the cloned documents that are temporarily present on both the source and destination shards while a chunk migration is in flight. This leads to the unintuitive situation where a multi:false update with an exact match on _id returns nMatched and nModified greater than 1.
- Attempting this operation with upsert:true will fail, because the targeting logic requires an exact shard key match but only considers the replacement document. Since the _id is available in the request, this upsert should be permitted.
Finally, multi:true operations of the form shown above, or which target a range of _ids, will also succeed but must again scatter to all shards. These should target only the relevant subset of shards.
Note: after further discussion, it was determined that this last improvement (targeting multi:true updates at only the relevant subset of shards) is not feasible at present. Targeting more than one shard obliges us to target all shards with unversioned updates.
To address these shortcomings, we should merge the replacement document into the query component as a set of additional constraints, and target on the basis of the resulting composite query.
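As a rough illustration of the proposal, the sketch below treats the replacement document's fields as extra equality constraints on top of the query and then tries to read a complete shard key out of the composite. The function names (isExactMatch, extractTargetingKey) and the plain-object shard key pattern are hypothetical and are not taken from the mongoS code. The sketch assumes top-level shard key fields only, and it does not detect a conflict between the query and the replacement on the same field; the replacement value simply wins.

```javascript
// Illustrative sketch only; these names are hypothetical, not mongoS internals.

// True if a query predicate is a plain equality (no operators such as $gt or $in).
function isExactMatch(value) {
  if (value === null || typeof value !== "object") return true;
  return !Object.keys(value).some((k) => k.startsWith("$"));
}

// Merge the replacement document into the query as extra equality constraints,
// then try to extract a complete shard key from the composite.
// Returns the shard key object, or null if any shard key field is not pinned
// to a single value (in which case the update cannot target one shard).
function extractTargetingKey(shardKeyPattern, query, replacement) {
  const has = (obj, field) => Object.prototype.hasOwnProperty.call(obj, field);
  const key = {};
  for (const field of Object.keys(shardKeyPattern)) {
    if (has(replacement, field)) {
      // Simplification: no check that the query agrees with this value.
      key[field] = replacement[field];
    } else if (has(query, field) && isExactMatch(query[field])) {
      key[field] = query[field];
    } else {
      return null;
    }
  }
  return key;
}
```

With a shard key pattern of {_id: 1, a: 1}, the query {_id: x} and the replacement {a: y, b: z} together yield a full key, so both the plain replacement and the upsert could be routed to a single shard. A range predicate on _id leaves the key incomplete and returns null.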
- is duplicated by: SERVER-13010 Sharded upsert incorrectly errors if _id shard key not in replace spec (Closed)
- is related to: SERVER-30970 Don't allow single-updates that aren't targetted on the shard key (Backlog)