  Core Server / SERVER-70156

Allow updateLookup to succeed for updates which modify the shard key

    • Query Execution

      When a document's shard key is modified in such a way that the document remains on the same shard[1], the corresponding oplog entry will be an update with the documentKey of the pre-image, rather than the post-image. This is necessary because if we were to record the post-image documentKey instead, there would be no robust way[2] to identify which document was actually modified.

      However, one consequence of this behaviour is that if the user has requested updateLookup, the lookup will use the pre-image documentKey and will therefore fail to find the document. In the worst case, if the chunk containing the original pre-image shard key is moved off the shard and a new document which matches the pre-image documentKey is inserted on the new shard, updateLookup will return an unrelated document while the original document still exists in the cluster.
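
      The failure mode can be illustrated with a toy in-memory model. This is only a sketch, not server code: each shard is a plain list of dicts, the shard key is assumed to be a single field named sk, and document_key and update_lookup are hypothetical helper names.

```python
# Toy model of the scenario; this is not MongoDB server code.
# Assumption: the collection is sharded on the single field "sk".
SHARD_KEY_FIELDS = ("sk",)


def document_key(doc):
    """Return the documentKey: the shard key fields plus _id."""
    key = {field: doc[field] for field in SHARD_KEY_FIELDS}
    key["_id"] = doc["_id"]
    return key


def update_lookup(docs, doc_key):
    """Return the first document matching every field of doc_key, or None."""
    for doc in docs:
        if all(doc.get(k) == v for k, v in doc_key.items()):
            return doc
    return None


shard_a = [{"_id": 1, "sk": 10, "value": "original"}]
shard_b = []

# The oplog entry for the update records the pre-image documentKey.
pre_image_key = document_key(shard_a[0])

# The shard key changes, but the document stays on shard_a.
shard_a[0]["sk"] = 11

# A lookup by the pre-image documentKey no longer finds the document.
missing = update_lookup(shard_a + shard_b, pre_image_key)

# Worst case: the chunk for sk=10 now lives on shard_b, and an unrelated
# document with the same _id and shard key is inserted there.
shard_b.append({"_id": 1, "sk": 10, "value": "unrelated"})
wrong = update_lookup(shard_a + shard_b, pre_image_key)
```

      In this model, missing is None, and wrong is the unrelated document on shard_b even though the original document is still present on shard_a.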

      To address this, we could record the post-image documentKey in the oplog as well as the pre-image key, or we could apply the update to the documentKey to produce the post-image key before performing the updateLookup. However, neither approach would robustly fix the issue: at read-time, there is no guarantee that this particular update event is the only modification that has been made to the document's shard key between the time of the update and the current time.
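
      The second option, and why it is still not robust, can be sketched in the same toy style. The names post_image_key and find_by_key are hypothetical, and updated_fields stands in for the set of fields changed by the update (similar in spirit to updateDescription.updatedFields in a change event); none of this is the server's actual implementation.

```python
def post_image_key(pre_image_key, updated_fields):
    """Overlay the updated fields onto the pre-image documentKey.

    Only fields already present in the key (shard key fields and _id)
    are taken from the update; all other updated fields are ignored.
    """
    return {k: updated_fields.get(k, v) for k, v in pre_image_key.items()}


def find_by_key(docs, doc_key):
    """Return the first document matching every field of doc_key, or None."""
    for doc in docs:
        if all(doc.get(k) == v for k, v in doc_key.items()):
            return doc
    return None


docs = [{"_id": 1, "sk": 10, "value": "a"}]
pre_key = {"sk": 10, "_id": 1}

# First update: the shard key moves from 10 to 11 on the same shard.
first_update = {"sk": 11, "value": "b"}
docs[0].update(first_update)
derived_key = post_image_key(pre_key, first_update)

# If the lookup runs before any further change, the derived key works.
found_immediately = find_by_key(docs, derived_key)

# But a second shard key update may land before the lookup is performed,
# in which case the derived key is already stale.
docs[0]["sk"] = 12
found_later = find_by_key(docs, derived_key)
```

      Here derived_key is {"sk": 11, "_id": 1}. It locates the document only until the next shard key modification, after which found_later is None.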


      [1] This issue does not manifest in cases where the shard-key modification causes the document to move from one shard to another, since that results in a delete on the original shard and an insert on the new shard.

      [2] Technically, since this only occurs when the document stays on the same shard after its shard key is modified, it would be possible to identify the pre-image by _id alone - at least, as long as the document continues to remain on that shard. However, since the documentKey is considered the unique identifier of a particular document in all other cases, it is not reasonable to expect users to know that in this obscure scenario they should ignore the shard keys and consider only the _id.
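
      The _id-only identification described in the second footnote, and its limit, can be sketched as follows. This is an illustrative toy model with a hypothetical helper lookup_by_id; a shard is again just a list of dicts and the shard key field is assumed to be sk.

```python
def lookup_by_id(shard, doc_key):
    """Match on _id only, ignoring the shard key fields of the documentKey."""
    for doc in shard:
        if doc["_id"] == doc_key["_id"]:
            return doc
    return None


shard = [{"_id": 1, "sk": 10}]
pre_key = {"sk": 10, "_id": 1}

# The shard key changes while the document stays on the same shard.
shard[0]["sk"] = 11

# Ignoring the stale shard key in pre_key, _id alone still finds it.
same_shard_hit = lookup_by_id(shard, pre_key)

# Once the document leaves this shard, the _id-only lookup here fails too.
migrated = shard.pop()
after_move = lookup_by_id(shard, pre_key)
```

      The lookup succeeds only while the document remains on the shard where the update was recorded, which is the caveat the footnote states.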

            Assignee: backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter: bernard.gorman@mongodb.com Bernard Gorman
            Votes: 1
            Watchers: 14
