PM-1632 added the possibility to run an update without a shard key in sharded clusters with all options. As part of the project, SERVER-71133 added an optimization for findAndModify that tries to skip the two phase write protocol if all the chunks are owned by a single shard, e.g. when the collection has a single chunk.
However, the following scenario might happen:
- A transaction with snapshot read concern starts and performs a write to a collection at time T1. This effectively sets the atClusterTime of the entire transaction to T1.
- A moveChunk happens, changing the placement for collection2 at time T2.
- A findAndModify for collection2 is issued. The optimization mentioned above will try to target the destination shard of the migration with the correct shard version, but with the wrong clusterTime (T1).
This will cause the findAndModify to not find the document. You can find the repro attached. Until we can safely use the optimization, we could simply target using the default path.
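To make the timeline above concrete, here is a minimal toy model in Python. It is not MongoDB code: `Shard`, `move_chunk`, `single_shard_find` and `broadcast_find` are hypothetical names invented for this sketch, logical times are plain integers, and the migration is assumed to make the documents visible on the recipient only from T2 onward. The fallback is modeled as a simple broadcast read at the snapshot time, which is a simplification of the default path.

```python
from dataclasses import dataclass, field

T0, T1, T2 = 0, 1, 2  # logical times: initial insert, transaction snapshot, moveChunk


@dataclass
class Shard:
    """Toy shard keeping [insert_time, delete_time, document] versions."""
    name: str
    versions: list = field(default_factory=list)

    def insert(self, doc, at):
        self.versions.append([at, None, doc])

    def delete_all(self, at):
        for version in self.versions:
            if version[1] is None:
                version[1] = at

    def find(self, predicate, at_cluster_time):
        # A version is visible if it was inserted at or before the snapshot
        # time and had not been deleted yet at that time.
        return [
            doc for inserted, deleted, doc in self.versions
            if inserted <= at_cluster_time
            and (deleted is None or at_cluster_time < deleted)
            and predicate(doc)
        ]


def move_chunk(donor, recipient, at):
    # Assumption: migrated documents exist on the recipient only from `at` on.
    for doc in donor.find(lambda d: True, at):
        recipient.insert(doc, at)
    donor.delete_all(at)


def single_shard_find(current_owners, predicate, at_cluster_time):
    # Models the optimization: target by the latest placement,
    # but read at the transaction's snapshot time.
    assert len(current_owners) == 1
    return current_owners[0].find(predicate, at_cluster_time)


def broadcast_find(all_shards, predicate, at_cluster_time):
    # Simplified stand-in for the default path: ask every shard at the snapshot time.
    found = []
    for shard in all_shards:
        found.extend(shard.find(predicate, at_cluster_time))
    return found


def run_scenario():
    shard_a, shard_b = Shard("A"), Shard("B")
    shard_a.insert({"_id": 1, "x": 5}, at=T0)
    at_cluster_time = T1                  # fixed by the transaction's first write
    move_chunk(shard_a, shard_b, at=T2)   # placement changes after the snapshot
    match = lambda d: d["x"] == 5
    optimized = single_shard_find([shard_b], match, at_cluster_time)
    fallback = broadcast_find([shard_a, shard_b], match, at_cluster_time)
    return optimized, fallback
```

In this model `run_scenario()` returns `([], [{'_id': 1, 'x': 5}])`: the single shard path reads the new owner at T1, before the document arrived there, while the broadcast still sees it on the donor.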
A similar bug can be observed for updateOne and deleteOne. For that path, however, the targeting is separated from the decision to use the single shard optimization, so we will always broadcast to all of the correct shards instead of using the two phase write protocol.
- is caused by
  - SERVER-71133 Skip protocol if number of shards targeted is at most 1 (Closed)
- is related to
  - SERVER-87197 Investigate error prone chunk manager functions usages (Closed)
  - SERVER-71133 Skip protocol if number of shards targeted is at most 1 (Closed)
  - SERVER-76530 Support findAndModify remove on a sharded timeseries collection (Closed)
- related to
  - SERVER-88153 Bulk write without shard key using the single shard optimization may target documents incorrectly (Closed)
  - SERVER-88155 Timeseries update/delete without shard key using the single shard optimization may target incorrectly (Closed)