Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.3.1, 8.0.0-rc0
Affects Version/s: None
Component/s: Sharding
Labels:
None

Assigned Teams:

Cluster Scalability
Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v7.3
Steps To Reproduce:
1. Run the attached repro
Sprint:
Cluster Scalability 2024-3-18, Cluster Scalability 2024-4-1
Linked BF Score:
126
Confidence Status:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

PM-1632 added the ability to run an update without a shard key in sharded clusters with all options. As part of that project, SERVER-71133 added an optimization for findAndModify that avoids going through the two-phase write protocol when all the chunks are owned by a single shard.

However, the following scenario might happen:

1. A transaction with snapshot read concern starts and performs a write to a collection at time T1. This effectively sets the atClusterTime of the entire transaction to T1.
2. A moveChunk happens, changing the placement for collection2 at time T2.
3. A findAndModify for collection2 is issued. The optimization added for PM-1632 tries to target the destination shard of the migration with the correct shard version, but with the wrong clusterTime (T1).

As a result, the findAndModify does not find the document. A repro is attached. Until the optimization can be used safely, we could simply target using the default path.
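The scenario can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions and not MongoDB's implementation: the `Shard` class, the `latest_owner` variable, the two `target_with_*` functions, and the integer times are all hypothetical names introduced for illustration. The "default path" is simplified here to a broadcast read at the transaction's atClusterTime.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy shard that keeps a time-ordered history so it can serve snapshot reads."""
    name: str
    history: list = field(default_factory=list)  # (cluster_time, doc_id, present)

    def record(self, cluster_time, doc_id, present):
        self.history.append((cluster_time, doc_id, present))

    def read_at(self, doc_id, at_cluster_time):
        """Return True if doc_id is visible on this shard as of at_cluster_time."""
        visible = False
        for cluster_time, recorded_id, present in self.history:
            if recorded_id == doc_id and cluster_time <= at_cluster_time:
                visible = present
        return visible


# Hypothetical cluster times: insert at T0, transaction snapshot at T1, moveChunk at T2.
T0, T1, T2 = 0, 10, 20

shard_a = Shard("shardA")
shard_b = Shard("shardB")
shards = {s.name: s for s in (shard_a, shard_b)}

# The document lives on shardA before the migration.
shard_a.record(T0, "doc1", True)

# moveChunk at T2: the document becomes visible on shardB and leaves shardA.
shard_a.record(T2, "doc1", False)
shard_b.record(T2, "doc1", True)
latest_owner = "shardB"  # placement after T2 (assumed single owning shard)


def target_with_single_shard_optimization(doc_id, at_cluster_time):
    """Target only the latest owner, but read at the transaction's snapshot time."""
    shard = shards[latest_owner]
    return shard.name if shard.read_at(doc_id, at_cluster_time) else None


def target_with_default_path(doc_id, at_cluster_time):
    """Simplified default path: ask every shard at the snapshot time."""
    for shard in shards.values():
        if shard.read_at(doc_id, at_cluster_time):
            return shard.name
    return None


print(target_with_single_shard_optimization("doc1", T1))  # None
print(target_with_default_path("doc1", T1))               # shardA
```

In this model the optimized path combines the placement from after T2 with a read at T1, so it looks on shardB at a time when the document was still on shardA and finds nothing. The simplified default path reads at T1 on every shard and finds the document on shardA.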

A similar bug can be observed for updateOne and deleteOne. For that path, however, the targeting is separated from the decision to use the single-shard optimization, so we will always broadcast to all of the correct shards instead of using the two-phase write protocol.


snapshot_txn_with_migration.js
1 kB
Feb 28 2024 12:35:00 PM UTC

is caused by

SERVER-71133 Skip protocol if number of shards targeted is at most 1

Closed

is related to

SERVER-87197 Investigate error prone chunk manager functions usages

Closed

SERVER-71133 Skip protocol if number of shards targeted is at most 1

Closed

SERVER-76530 Support findAndModify remove on a sharded timeseries collection

Closed

related to

SERVER-88153 Bulk write without shard key using the single shard optimization may target documents incorrectly

Closed

SERVER-88155 Timeseries update/delete without shard key using the single shard optimization may target incorrectly

Closed

(1 related to)

Assignee: Jason Zhang (Inactive)

Reporter: Marcos José Grillo Ramirez

Participants: Githook User, Jason Zhang, Marcos José Grillo Ramirez

Votes: 0

Watchers: 12

Created: Feb 28 2024 12:43:27 PM UTC

Updated: Apr 30 2024 06:35:52 PM UTC

Resolved: Mar 20 2024 02:38:40 AM UTC

Confidence Status Last Update: 13/Mar/24 7:14 PM

GA Target Date: None

Public Preview Target Date: None

Private Preview Target Date: None

Experiment Target Date: None
