Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.1.0-rc0, 8.0.0-rc2
Affects Version/s: None
Component/s: Sharding
Labels:

Assigned Teams:

Cluster Scalability
Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v8.0
Sprint:
Cluster Scalability 2024-1-22, Cluster Scalability 2024-2-5, Cluster Scalability 2024-2-19, Cluster Scalability 2024-3-4, Cluster Scalability 2024-3-18, Cluster Scalability 2024-4-1, Cluster Scalability 2024-4-15, Cluster Scalability 2024-4-29
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

An upsert that targets a different shard than will own the inserted document triggers the WouldChangeOwningShard protocol, which spawns a transaction that in this case would do a noop update on the initial shard and an insert on the owning shard. This will trigger the single write shard optimization (a noop update is considered a read), so the read shard will not have a durable transaction record and may "forget" the transaction. If the client retries their update and an intervening writer inserted a matching document on the read shard, the retry will update that matching document, instead of triggering the WCOS protocol, breaking the semantics of a retryable write.

Updates that change a shard key value have never been fully retryable by design, but before the single write shard optimization was enabled in ~~SERVER-48340~~, they required two phase commit, which by implementation happens to write a transaction record on the "read" shard, which guarantees it throws an error on a retry even if there is an intervening write. When full retryability is added for such updates, this problem should go away, but in the meantime, to maximize safety we should disable the single write shard optimization for writes that trigger the WCOS protocol.

is related to

SERVER-102481 Unset disallowSingleWriteShardCommit when resetting transaction router state

Closed

related to

SERVER-89032 TODO: Investigate WCOS error handling path for time series updates

Backlog

SERVER-48340 Re-enable single-write-shard transaction commit optimization

Closed

Assignee:: Abdul Qadeer
Reporter:: Jack Mulrow
Participants:: Abdul Qadeer, Githook User, Jack Mulrow
Votes:: 0 Vote for this issue
Watchers:: 8 Start watching this issue

Created:: Jan 11 2024 09:51:35 PM UTC
Updated:: May 08 2025 02:09:46 PM UTC
Resolved:: Apr 18 2024 04:05:23 PM UTC
Confidence Status Last Update:: 26/Mar/24 1:29 PM

Details

Description

Attachments

Issue Links

Activity

People

Dates