Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.1.4
Affects Version/s: 4.1.1
Component/s: Sharding
Labels:
- ShardedTxn:RouterSupport

Backwards Compatibility:
Fully Compatible
Sprint:
Sharding 2018-08-13, Sharding 2018-08-27, Sharding 2018-09-10, Sharding 2018-09-24
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

Original Description

Single replica-set transactions currently abort unconditionally when an exception occurs. The reason is that the transaction could be holding on to a WriteUnitOfWork (WUOW), which needs to be cleaned up. The abort happens here.

This is problematic for some of the sharding machinery that uses exceptions as control flow. Examples include StaleConfigException, CommandOnShardedViewNotSupportedOnMongod, SnapshotTooOld (this makes readConcern: snapshot effectively unusable).

New Description

~~SERVER-36591~~ handles retries on snapshot errors, so this ticket will track retries on all other re-targeting errors, i.e. CommandOnShardedViewNotSupportedOnMongod, StaleConfigException, and CannotImplicitlyCreateCollection.

Mongos should be allowed to retry on each of these errors. It picks a new atClusterTime only when retrying during the first overall statement in the transaction; for later statements it uses the immutable atClusterTime established during the first statement. Requests to any shard newly added by the current statement must include startTransaction=true on every retry, not just on the first request sent to that shard. If mongos exhausts its allowed retry attempts and any of these errors is returned to the client, the response should include the TransientTransactionError label.
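The retry rules above can be sketched roughly as follows. This is an illustrative Python model, not the mongos implementation: `RouterTxn`, `RetargetingError`, `MAX_RETRIES`, and the `targeter`/`send` callables are hypothetical names introduced for the example, and the retry limit of 3 is an assumption. Only the error names, `atClusterTime`, `startTransaction`, and the `TransientTransactionError` label come from the ticket.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set

# Error names are taken from the ticket; everything else here is hypothetical.
RETARGETING_ERRORS = frozenset({
    "StaleConfigException",
    "CommandOnShardedViewNotSupportedOnMongod",
    "CannotImplicitlyCreateCollection",
})
MAX_RETRIES = 3  # assumed limit; the ticket does not state a number


class RetargetingError(Exception):
    """Hypothetical stand-in for an error response from a shard."""

    def __init__(self, name: str):
        super().__init__(name)
        self.name = name
        self.error_labels: List[str] = []


@dataclass
class RouterTxn:
    pick_cluster_time: Callable[[], int]
    at_cluster_time: Optional[int] = None
    statements_completed: int = 0
    # Shards that joined the transaction in earlier, completed statements.
    participants: Set[str] = field(default_factory=set)

    def run_statement(self, targeter: Callable[[], List[str]], send):
        first_statement = self.statements_completed == 0
        for attempt in range(MAX_RETRIES + 1):
            if first_statement:
                # Only the first overall statement may (re)select atClusterTime.
                self.at_cluster_time = self.pick_cluster_time()
            shards = targeter()  # re-target on every attempt
            new_shards = set(shards) - self.participants
            try:
                results = [
                    send(shard, {
                        "atClusterTime": self.at_cluster_time,
                        # Newly added shards get startTransaction on retries too.
                        "startTransaction": shard in new_shards,
                    })
                    for shard in shards
                ]
            except RetargetingError as err:
                if err.name not in RETARGETING_ERRORS:
                    raise
                if attempt == MAX_RETRIES:
                    # Retries exhausted: label the error for the client.
                    err.error_labels.append("TransientTransactionError")
                    raise
                continue
            self.participants |= new_shards
            self.statements_completed += 1
            return results
```

Shards are recorded as participants only after the statement succeeds, so a shard first contacted by a failing attempt is still treated as new on the next attempt and receives `startTransaction` again.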

Shards can also be modified to only abort their local transaction on these errors if they are encountered on the first statement that shard has seen.
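As a rough illustration of the shard-side rule, the decision could look like the sketch below. The function name and its parameters are hypothetical; the fallback of aborting on every other exception reflects the unconditional-abort behavior described in the original description.

```python
# Error names are taken from the ticket; the function itself is hypothetical.
RETARGETING_ERRORS = frozenset({
    "StaleConfigException",
    "CommandOnShardedViewNotSupportedOnMongod",
    "CannotImplicitlyCreateCollection",
})


def should_abort_local_transaction(error_name: str,
                                   is_first_statement_on_shard: bool) -> bool:
    """Decide whether a shard aborts its local transaction after an error."""
    if error_name not in RETARGETING_ERRORS:
        # Any other exception keeps the existing behavior: always abort.
        return True
    # Re-targeting errors abort only on the first statement this shard has seen.
    return is_first_statement_on_shard
```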

has to be done after
- SERVER-36590 Allow shards to start new transactions at the active transaction number (Closed)

has to be done before
- SERVER-36312 Re-enable atClusterTime selection algorithm on mongos (Closed)

related to
- SERVER-37207 Only retry failed writes in a batch on stale version errors in a transaction (Backlog)
- SERVER-37209 Allow mongos to retry on view resolution errors in a transaction (Closed)

Assignee: Jack Mulrow
Reporter: Randolph Tan
Participants: Githook User, Jack Mulrow, Randolph Tan
Votes: 0
Watchers: 6

Created: Jun 20 2018 06:16:47 PM UTC
Updated: Oct 29 2023 10:30:34 PM UTC
Resolved: Sep 19 2018 07:16:20 PM UTC
Confidence Status Last Update: 13/Aug/18 2:45 PM
