Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 5.0.27, 8.1.0-rc0, 6.0.16, 7.3.3, 8.0.0-rc5, 7.0.11
Affects Version/s: 5.0.0, 6.0.0, 7.0.0, 7.2.0, 8.0.0-rc0, 7.3.0
Component/s: None
Labels: None
Assigned Teams: Cluster Scalability
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v8.0, v7.3, v7.0, v6.0, v5.0
Sprint: Cluster Scalability 2024-4-29, Cluster Scalability 2024-5-13, Cluster Scalability 2024-5-27
Linked BF Score: 105
Confidence Status: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

When the resharding coordinator aborts, it performs the following steps:
1. Transition the state document to kAbort.
2. Send the _shardsvrAbortReshardCollection command to the participants.
3. Proceed with cleaning up the resharding temporary collection metadata.

However, by the time (3) executes, there is no guarantee that the shards have seen the transition to kAbort from (1). This is because (2) only clears the filtering metadata on the primary nodes and issues a best-effort, asynchronous sharding metadata refresh, which in turn asynchronously flushes the ShardServerCatalogCacheLoader. This can be problematic if a failover promotes a secondary that is not yet aware of kAbort.
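The ordering problem described above can be illustrated with a toy model. This is a minimal sketch and not MongoDB server code: the `Node` and `Shard` classes, the `coordinator_abort` function, and the `"pre-abort"` placeholder state are all hypothetical names introduced for illustration. The only assumption taken from the description is that step (2) schedules the refresh and flush without waiting for them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One replica-set member of a participant shard (hypothetical model)."""
    name: str
    durable_state: str = "pre-abort"  # last coordinator state persisted on this node


@dataclass
class Shard:
    primary: Node
    secondaries: list
    pending_flushes: list = field(default_factory=list)

    def handle_abort(self) -> None:
        # Models step (2): the refresh and flush are only scheduled, never awaited.
        self.pending_flushes.append("kAbort")

    def fail_over(self) -> None:
        # Promote the first secondary; the old primary steps down.
        old_primary = self.primary
        self.primary = self.secondaries.pop(0)
        self.secondaries.append(old_primary)


def coordinator_abort(shards: list) -> dict:
    """Run steps (1) to (3) in order without waiting for any shard to flush."""
    coordinator = {"state": "kAbort"}              # step (1)
    for shard in shards:                           # step (2)
        shard.handle_abort()
    coordinator["temp_metadata_present"] = False   # step (3)
    return coordinator


shard = Shard(primary=Node("node-a"), secondaries=[Node("node-b")])
coordinator = coordinator_abort([shard])
shard.fail_over()
# The new primary has not persisted kAbort, yet the temporary metadata is gone.
stale = shard.primary.durable_state != coordinator["state"]
```

In this model `stale` ends up true: the promoted node still holds its pre-abort view while the coordinator has already removed the temporary collection metadata.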

One solution could be to make (2) perform the sharding metadata refresh and a durable (majority) flush of the ShardServerCatalogCacheLoader.
Another solution, which is more in line with other callers of _updateCoordinatorDocStateAndCatalogEntries, would be to call _tellAllDonorsToRefresh() (and _tellAllRecipientsToRefresh() too?) right after this line in the abort procedure.
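Both proposals amount to adding a barrier between steps (2) and (3): the coordinator waits until every participant has refreshed before it cleans up. The sketch below shows that ordering with `asyncio`. It is an illustration under stated assumptions, not the actual fix: `refresh_and_flush` and `abort_with_refresh_barrier` are hypothetical helpers, shards are plain dictionaries, and the `"pre-abort"` state is a placeholder.

```python
import asyncio


async def refresh_and_flush(shard: dict) -> None:
    """Hypothetical stand-in for a participant refreshing and durably flushing."""
    await asyncio.sleep(0)  # yield control to mimic network or disk latency
    shard["durable_state"] = "kAbort"


async def abort_with_refresh_barrier(shards: list) -> list:
    events = ["coordinator document set to kAbort"]         # step (1)
    events.append("abort command sent to participants")     # step (2)
    # Added barrier: wait for every participant before moving on to cleanup.
    await asyncio.gather(*(refresh_and_flush(s) for s in shards))
    events.append("all participants refreshed")
    events.append("temporary collection metadata removed")  # step (3)
    return events


shards = [
    {"name": "donor0", "durable_state": "pre-abort"},
    {"name": "recipient0", "durable_state": "pre-abort"},
]
events = asyncio.run(abort_with_refresh_barrier(shards))
```

Because the cleanup event is only appended after `asyncio.gather` returns, every shard in this model has recorded kAbort before the temporary metadata is removed.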

repro-server-88978.js
2 kB
Apr 04 2024 02:31:47 PM UTC

is related to

SERVER-90810 Resharding recipient shard can install stale filtering information for the resharding temporary collection when aborting

Backlog

Assignee:: Abdul Qadeer

Reporter:: Jordi Serra Torrens

Participants:: Abdul Qadeer, Githook User, Jordi Serra Torrens, Max Hirschhorn

Votes:: 0

Watchers:: 6

Created:: Apr 04 2024 02:05:44 PM UTC

Updated:: May 23 2024 01:55:19 PM UTC

Resolved:: May 17 2024 12:04:27 AM UTC

Confidence Status Last Update:: 19/Apr/24 3:26 PM

GA Target Date:: None

Public Preview Target Date:: None

Private Preview Target Date:: None

Experiment Target Date:: None
