Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 5.0.2, 5.1.0-rc0, 4.4.16, 4.2.22
Affects Version/s: None
Component/s: Sharding
Labels: SSCCL-BUG
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4, v4.2, v4.0
Sprint: Sharding EMEA 2021-07-26
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

The ShardServerCatalogCacheLoader interrupts ongoing operations on step down but not on step up. This can lead to a deadlock (broken only by a timeout): if a node is a secondary while a refresh is in progress and is then elected primary, it might end up issuing a _flushRoutingTableCacheUpdates command against itself. If the current refresh came from the RecoverRefreshThread, the issued _flushRoutingTableCacheUpdates command waits until the current refresh completes, but the current refresh cannot complete because it is waiting for the _flushRoutingTableCacheUpdates command to finish.
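The circular wait described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not MongoDB code: `simulate`, `flush_against_self`, `refresh`, and the `interrupt_on_step_up` flag are hypothetical names, and a `threading.Event` stands in for "the current refresh has completed".

```python
import threading


def simulate(interrupt_on_step_up: bool, timeout: float = 0.2) -> str:
    """Toy model of the refresh/flush circular wait (hypothetical, not server code)."""
    refresh_done = threading.Event()  # set once the in-flight refresh completes
    outcome = []

    def flush_against_self():
        # Stand-in for _flushRoutingTableCacheUpdates: it waits for the
        # current refresh to complete, giving up after `timeout` seconds.
        completed = refresh_done.wait(timeout)
        outcome.append("flush ok" if completed else "timed out")

    def refresh():
        # The refresh started while the node was a secondary; the node has
        # since stepped up to primary.
        if interrupt_on_step_up:
            # Assumed behavior: step-up interrupts the ongoing operation,
            # so the refresh never issues the flush against itself.
            outcome.append("interrupted")
            refresh_done.set()
            return
        # Without the interrupt, the refresh issues the flush against its
        # own node and blocks until the flush returns.
        flusher = threading.Thread(target=flush_against_self)
        flusher.start()
        flusher.join()
        refresh_done.set()  # only reached after the flush has given up

    worker = threading.Thread(target=refresh)
    worker.start()
    worker.join()
    return outcome[0]


if __name__ == "__main__":
    print(simulate(interrupt_on_step_up=False))
    print(simulate(interrupt_on_step_up=True))
```

In this model the uninterrupted case makes progress only because the simulated flush times out, which mirrors the "deadlock (with timeout)" in the description; with the interrupt, the refresh ends before the self-directed flush is ever issued.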

Related to: SERVER-45646 (Closed): If a filtering metadata refresh is scheduled while a node is secondary and there is no primary, then the node becomes primary, the refresh can deadlock with itself

Assignee: Sergi Mateo Bellido
Reporter: Sergi Mateo Bellido
Participants: Githook User, Sergi Mateo Bellido, Vivian Ge
Votes: 0
Watchers: 4

Created: Jul 22 2021 06:35:11 AM UTC
Updated: Oct 29 2023 09:50:34 PM UTC
Resolved: Jul 23 2021 10:57:08 AM UTC
Confidence Status Last Update: 22/Jul/21 6:36 AM
