Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.1.0-rc0, 8.0.0-rc10
Affects Version/s: None
Component/s: None
Labels:
- auto-reverted

Assigned Teams:
Service Arch
Backwards Compatibility:
Minor Change
Operating System:
ALL
Backport Requested:
v8.0, v7.0, v6.0
Sprint:
Networking & Obs 2024-06-10, Networking & Obs 2024-06-24
Linked BF Score:
200
Confidence Status:
None
Work Order:
3
Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

The following sequence can occur:
1. mongos sends a command to a config server or shard, s0.
2. As part of processing the command, s0 runs a subcommand against a remote shard, s1.
3. s1 steps down.
4. The subcommand fails with InterruptedDueToReplStateChange, and the error propagates upstream along the path s1 -> s0 -> mongos.
5. mongos receives InterruptedDueToReplStateChange from s0 and assumes that s0 is the node that failed over.
6. The mongos RSM marks s0 as ReplicaSetWithNoPrimary.
7. Because InterruptedDueToReplStateChange is a retriable error, mongos resends the command. To do so, it first tries to send a hello command to get an updated view of the topology, but finds that there is already an outstanding hello request.
8. The command cannot be retried until the outstanding hello on s0 returns, which can take up to 10s (the timeout of a streamable hello command).
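
The timing in steps 5 through 8 can be sketched with a toy model. This is only an illustration under stated assumptions, not the server's implementation: `ToyMonitor`, `on_command_error`, and `earliest_retry_time` are hypothetical names, and the only values taken from the ticket are the error name and the 10s streamable hello timeout.

```python
from dataclasses import dataclass
from typing import Optional

# From the ticket: a streamable hello can take up to 10s to return.
STREAMABLE_HELLO_TIMEOUT_S = 10.0


@dataclass
class ToyMonitor:
    """Hypothetical stand-in for the mongos monitoring state for s0."""

    has_primary: bool = True
    # Time the in-flight streamable hello was sent, or None if there is none.
    hello_sent_at: Optional[float] = None

    def on_command_error(self, code_name: str) -> None:
        # Steps 5-6: the error arrived from s0, so it is attributed to s0.
        if code_name == "InterruptedDueToReplStateChange":
            self.has_primary = False

    def earliest_retry_time(self, now: float) -> float:
        # Steps 7-8: a retry needs a known primary, hence a fresh hello reply.
        if self.has_primary:
            return now
        if self.hello_sent_at is None:
            # Assumption: with no hello in flight, a new one is answered at once.
            return now
        # s0 never changed state, so the outstanding hello only returns
        # when its timeout expires.
        return max(now, self.hello_sent_at + STREAMABLE_HELLO_TIMEOUT_S)


# A hello was sent at t=0.0; the misattributed error arrives at t=0.5.
monitor = ToyMonitor(hello_sent_at=0.0)
monitor.on_command_error("InterruptedDueToReplStateChange")
delay = monitor.earliest_retry_time(now=0.5) - 0.5
print(delay)  # 9.5
```

In this model the retry delay depends only on how recently the outstanding hello was sent, which is why the stall can approach the full 10s even though s0 never stepped down.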

Assignee: Amirsaman Memaripour
Reporter: Jason Chan
Participants: Amirsaman Memaripour, Githook User, Jason Chan
Votes: 0
Watchers: 8

Created: Jun 07 2024 10:10:36 PM UTC
Updated: Jun 26 2024 02:15:32 PM UTC
Resolved: Jun 24 2024 02:23:59 PM UTC
Confidence Status Last Update: 10/Jun/24 9:43 PM
