Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.4.0-rc0, 4.7.0
Affects Version/s: None
Component/s: Storage
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4
Sprint: Execution Team 2020-03-23
Linked BF Score: 24
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

Because bgsync aborts the index build before the node transitions to the rollback state, the node remains eligible to run for election and become primary, which has a serious side effect. Consider a 3-node replica set (node A is the primary, node B is secondary1, and node C is secondary2) where the thread pool size is 1.

1) Node A (primary for term 10) starts the index build 'x_1' on the IndexBuildsCoordinator thread pool and generates a startIndexBuild oplog entry, which replicates to both secondaries.
2) Node B and node C, on receiving the startIndexBuild oplog entry, start the index build (also on the IndexBuildsCoordinator thread pool).
3) Node A hits a network partition and is disconnected from node B and node C.
4) Node A receives some writes W1 at term 10, then sees that it has lost its majority and steps down.
5) Node C gets elected and becomes primary for term 11. Node A then rejoins the network and sees that its sync source, say node C (the new primary), has diverged from its oplog. So it enters this code path and starts aborting the index build. Since node A has not yet transitioned to rollback, it is free to run for election; assume it wins the election after receiving a vote from node B.

As a result of step 5, node A never runs the real rollback step. When node A becomes primary, it stops the oplog fetcher service, so this check or [this|https://github.com/mongodb/mongo/blob/17984db6c531594c00bf226804d9ab7ed6225643/src/mongo/db/repl/rollback_impl.cpp#L190] check might fail, and the node does not roll back any oplog entries.

Problems:
1) The index build on the secondaries becomes orphaned.
2) Since the index build on node A was aborted, node A is free to start a new index build, say 'y_1'. When the secondaries receive the startIndexBuild oplog entry for 'y_1', they wait for an IndexBuildsCoordinator thread to become available, which blocks secondary replication.
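
Problem 2 can be reproduced in miniature with any single-worker thread pool. The sketch below is only an illustration in Python, not MongoDB code: `simulate_blocked_pool`, `build`, and the `threading.Event` standing in for the commit or abort of 'x_1' are all hypothetical names. It shows that while the orphaned 'x_1' occupies the only worker, 'y_1' cannot start.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def simulate_blocked_pool(wait_seconds=0.2):
    """Return (y_blocked, y_result) for a pool of size 1 holding an orphaned build."""
    pool = ThreadPoolExecutor(max_workers=1)  # thread pool size is 1, as in the scenario

    # Stands in for the commit/abort signal that the orphaned 'x_1' never receives.
    finish_x = threading.Event()

    def build(name, done_signal):
        done_signal.wait()  # an index build holds its thread until it is told to finish
        return name

    # 'x_1' takes the only worker and waits indefinitely.
    pool.submit(build, "x_1", finish_x)

    # 'y_1' could finish immediately, but only if it ever gets a thread.
    y_signal = threading.Event()
    y_signal.set()
    y_future = pool.submit(build, "y_1", y_signal)

    try:
        y_future.result(timeout=wait_seconds)
        y_blocked = False
    except TimeoutError:
        y_blocked = True  # 'y_1' is still queued behind 'x_1'

    # Release 'x_1' so the sketch terminates; the real bug has no such release.
    finish_x.set()
    y_result = y_future.result(timeout=5)
    pool.shutdown()
    return y_blocked, y_result
```

In the real scenario the waiting happens on the oplog application path of the secondary, which is why replication stalls rather than just the second index build.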

Solution: We should abort the index build only after the node has transitioned to the rollback state and we are sure that the entries are going to be rolled back. This applies to both rollback via recoverToStableTimestamp and rollback via refetch.
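
The ordering problem and the proposed ordering can be compared with a toy state machine. This is a minimal sketch in Python under simplifying assumptions, not the server's implementation: `Node`, `State`, `election_wins_race`, and `rollback_wins_race` are hypothetical names, and elections always succeed for a secondary. The flag `abort_on_divergence=True` models the current behavior (abort as soon as divergence is detected); `False` models aborting only inside the transition to rollback.

```python
from enum import Enum


class State(Enum):
    SECONDARY = "SECONDARY"
    ROLLBACK = "ROLLBACK"
    PRIMARY = "PRIMARY"


class Node:
    """Toy model of node A after it rejoins; not MongoDB's actual classes."""

    def __init__(self, abort_on_divergence):
        self.state = State.SECONDARY
        self.active_builds = {"x_1"}
        self.abort_on_divergence = abort_on_divergence

    def on_divergence_detected(self):
        # Current behavior: abort before the node has entered rollback.
        if self.abort_on_divergence:
            self.active_builds.clear()

    def try_win_election(self):
        # Only a secondary may become primary; a node in rollback may not.
        if self.state is not State.SECONDARY:
            return False
        self.state = State.PRIMARY
        return True

    def transition_to_rollback(self):
        # Rollback is skipped if the node has already become primary.
        if self.state is not State.SECONDARY:
            return False
        self.state = State.ROLLBACK
        self.active_builds.clear()  # proposed behavior: abort only here
        return True


def election_wins_race(abort_on_divergence):
    """Step 5 of the scenario: the election completes before rollback starts."""
    node = Node(abort_on_divergence)
    node.on_divergence_detected()
    node.try_win_election()
    rolled_back = node.transition_to_rollback()
    return node.state, set(node.active_builds), rolled_back


def rollback_wins_race():
    """Proposed ordering when rollback starts first: the election is refused."""
    node = Node(abort_on_divergence=False)
    node.on_divergence_detected()
    rolled_back = node.transition_to_rollback()
    elected = node.try_win_election()
    return node.state, set(node.active_builds), rolled_back, elected
```

With `abort_on_divergence=True`, node A ends up primary with 'x_1' already aborted and no rollback performed, which is the orphaning described above. With `False`, either node A becomes primary with 'x_1' still active, or it enters rollback, aborts 'x_1', and cannot be elected.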

P.S.: I noticed this failure frequently in my patch builds. Because index builds are currently generating a high volume of timeout errors, the BF for this issue is getting lost.

is depended on by

SERVER-46823 Enable default for index commit quorum as "votingMembers"

Closed

related to

SERVER-46976 Enable commit quorum in rollback_waits_for_bgindex_completion.js

Closed

SERVER-48419 Extend rollback to recover resumable index builds efficiently

Closed

Assignee: Louis Williams
Reporter: Suganthi Mani
Participants: Githook User, Louis Williams, Suganthi Mani
Votes: 0
Watchers: 3

Created: Mar 03 2020 07:35:29 AM UTC
Updated: Oct 29 2023 10:11:25 PM UTC
Resolved: Mar 18 2020 05:33:49 PM UTC
Confidence Status Last Update: 17/Mar/20 5:32 PM
