Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: repl-modularity
Assigned Teams: Replication
Confidence Status: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

The server's FTDC gathering flow returns "latestOptime" as part of the "oplog" statistics.

The code for that is here: https://github.com/10gen/mongo/blob/7577170e2018c7d4ebbbae318b86935f945761da/src/mongo/db/repl/replication_info.cpp#L295

getMyLastAppliedOpTime takes a mutex:

OpTime ReplicationCoordinatorImpl::getMyLastAppliedOpTime() const {
    stdx::lock_guard<Latch> lock(_mutex);
    return _getMyLastAppliedOpTime_inlock();
}

Acquiring that mutex can block for an unbounded amount of time when it is contended. I unintentionally reproduced a 20-second stall in this section of code on 5.0.26 while trying to reproduce a separate FTDC stall issue, SERVER-93120. Given that the lastAppliedOpTime data is collected for statistical purposes, I am curious whether it can be read outside of the lock, or by using atomics.
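To make the "outside of a lock, or using atomics" idea concrete, here is a minimal sketch of one possible approach: a sequence-lock style mirror of the last applied optime that readers such as FTDC could sample without touching the coordinator mutex. This is not the server's implementation. The names OpTimeSnapshot and LastAppliedStat are hypothetical, and the sketch assumes an optime can be represented as a 64-bit timestamp plus a 64-bit term, and that writers are already serialized (for example, because they update the mirror while holding the existing mutex).

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical plain-data stand-in for an optime: a timestamp plus a term.
struct OpTimeSnapshot {
    std::uint64_t timestamp;
    std::int64_t term;
};

// Hypothetical mirror of the last applied optime for statistics readers.
// Assumption: store() calls are serialized externally, e.g. by calling it
// while the existing coordinator mutex is held. load() never takes a mutex.
class LastAppliedStat {
public:
    void store(OpTimeSnapshot v) {
        const std::uint64_t s = _seq.load(std::memory_order_relaxed);
        // An odd sequence number marks a write in progress.
        _seq.store(s + 1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_release);
        _timestamp.store(v.timestamp, std::memory_order_relaxed);
        _term.store(v.term, std::memory_order_relaxed);
        // Back to an even number: publishes the new value.
        _seq.store(s + 2, std::memory_order_release);
    }

    OpTimeSnapshot load() const {
        for (;;) {
            const std::uint64_t before = _seq.load(std::memory_order_acquire);
            if (before & 1) {
                continue;  // a writer is mid-update, retry
            }
            OpTimeSnapshot v{_timestamp.load(std::memory_order_relaxed),
                             _term.load(std::memory_order_relaxed)};
            std::atomic_thread_fence(std::memory_order_acquire);
            const std::uint64_t after = _seq.load(std::memory_order_relaxed);
            if (before == after) {
                return v;  // no write overlapped the two field reads
            }
        }
    }

private:
    std::atomic<std::uint64_t> _seq{0};
    std::atomic<std::uint64_t> _timestamp{0};
    std::atomic<std::int64_t> _term{0};
};
```

In this sketch a reader can only be delayed for the duration of the two field stores, not for however long the coordinator mutex is held. If the statistic only needs the timestamp component, a single std::atomic<std::uint64_t> would be simpler still.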

is related to: SERVER-93120 FTDC collection blocked on locked backupCursor state read (Closed)

Assignee: Unassigned
Reporter: Luke Pearson
Participants: Luke Pearson
Votes: 0
Watchers: 12
Created: Aug 01 2024 08:02:21 PM UTC
Updated: Aug 05 2024 05:07:18 PM UTC
GA Target Date: None
Public Preview Target Date: None
Private Preview Target Date: None
Experiment Target Date: None
