Type: Bug
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: 3.2.17, 3.4.9
Component/s: Replication
Operating System: ALL
Backport Requested: v4.0, v3.6, v3.4
Sprint: Repl 2018-10-08
The mongod instance in my production environment suffers very heavy IO pressure, and I see a lot of full oplog scans in the log, like the one below.
2017-10-23T17:48:01.845+0800 I COMMAND [conn8766752] query local.oplog.rs query: { ts: { $gte: Timestamp 1505624058000|95, $lte: Timestamp 1505624058000|95 } } planSummary: COLLSCAN cursorid:20808023597 ntoreturn:0 ntoskip:0 keysExamined:0 docsExamined:44669401 keyUpdates:0 writeConflicts:0 numYields:353599 nreturned:0 reslen:20 locks:{ Global: { acquireCount: { r: 707200 } }, Database: { acquireCount: { r: 353600 }, acquireWaitCount: { r: 15 }, timeAcquiringMicros: { r: 3667 } }, oplog: { acquireCount: { r: 353600 } } } 935646ms
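For reference, the same point-equality query can be issued from the mongo shell to confirm the plan. This is a minimal sketch; the timestamp values are copied from the log line above, and the explain() output layout varies slightly between versions:

// Run the same point-equality ts query the log shows and inspect the plan.
var oplog = db.getSiblingDB("local").getCollection("oplog.rs");
var plan = oplog.find({
    ts: { $gte: Timestamp(1505624058, 95), $lte: Timestamp(1505624058, 95) }
}).explain("queryPlanner");
// On the affected versions the winning plan is a COLLSCAN over the whole
// oplog instead of a bounded scan starting near the requested timestamp.
printjson(plan);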
I found that one of the secondaries (a hidden node) is in the RECOVERING state. I dove into the code and found the root cause.
The hidden node has the following warning in its log:
We cannot use xxx as a sync source because it does not contain the necessary operations for us to reach a consistent state ...
When a secondary chooses a sync source, it makes sure the oplog entry recorded in minValid exists on that sync source. To do this, it sends a point-equality query on ts (a $gte/$lte pair with the same timestamp, exactly the shape in the log line above) against the candidate's oplog, and on these versions that query is answered with a full collection scan.
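To make the check concrete, here is a hedged shell sketch of what the secondary effectively does. The local.replset.minvalid collection and its ts field exist on the 3.2/3.4 branches, but the actual server-side code path differs from this shell approximation:

// On the lagging secondary: read the minValid optime it must reach.
var minValid = db.getSiblingDB("local").getCollection("replset.minvalid").findOne();
// Against the sync source candidate: probe for an oplog entry at exactly
// that timestamp. This is the point-equality query that degenerates into
// a COLLSCAN on the affected versions.
var entry = db.getSiblingDB("local").getCollection("oplog.rs").findOne({
    ts: { $gte: minValid.ts, $lte: minValid.ts }
});
// If no entry comes back, the node rejects the candidate and logs the
// "cannot use ... as a sync source" warning quoted above.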
duplicates: SERVER-29843 Make oplog queries with point equality on ts field use getOplogStartHack (Closed)