Starting in SERVER-25025, we mark a collection as always needing size adjustment only if we are in rollback or replication recovery. However, the inReplicationRecovery flag is only set once we enter ReplicationRecoveryImpl::recoverFromOplog, which runs after the aforementioned check for replication recovery. As a result, the check always sees the flag as unset, and collections opened during startup are never marked as needing size adjustment.
This causes an issue for capped collections when the number of inserts applied during replication recovery exceeds the collection's maximum document count. Because we erroneously skip the count adjustment, these inserts do not trigger capped deletes of documents inserted earlier in recovery, leaving the capped collection with more documents than it should have until collection validation is run to correct the fast count.
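For context, this is the capped-delete behavior that the bug violates; an illustrative shell snippet (the collection name cappedDemo is hypothetical) outside of recovery:

// Illustrative only: normal capped enforcement outside of recovery.
// "cappedDemo" is a hypothetical collection name.
assert.commandWorked(db.createCollection("cappedDemo", {capped: true, size: 100, max: 1}));
const demo = db.getCollection("cappedDemo");
assert.commandWorked(demo.insert({a: 1}));
assert.commandWorked(demo.insert({a: 2}));  // capped insert deletes {a: 1}
assert.eq(1, demo.find().itcount());        // never more than "max" documents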
The following jstest reproduces the issue:
/**
 * Reproduces the issue described in SERVER-52833.
 */
(function() {
"use strict";

load("jstests/libs/fail_point_util.js");

const rst = new ReplSetTest({nodes: 1});
rst.startSet();
rst.initiate();

const primary = rst.getPrimary();
const testDB = primary.getDB("test");
const coll = testDB.getCollection(jsTestName());

// Capped collection that should never hold more than one document.
assert.commandWorked(testDB.createCollection(coll.getName(), {capped: true, size: 100, max: 1}));

const ts = assert.commandWorked(testDB.runCommand({insert: coll.getName(), documents: [{a: 1}]}))
               .operationTime;

// Pin the stable timestamp at the first insert so that the inserts below must be
// replayed from the oplog during replication recovery on restart.
configureFailPoint(primary, "holdStableTimestampAtSpecificTimestamp", {timestamp: ts});

assert.commandWorked(coll.insert([{b: 1}, {b: 2}]));

// The restart forces replication recovery, which replays the {b} inserts
// without adjusting the fast count, so no capped deletes are triggered.
rst.restart(primary);
rst.stopSet();
})();
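To make the effect visible, a check along these lines could be added before the rst.stopSet() call above. This is a sketch rather than part of the original reproduction; it uses only the standard shell helpers itcount(), count(), and the validate command:

// Sketch: observe the discrepancy after restart (insert before rst.stopSet()).
const restarted = rst.getPrimary();
const recovered = restarted.getDB("test").getCollection(jsTestName());

// itcount() iterates the actual documents; count() consults the fast count.
// With the bug, the collection can hold more documents than max=1, and the
// two counts can disagree.
print("actual documents: " + recovered.find().itcount());
print("fast count: " + recovered.count());

// Running validate recomputes and corrects the fast count.
assert.commandWorked(restarted.getDB("test").runCommand({validate: recovered.getName()}));
print("fast count after validate: " + recovered.count());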
Related to:
- SERVER-34977 subtract capped deletes from fastcount during replication recovery (Closed)
- SERVER-25025 Improve startup time when there are tens of thousands of collections/indexes on WiredTiger (Closed)