Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.0.0-rc6, 4.1.1
Affects Version/s: None
Component/s: Replication
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:
v4.0
Sprint:
Repl 2018-06-04, Repl 2018-06-18
Linked BF Score:
65
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

The following "algebra" should explain what currently happens. One open question is whether capped deletes get undone when we recover to a timestamp. Assuming they do get undone, we have the following.

itcount and fastcount at stable timestamp = A
Operations between stable timestamp and common point = B
capped deletes between stable timestamp and common point = Cd1

Operations between common point and top of oplog = Diff
capped deletes between common point and top of oplog = Cd2
itcount and fastcount when rollback begins = A + B + Diff - Cd1 - Cd2

---- recover to stable timestamp ----

itcount = A
fastcount = A + B + Diff - Cd1 - Cd2

---- replication recovery ----

Operations during replication recovery = B
capped deletes during replication recovery = Cd3
itcount after replication recovery = A + B - Cd3
fastcount = ??? (depends on which collections are marked for size adjustment)

---- reset counts after rollback by Diff ----

itcount = A + B - Cd3
fastcount = A + B + Diff - Cd1 - Cd2 - Diff = A + B - Cd1 - Cd2

This is a problem since we do not know Cd1 or Cd2. If, however, capped deletes are not undone (i.e. they're not timestamped), then the itcount after recover to stable timestamp is A - Cd1 - Cd2, and the itcount after replication recovery is A + B - Cd1 - Cd2 - Cd3.
In that case the fastcount after the reset (A + B - Cd1 - Cd2) is off by exactly Cd3, so we can just subtract Cd3 to be correct. It should be safe for capped deletes not to be undone when recovering to a timestamp, since users already expect documents in a capped collection to get aged out anyway.
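The bookkeeping above can be replayed with concrete numbers. The following is a minimal sketch, not server code: `Counts` and `simulate` are hypothetical names, and it assumes the fastcount is left untouched during replication recovery (the "???" step), which is what the final fastcount formula in the description implies.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    itcount: int    # documents found by actually iterating the collection
    fastcount: int  # cached count, which is not rewound with the data


def simulate(A, B, Diff, Cd1, Cd2, Cd3, capped_deletes_undone):
    """Replay the rollback steps from the description (hypothetical helper)."""
    # When rollback begins, both counts agree.
    start = A + B + Diff - Cd1 - Cd2
    c = Counts(itcount=start, fastcount=start)

    # Recover to stable timestamp: the data is rewound, the fastcount is not.
    # If capped deletes are not timestamped, they stay deleted.
    c.itcount = A if capped_deletes_undone else A - Cd1 - Cd2

    # Replication recovery: reapply B operations, with Cd3 new capped deletes.
    # Assumption: fastcount is unchanged during this step.
    c.itcount += B - Cd3

    # Reset counts after rollback by Diff.
    c.fastcount -= Diff
    return c


if __name__ == "__main__":
    undone = simulate(100, 10, 5, 3, 2, 4, capped_deletes_undone=True)
    kept = simulate(100, 10, 5, 3, 2, 4, capped_deletes_undone=False)
    # Error depends on the unknown Cd1 and Cd2 when deletes are undone.
    print(undone, undone.fastcount - undone.itcount)
    # Error is exactly Cd3 when deletes are not undone.
    print(kept, kept.fastcount - kept.itcount)
```

With capped deletes undone, the fastcount error is Cd3 - Cd1 - Cd2, which cannot be corrected without knowing Cd1 and Cd2. With capped deletes kept, the error is Cd3 alone.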

We can either keep track of the capped deletions and subtract them out, or turn off capped deletion during replication recovery and do it all at once at the end.
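The two options can be compared on a toy model. This is only an illustrative sketch under stated assumptions: `ToyCappedCollection` and both `recover_*` functions are hypothetical and do not correspond to any server or storage engine API. The collection starts with capped deletes not undone, and its fastcount already holds the post-reset value A + B - Cd1 - Cd2.

```python
from collections import deque


class ToyCappedCollection:
    """Toy capped collection: keeps at most max_docs, oldest aged out first."""

    def __init__(self, max_docs, docs, fastcount):
        self.max_docs = max_docs
        self.docs = deque(docs)
        self.fastcount = fastcount  # cached count, may disagree with itcount

    def itcount(self):
        return len(self.docs)

    def trim(self):
        """Age out the oldest documents; return how many were deleted."""
        deleted = 0
        while len(self.docs) > self.max_docs:
            self.docs.popleft()
            deleted += 1
        return deleted


def recover_tracking_deletes(coll, ops):
    """Option 1: trim while applying, track capped deletes, subtract them."""
    cd3 = 0
    for doc in ops:
        coll.docs.append(doc)
        cd3 += coll.trim()
    coll.fastcount -= cd3
    return cd3


def recover_deferring_deletes(coll, ops):
    """Option 2: no capped deletion while applying; trim once at the end."""
    coll.docs.extend(ops)
    cd3 = coll.trim()
    coll.fastcount -= cd3
    return cd3


if __name__ == "__main__":
    ops = [4, 5, 6]  # B = 3 operations to reapply
    a = ToyCappedCollection(5, [0, 1, 2, 3], fastcount=4 + len(ops))
    b = ToyCappedCollection(5, [0, 1, 2, 3], fastcount=4 + len(ops))
    print(recover_tracking_deletes(a, ops), a.fastcount, a.itcount())
    print(recover_deferring_deletes(b, ops), b.fastcount, b.itcount())
```

In this toy model both options end with the same documents and with fastcount equal to itcount. The second option lets the collection temporarily exceed its cap during recovery.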

I think this applies to both rollback and replication recovery during startup, but there may be a reason it doesn't happen at startup.

is related to

SERVER-34976 clear the "needing size adjustment" set at the beginning of replication rollback (Closed)
SERVER-52833 Capped collections can contain too many documents after replication recovery (Closed)
SERVER-35431 rollback does not correct sizeStorer data sizes (Backlog)

related to

SERVER-35435 Renaming during replication recovery incorrectly allows size adjustments (Closed)
SERVER-35483 rollback makes config.transactions fastcount inaccurate (Closed)
SERVER-35052 Turn off fastcount checks on capped collections in rollback fuzzer (Closed)

Assignee: Judah Schvimer
Reporter: Judah Schvimer
Participants: Githook User, Judah Schvimer
Votes: 0
Watchers: 6

Created: May 14 2018 01:46:48 PM UTC
Updated: Oct 29 2023 10:31:50 PM UTC
Resolved: Jun 12 2018 06:18:09 PM UTC
Confidence Status Last Update: 05/Jun/18 4:52 PM
