- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- ALL
- v4.0
- Repl 2018-06-04, Repl 2018-06-18
- 65
The following "algebra" should explain what's currently happening. One open question is whether capped deletes get undone when we recover to a timestamp. Assuming they do get undone, we have the following.
itcount and fastcount at stable timestamp = A
Operations between stable timestamp and common point = B
capped deletes between stable timestamp and common point = Cd1
Operations between common point and top of oplog = Diff
capped deletes between common point and top of oplog = Cd2
itcount and fastcount when rollback begins = A + B + Diff - Cd1 - Cd2
---- recover to stable timestamp ----
itcount = A
fastcount = A + B + Diff - Cd1 - Cd2
---- replication recovery ----
Operations during replication recovery = B
capped deletes during replication recovery = Cd3
itcount after replication recovery = A + B - Cd3
fastcount = ??? (depending on which collections are marked for size adjustment)
---- reset counts after rollback by Diff ----
itcount = A + B - Cd3
fastcount = A + B + Diff - Cd1 - Cd2 - Diff = A + B - Cd1 - Cd2
This is a problem since we have no way of knowing Cd1 and Cd2. If, however, capped deletes are not undone (i.e. they're not timestamped), then the itcount after recover to stable timestamp is A - Cd1 - Cd2, and the itcount after replication recovery is A + B - Cd1 - Cd2 - Cd3.
In that case the fastcount (A + B - Cd1 - Cd2) is off by exactly Cd3, so we can just subtract Cd3 to be correct. It should be safe for capped deletes not to be undone by recover to a timestamp, since users already have to expect documents in a capped collection to get aged out anyway.
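The algebra above can be checked with a small model. This is only a sketch of the bookkeeping described in this ticket, not server code: the `Scenario` and `final_counts` names are made up for illustration, "operations" are treated as net inserts as in the algebra, and the fastcount is assumed to be untouched by replication recovery itself (i.e. no collection is marked for size adjustment).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Scenario:
    """Hypothetical container for the quantities named in the algebra."""
    A: int     # itcount and fastcount at the stable timestamp
    B: int     # operations between stable timestamp and common point
    Cd1: int   # capped deletes between stable timestamp and common point
    Diff: int  # operations between common point and top of oplog
    Cd2: int   # capped deletes between common point and top of oplog
    Cd3: int   # capped deletes during replication recovery


def final_counts(s: Scenario, deletes_timestamped: bool) -> Tuple[int, int]:
    """Return (itcount, fastcount) after the reset-by-Diff step."""
    # Both counts agree when rollback begins.
    fastcount = s.A + s.B + s.Diff - s.Cd1 - s.Cd2
    # Recover to stable timestamp: only the real data (itcount) changes.
    # If capped deletes are not timestamped, they are not undone.
    itcount = s.A if deletes_timestamped else s.A - s.Cd1 - s.Cd2
    # Replication recovery reapplies B and performs Cd3 capped deletes.
    itcount += s.B - s.Cd3
    # Reset counts after rollback by Diff (assumes recovery itself did
    # not adjust the fastcount).
    fastcount -= s.Diff
    return itcount, fastcount


s = Scenario(A=100, B=10, Cd1=4, Diff=5, Cd2=2, Cd3=3)

# Timestamped deletes: the gap is Cd1 + Cd2 - Cd3, which we cannot know.
it_ts, fast_ts = final_counts(s, deletes_timestamped=True)
print(it_ts - fast_ts, s.Cd1 + s.Cd2 - s.Cd3)

# Untimestamped deletes: the fastcount is too high by exactly Cd3.
it_nts, fast_nts = final_counts(s, deletes_timestamped=False)
print(fast_nts - it_nts, s.Cd3)
```

With timestamped deletes the error depends on Cd1 and Cd2, which are not observable after the fact. With untimestamped deletes the only error term is Cd3, which happens during recovery and can therefore be counted.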
We can either keep track of the capped deletions and subtract them out, or turn off capped deletion during replication recovery and do it all at once at the end.
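The first option (tracking capped deletions and subtracting them out) could look roughly like the sketch below. `CappedDeleteTracker` and its methods are hypothetical names invented for this illustration and do not correspond to any existing server class. It also assumes capped deletes are not undone by recover to a timestamp, so that the deletes performed during replication recovery (Cd3) are the only error in the fastcount.

```python
from typing import Dict


class CappedDeleteTracker:
    """Hypothetical helper: counts capped deletes per namespace during
    replication recovery so they can be subtracted from the fastcount."""

    def __init__(self) -> None:
        self._deleted: Dict[str, int] = {}

    def on_capped_delete(self, namespace: str, count: int = 1) -> None:
        # Called whenever recovery ages documents out of a capped collection.
        self._deleted[namespace] = self._deleted.get(namespace, 0) + count

    def corrected_fastcount(self, namespace: str, fastcount_after_reset: int) -> int:
        # Subtract Cd3 (deletes seen during recovery) from the fastcount
        # that results from the reset-by-Diff step.
        return fastcount_after_reset - self._deleted.get(namespace, 0)


tracker = CappedDeleteTracker()
for _ in range(3):  # three capped deletes observed during recovery (Cd3 = 3)
    tracker.on_capped_delete("test.capped")

print(tracker.corrected_fastcount("test.capped", 104))
```

The second option (turning off capped deletion during recovery and doing it all at once at the end) would avoid the tracking entirely, at the cost of the collection temporarily exceeding its cap.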
I think this applies to both rollback and replication recovery during startup, but there may be a reason it doesn't happen at startup.
- is related to
  - SERVER-34976 clear the "needing size adjustment" set at the beginning of replication rollback (Closed)
  - SERVER-52833 Capped collections can contain too many documents after replication recovery (Closed)
  - SERVER-35431 rollback does not correct sizeStorer data sizes (Backlog)
- related to
  - SERVER-35435 Renaming during replication recovery incorrectly allows size adjustments (Closed)
  - SERVER-35483 rollback makes config.transactions fastcount inaccurate (Closed)
  - SERVER-35052 Turn off fastcount checks on capped collections in rollback fuzzer (Closed)