- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- v4.4, v4.2
- Repl 2021-05-03, Repl 2021-05-17
After a restore, users generally don't need to roll back or do point-in-time (PIT) reads earlier than the top of the oplog.
Replication recovery can also take a very long time after a restore, and the stable/oldest timestamp cannot advance during replication recovery. This is undesirable even with durable history, and it can lead to very poor performance in 4.2, which predates durable history.
We should provide a startup parameter that, when configured, applies oplog entries either:
- without timestamps to create no history, or
- with timestamps, but advancing the stable/oldest timestamp between batches
so that the storage engine can evict history.
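The second option can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyStorageEngine`, `set_stable_and_oldest`, and `recover` are hypothetical names invented for illustration and are not the server's or the storage engine's actual API. It shows why advancing the stable/oldest timestamp between batches bounds how much history has to be retained.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStorageEngine:
    """Hypothetical stand-in for a storage engine that keeps timestamped history."""
    stable_ts: int = 0
    history: dict = field(default_factory=dict)  # key -> list of (ts, value)

    def write(self, key, value, ts):
        # Every timestamped write adds a version that must be retained
        # until the oldest timestamp moves past it.
        self.history.setdefault(key, []).append((ts, value))

    def set_stable_and_oldest(self, ts):
        # Advance stable/oldest to ts and evict versions that can no longer
        # be read: keep the newest version at or before ts, plus anything newer.
        self.stable_ts = ts
        for key, versions in self.history.items():
            older = [v for v in versions if v[0] <= ts]
            newer = [v for v in versions if v[0] > ts]
            self.history[key] = older[-1:] + newer

    def retained_versions(self):
        return sum(len(v) for v in self.history.values())


def recover(engine, oplog, batch_size, advance_between_batches):
    """Apply oplog entries (ts, key, value) in batches; return peak retained versions."""
    peak = 0
    for start in range(0, len(oplog), batch_size):
        batch = oplog[start:start + batch_size]
        for ts, key, value in batch:
            engine.write(key, value, ts)
        peak = max(peak, engine.retained_versions())
        if advance_between_batches:
            # Assumes the oplog is ordered by timestamp, so the last entry
            # of the batch carries the batch's highest timestamp.
            engine.set_stable_and_oldest(batch[-1][0])
    return peak
```

In this model, replaying 100 updates to a single key in batches of 10 retains all 100 versions when the timestamps never advance, but never more than one batch plus one version when they advance after each batch.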
We may have to set the initial data timestamp at the end of recovery to prevent rollbacks or reads before that timestamp. We also need to consider what happens when the node crashes halfway through recovery, and make sure it doesn't corrupt data in that case.
This should only be supported and used in Atlas.
Note that if a rollback were necessary to a point before the end of the recovery, the rollback would fail unrecoverably. If the restore was used to seed a new replica set, it is not expected that a node in that set would roll back to a point before the last seeded oplog entry.
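The rollback constraint described above can be sketched as a simple guard. This is a minimal illustration, not the server's rollback code: `check_rollback_allowed`, `UnrecoverableRollbackError`, and both parameter names are hypothetical, and timestamps are modeled as plain integers.

```python
class UnrecoverableRollbackError(Exception):
    """Hypothetical error: the rollback target predates the end of recovery."""


def check_rollback_allowed(common_point_ts, end_of_recovery_ts):
    # History before the end of recovery was not kept, so a rollback
    # to any earlier point cannot be satisfied.
    if common_point_ts < end_of_recovery_ts:
        raise UnrecoverableRollbackError(
            f"cannot roll back to {common_point_ts}: "
            f"no history before {end_of_recovery_ts}"
        )
    return common_point_ts
```

A rollback whose common point is at or after the end of recovery passes the guard unchanged; any earlier common point raises.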
Credit to lingzhi.deng for this idea.
Causes:
- SERVER-81878 startupRecoveryForRestore may not play nicely with collection drop applied during startup recovery (Closed)
- SERVER-81879 startupRecoveryForRestore can drop tables whose catalog write is not yet checkpointed (Closed)
Related to:
- SERVER-55483 Add a new startup parameter that skips verifying the table log settings (Closed)