Recently some questions arose concerning the interaction between timestamps and logging, and whether using both at once is a viable model for applications. Commits to logged tables are always immediately durable, so the concepts of rollback-to-stable and a stable timestamp do not apply, but it might still make sense to manage the oldest timestamp and to maintain and read from history.
However, this doesn't work. The major problem is that timestamp information is not written to the log, so any value recovered from the log loses its timestamp. For example, if two successive values are committed and written to the log, and a crash happens before the next checkpoint, both are recovered without timestamps: other history might remain, but the first of the two values will be lost.
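The failure described above can be illustrated with a toy model. This is only a sketch, not WiredTiger code: `ToyLoggedTable`, its methods, and the rule that a replayed value replaces the key's version chain are all assumptions made for illustration, and the model assumes no checkpoint happened between the commits and the crash.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLoggedTable:
    """Hypothetical toy model of a logged table; not the real implementation."""

    # key -> list of (commit_ts, value), oldest first
    versions: dict = field(default_factory=dict)
    # log records hold only (key, value): the timestamp is not logged
    log: list = field(default_factory=list)

    def commit(self, key, value, commit_ts):
        self.versions.setdefault(key, []).append((commit_ts, value))
        self.log.append((key, value))

    def read(self, key, read_ts):
        # A version with no timestamp (None) is visible to every reader.
        visible = [
            value
            for ts, value in self.versions.get(key, [])
            if ts is None or ts <= read_ts
        ]
        return visible[-1] if visible else None

    def crash_and_recover(self):
        # Assumption: nothing was checkpointed, so all state comes from
        # replaying the log, and each replayed value has no timestamp.
        recovered = ToyLoggedTable(log=list(self.log))
        for key, value in self.log:
            recovered.versions[key] = [(None, value)]
        return recovered


table = ToyLoggedTable()
table.commit("k", "v1", commit_ts=10)
table.commit("k", "v2", commit_ts=20)
assert table.read("k", read_ts=15) == "v1"  # history is readable before the crash

recovered = table.crash_and_recover()
assert recovered.read("k", read_ts=15) == "v2"  # "v1" is gone after recovery
```

In this model a reader at timestamp 15 sees the first value before the crash and the second value after recovery, which is the loss of history the paragraph above describes.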
There is at least one additional problem, which is that rollback-to-stable will remove history store entries associated with logged tables if they are newer than the rollback timestamp; this would need to be changed to skip over the parts of the history store associated with logged tables.
The conclusion seems to be that using timestamps and logging together is not viable and shouldn't be allowed. However, it needs to be possible to write to a logged table as part of a timestamped transaction (the oplog does this) so it isn't immediately clear what the best way to enforce this restriction is.
The leading candidate so far is to drop timestamp information from updates to logged tables at the time they're created. This has a number of benefits (e.g. nothing special needs to be done on the read path, and nothing special needs to be done to avoid moving values to the history store), but it is not yet clear whether it will actually be workable.
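A minimal sketch of that candidate, again as a toy model rather than actual WiredTiger code. The `Update` type, the `make_update` helper, and the table names are hypothetical; the only point is that the timestamp is discarded when the update is created, so a single timestamped transaction can still write to both kinds of table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Update:
    """Hypothetical update record for illustration only."""

    table: str
    key: str
    value: str
    commit_ts: Optional[int]  # None means "no timestamp"


def make_update(table, key, value, txn_commit_ts, logged_tables):
    # Assumption: whether a table is logged is known when the update is
    # created, so the timestamp can be dropped here and never stored.
    ts = None if table in logged_tables else txn_commit_ts
    return Update(table, key, value, ts)


# One timestamped transaction writing to a logged and a non-logged table.
logged_tables = {"oplog"}
oplog_update = make_update("oplog", "k", "v", 42, logged_tables)
data_update = make_update("collection", "k", "v", 42, logged_tables)

assert oplog_update.commit_ts is None  # timestamp dropped at creation
assert data_update.commit_ts == 42     # timestamp kept for the non-logged table
```

Because the update to the logged table never carries a timestamp in this sketch, code that reads it or decides whether to move it to the history store would need no special case.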
The current documentation does not mention this combination (AFAIK); the questions arose from wondering whether it should, as part of other ongoing doc changes. The docs should be updated once the way forward becomes clearer.
- depends on
  - SERVER-63308 Accommodate WT-8601 (Closed)
- is depended on by
  - WT-6637 Log recovery is nontimestamped and can overwrite some of the records in the checkpoint (Closed)
- is related to
  - SERVER-60037 Enable the ordered timestamp assertion in MongoDB (Closed)
- related to
  - WT-8906 Restore the change to set the btree handle logging flags (Closed)
  - WT-6639 Move checkpoint start log record to before we refresh the checkpoint snapshot (Closed)
  - WT-8793 enhance logging-based testing (Backlog)