The assertion at __rollback_ondisk_fixup_key, 455: hs_stop_durable_ts <= newer_hs_durable_ts || hs_start_ts == hs_stop_durable_ts || hs_start_ts == newer_hs_durable_ts || first_record fires during recovery in test/checkpoint.
The core dump gives:
(gdb) p hs_stop_durable_ts
$1 = 18446744073709551615
(gdb) p unpack.tw
$2 = {durable_start_ts = 1074, start_ts = 1074, start_txn = 20762, durable_stop_ts = 1074, stop_ts = 1074, stop_txn = 20762, prepare = 0 '\000'}
(gdb) p newer_hs_durable_ts
$3 = 1074
(gdb)
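newer_hs_durable_ts has the same timestamp (1074) as the on-page value, while hs_stop_durable_ts is still the maximum timestamp (18446744073709551615, i.e. UINT64_MAX).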
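To make the failing condition easier to read, here is a minimal sketch that evaluates the asserted expression as a standalone function. It is not the WiredTiger source: `fixup_invariant_holds` and `TS_MAX` are hypothetical names, and the core dump does not show `hs_start_ts` or `first_record`, so those have to be supplied by the caller as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for the max timestamp seen in the dump (2^64 - 1). */
#define TS_MAX UINT64_MAX

/*
 * Hypothetical helper: returns true when the condition quoted in the
 * assertion message holds for the given values.
 */
bool
fixup_invariant_holds(uint64_t hs_start_ts, uint64_t hs_stop_durable_ts,
  uint64_t newer_hs_durable_ts, bool first_record)
{
    return (hs_stop_durable_ts <= newer_hs_durable_ts ||
      hs_start_ts == hs_stop_durable_ts ||
      hs_start_ts == newer_hs_durable_ts || first_record);
}
```

With the dumped values (hs_stop_durable_ts = TS_MAX, newer_hs_durable_ts = 1074), the first disjunct is false. For any hs_start_ts other than TS_MAX or 1074, and with first_record false, the whole condition is false, which matches the assertion firing.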
The root cause is a race between committing a prepared update and checkpoint.
The sequence is:
The user thread commits a prepared update.
It marks the update as resolved.
Another user thread adds an update to the key.
Checkpoint writes the new update to the data store and the just committed prepared update to the history store.
Checkpoint then writes the history store while the update older than the prepared update still has a max stop timestamp.
The max timestamp is fixed only after checkpoint has visited that history store page.
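The ordering problem in the sequence above can be sketched as a small deterministic simulation. This is only an illustration under stated assumptions: `hs_record`, `checkpoint_visit`, `fix_stop_timestamp` and `simulate` are hypothetical names rather than WiredTiger functions, the start timestamp 1000 is made up, and the real race happens between threads rather than in a single function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for the max timestamp seen in the core dump. */
#define TS_MAX UINT64_MAX

/* Hypothetical, simplified history store record. */
struct hs_record {
    uint64_t start_ts;
    uint64_t stop_durable_ts;
};

/* Checkpoint captures whatever the live record contains when it visits the page. */
struct hs_record
checkpoint_visit(const struct hs_record *live)
{
    return (*live);
}

/* Replace a max stop timestamp with the commit's durable timestamp. */
void
fix_stop_timestamp(struct hs_record *live, uint64_t commit_durable_ts)
{
    if (live->stop_durable_ts == TS_MAX)
        live->stop_durable_ts = commit_durable_ts;
}

/* Return the record as it appears in the checkpoint for a given ordering. */
struct hs_record
simulate(bool fixup_before_checkpoint)
{
    /* The older update sits in the history store with a max stop timestamp. */
    struct hs_record live = {1000, TS_MAX}; /* 1000 is a made-up start timestamp. */
    struct hs_record image;

    if (fixup_before_checkpoint)
        fix_stop_timestamp(&live, 1074);
    image = checkpoint_visit(&live);
    if (!fixup_before_checkpoint)
        fix_stop_timestamp(&live, 1074); /* Too late: the image is already written. */
    return (image);
}
```

When the fix-up runs after the checkpoint visit, the checkpointed record keeps the max stop timestamp, which is the state recovery later trips over. When the fix-up runs first, the checkpoint sees 1074.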
Related to: WT-7958 Include recovery in test/checkpoint (Closed)