When timestamp support was added to test/checkpoint in WT-3181, some simplifying assumptions were made about timestamp management. In particular, the test periodically moves the oldest_timestamp to 100 ticks behind the timestamp of some active transaction. In addition, the timestamp chosen for checkpoints themselves is not coordinated with either commit timestamps or the specified oldest timestamp.
All of this means that test/checkpoint behaves as expected when updates commit close to the specified order, but if a running transaction falls more than 100 ticks behind the current timestamp, the test breaks the assumptions in WiredTiger's timestamp API.
For example, this is possible:
./t -T 3 -t m -s
t: process 3057
1: 1 workers, 3 tables
checkpointer thread starting: tid: 3057:0x7ff6cee51700
worker thread starting: tid: 3057:0x7ff6c7fff700
Finished a checkpoint
Finished verifying a checkpoint with 3 tables and 310 keys
Finished a checkpoint
Finished verifying a checkpoint with 3 tables and 2956 keys
[1498184679:747492][3057:0x7ff6cee51700], t, WT_SESSION.checkpoint: read timestamp 13a8 older than oldest timestamp: Invalid argument
t: session.checkpoint: Invalid argument
Ran workers for: 0.203767 seconds
Closing connection
Two main tasks:
- make test/checkpoint use timestamps robustly – updating the oldest timestamp must wait for running transactions to complete, and commit timestamps cannot be in the past of a checkpoint timestamp; and
- make WiredTiger's own checks for "following the timestamp API rules" more robust: at least in diagnostic mode, a violation should produce errors and/or assertion failures.
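
The ordering rules behind the first task can be sketched as a small standalone model. This is only an illustration under stated assumptions: the `ts_state` struct and the three helper functions are hypothetical names, they are not part of the WiredTiger API or of test/checkpoint, and timestamps are modelled as plain 64-bit integers. The sketch assumes the test can list the timestamps of its active transactions, and it treats a commit timestamp equal to the checkpoint timestamp as acceptable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the test; not a WiredTiger structure. */
typedef struct {
    uint64_t oldest_ts;     /* last oldest timestamp the test published */
    uint64_t checkpoint_ts; /* timestamp used by the most recent checkpoint */
} ts_state;

/*
 * Return the largest value the oldest timestamp may move to without passing
 * any active transaction: the minimum active timestamp, or current_ts when
 * nothing is running. The result never moves backward from st->oldest_ts.
 */
static uint64_t
safe_oldest_ts(const ts_state *st, const uint64_t *active_ts, size_t n_active,
    uint64_t current_ts)
{
    uint64_t candidate = current_ts;
    size_t i;

    assert(st != NULL);
    assert(n_active == 0 || active_ts != NULL);

    for (i = 0; i < n_active; i++)
        if (active_ts[i] < candidate)
            candidate = active_ts[i];

    return (candidate > st->oldest_ts ? candidate : st->oldest_ts);
}

/* A read timestamp must not be older than the oldest timestamp. */
static bool
read_ts_allowed(const ts_state *st, uint64_t read_ts)
{
    return (read_ts >= st->oldest_ts);
}

/* Assumed rule: a commit timestamp must not be earlier than the checkpoint's. */
static bool
commit_ts_allowed(const ts_state *st, uint64_t commit_ts)
{
    return (commit_ts >= st->checkpoint_ts);
}
```

Compared with a fixed 100-tick lag, bounding the oldest timestamp by the slowest active transaction has the same effect as waiting for that transaction to finish before moving past it, so a read timestamp like the one in the failure above would not end up older than the oldest timestamp.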