By default, checkpoint cursors read checkpointed data using the checkpoint stable timestamp as the read timestamp, together with the checkpoint transaction snapshot.
This can go wrong when a prepared transaction commits in parallel with the checkpoint. The on-disk checkpoint data may then contain torn transaction information that cannot be verified using the checkpoint snapshot and stable timestamp.
For example, if a prepared transaction updated two tables, the on-disk checkpoint may contain the following state:
- Table 1 has the update in the prepare-in-progress state
- Table 2 has the update in the committed state
This torn transaction scenario is possible when the prepared transaction commits while the checkpoint is running.
The commit timestamp of a prepared update can be less than the checkpoint stable timestamp, so a visibility check based on the commit timestamp alone can return inconsistent data from a checkpoint cursor.
To avoid returning inconsistent data, a checkpoint cursor must check, in addition to the commit timestamp visibility check, whether the data is stable according to the checkpoint stable timestamp.
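
The two-table scenario and the combined check can be illustrated with a small toy model. This is only a sketch, not the engine's actual code: the `OnDiskUpdate` type, both function names, and the timestamp values are hypothetical, and it assumes that "stable" is judged by comparing a per-update `durable_ts` field against the checkpoint stable timestamp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OnDiskUpdate:
    """Toy model of one table's on-disk update (hypothetical type)."""
    prepared: bool              # True if written while prepare was still in progress
    commit_ts: Optional[int]    # None until the transaction has committed
    durable_ts: Optional[int]   # assumption: timestamp at which the update counts as stable


def commit_ts_only_visible(upd: OnDiskUpdate, stable_ts: int) -> bool:
    """Visibility based on the commit timestamp alone."""
    if upd.prepared:
        return False
    return upd.commit_ts <= stable_ts


def checkpoint_cursor_visible(upd: OnDiskUpdate, stable_ts: int) -> bool:
    """Commit timestamp check plus a stability check against the stable timestamp."""
    if not commit_ts_only_visible(upd, stable_ts):
        return False
    return upd.durable_ts <= stable_ts


CKPT_STABLE_TS = 20

# The same prepared transaction, captured at different moments by the checkpoint.
table_1 = OnDiskUpdate(prepared=True, commit_ts=None, durable_ts=None)
table_2 = OnDiskUpdate(prepared=False, commit_ts=15, durable_ts=25)

# Commit timestamp only: table 2's update is visible but table 1's is not (torn).
torn_view = [commit_ts_only_visible(u, CKPT_STABLE_TS) for u in (table_1, table_2)]

# Combined check: neither update is visible, so the view is consistent.
safe_view = [checkpoint_cursor_visible(u, CKPT_STABLE_TS) for u in (table_1, table_2)]
```

In this model the commit-timestamp-only check exposes half of the transaction, while the combined check hides both updates because the committed one is not yet stable as of the checkpoint.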