While working on WT-8074 I found that the modifications I was making to __wt_time_window_clear_obsolete caused testing fallout, I then went and tested the same on develop.
If one comments out the call to __wt_time_window_clear_obsolete on develop it will result in the following tests failing reliably:
- test_txn16
- test_tiered03
- test_import03
- test_import04
- test_import09
__wt_time_window_clear_obsolete was originally implemented as an optimization and shouldn't have any significant impact on WiredTiger's behaviour. However given the numerous failures it's clear this is not true.
I have spent some time chasing the failure on test_import03 and I think the following might be happening:
- The test imports a file into a new database and calls session->verify.
- The verify code requires exclusive access to the dhandle so it calls __wt_exclusive_handle_operation
- This then calls __checkpoint_tree which dirties the root node of the tree.
- As we're doing a closing checkpoint we call __wt_evict_file which reconciles the root node as it's been dirtied. It doesn't touch any child nodes here as they're clean.
- During the reconciliation of the root node we call __wt_cell_unpack_addr, which is followed by: __cell_unpack_window_cleanup, the latter call removes the value held in tw->newest_txn for the root node.
- Following that the root node's time-window is compared to it's child's, the child's time-window still has the newest_txn value that was cleared on the parent. As such the child's time-window is now newer than the parents. As its newest_txn is greater than it's parents.
- This fails the verification of the b-tree.
I wrote a "fix" for the test_import03 failure to get a sense of whether I was looking in the right place, the following patch (which includes the remove of the time window clearing function in rec_visbility.c) should get test_import03 to pass (but not all failing tests will pass):
diff --git a/src/evict/evict_file.c b/src/evict/evict_file.c index f6091a443..409d53758 100644 --- a/src/evict/evict_file.c +++ b/src/evict/evict_file.c @@ -46,7 +46,10 @@ __wt_evict_file(WT_SESSION_IMPL *session, WT_CACHE_OP syncop) WT_ERR(__wt_tree_walk(session, &next_ref, walk_flags)); while ((ref = next_ref) != NULL) { page = ref->page; - + if (syncop == WT_SYNC_CLOSE) { + WT_ERR(__wt_page_modify_init(session, page)); + __wt_page_modify_set(session, page); + } /* * Eviction can fail when a page in the evicted page's subtree switches state. For example, * if we don't evict a page marked empty, because we expect it to be merged into its parent, diff --git a/src/reconcile/rec_visibility.c b/src/reconcile/rec_visibility.c index 24b8e862d..6d17a1e38 100644 --- a/src/reconcile/rec_visibility.c +++ b/src/reconcile/rec_visibility.c @@ -776,8 +776,8 @@ __wt_rec_upd_select(WT_SESSION_IMPL *session, WT_RECONCILE *r, WT_INSERT *ins, W !vpack->tw.prepare && (upd_saved || F_ISSET(vpack, WT_CELL_UNPACK_OVERFLOW))) WT_ERR(__rec_append_orig_value(session, page, upd_select->upd, vpack)); - __wt_time_window_clear_obsolete( - session, &upd_select->tw, r->rec_start_oldest_id, r->rec_start_pinned_ts); + //__wt_time_window_clear_obsolete( + // session, &upd_select->tw, r->rec_start_oldest_id, r->rec_start_pinned_ts); err: __wt_scr_free(session, &tmp); return (ret);
The fix isn't a real fix as it dirties the child page manually to get the child's time window to be re-written. The actual issue may be more to do with write generations but I'm unsure at this point. It's worth noting that test_import03 is a column store test but I don't think this a column store specific issue.
The issue I saw can be described as:
- We dirty the root page
- We clean it's time-window if it's dsk write_gen is older than the session write_gen.
- If the children of the root page are clean they won't be re-written and in theory will have their time-window's made invalid with respect to their parent's.
The work on this ticket is to:
- Implement the breaking change by commenting out the call to: __wt_time_window_clear_obsolete in rec_visibility.c
- Investigate the test failures and suggest a solution.