-
Type: Task
-
Resolution: Done
-
Affects Version/s: None
-
Component/s: None
-
None
If doing an insert workload in WiredTiger with an append characteristic (and not bulk loading), the tree tends to grow deeper and deeper with WT_PM_SPLIT_MERGE pages. This happens because as we load items into the tree pages hit the memory_page_max limit and we force evict them. Each time a page is force evicted we create a new split merge parent page - that increases the depth of the tree.
We have code that manages the depth of split merge sub-trees by merging them together, but ideally the tree would not grow as deep.
One possible option would be to:
- Attempt to lock the parent page when doing a forced eviction
- If we don't get the lock, do what we do now
- If we do get the lock, then split the child page as before, but update the existing parent with the new child, rather than swapping in a new split-merge page.
I've attached a picture that attempts to explain more clearly.
NOTES:
This problem isn't isolated to append only workloads - though it's simpler to reason about the problem with an append workload. Any solution should be extendable to general insert workloads (e.g: hot spots in the middle of the tree).
To reproduce trees with deep split merge sub-trees you can use the following Python code snippet:
self.session.create(self.uri, 'value_format=S,key_format=S') cursor = self.session.open_cursor(self.uri, None) for i in range(1, 10000000): cursor.set_key(key_populate(cursor, i)) cursor.set_value(('%010d' % i) * 400) cursor.insert() cursor.close()