-
Type: Improvement
-
Resolution: Won't Do
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Cursors
-
2023-07-11 WiredTractor
-
Not Needed
Summary
Cursor::next() seems to be slower than it needs to be, and in some workloads it can dominate performance. One possible improvement would be to iterate the raw cell format directly in the page images of clean leaf pages, rather than walking the indexing structures built on top of it. In addition to likely being faster to iterate, this would also save time on page-in if we lazily built the indexing structures when they are first needed.
Motivation
- Does this affect any team outside of WT?
Who doesn't benefit from a faster cursor?
- How likely is it that this use case or problem will occur?
This will be extremely common for expected column store index workloads.
- If the problem does occur, what are the consequences and how severe are they?
Lower performance than apparently possible.
- Is this issue urgent?
No
Acceptance Criteria (Definition of Done)
(When will this ticket be considered done? What is the acceptance criteria for this ticket to be closed?)
- Testing
I think this only needs perf testing. Existing perf tests may be sufficient if there are any that hammer __curfile_next, especially with very small key/value pairs.
- Documentation update
There should be no user-visible behavior differences.
[Optional] Suggested Solution
Cursor next should check an atomic flag to see whether the page is still clean. If it is, it should read the next cells directly from the page image to construct the key and form a reference to the value, instead of iterating the indexing structures.
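A rough sketch of that idea follows, to make the control flow concrete. Everything in it is hypothetical: toy_page, toy_cursor, toy_cursor_next and the one-byte length-prefixed cell layout are invented for illustration and do not reflect WiredTiger's real page or cell formats. The only assumption taken from the ticket is that the page carries an atomic "clean" flag that the cursor can check cheaply.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical return codes for this sketch only. */
enum { TOY_OK = 0, TOY_NOTFOUND = 1, TOY_FALLBACK = -1 };

/*
 * Toy leaf page (NOT WiredTiger's layout). The image is a run of cells:
 *   [key length: 1 byte][key bytes][value length: 1 byte][value bytes] ...
 */
typedef struct {
    const uint8_t *image; /* raw page image as read from disk */
    size_t image_len;     /* number of bytes in the image */
    atomic_bool clean;    /* assumed to be cleared by any writer that modifies the page */
} toy_page;

typedef struct {
    toy_page *page;
    size_t off;           /* byte offset of the next unread cell */
    uint8_t key[255];     /* key is copied out (one-byte length, so at most 255 bytes) */
    size_t key_len;
    const uint8_t *value; /* value is a reference into the page image, not a copy */
    size_t value_len;
} toy_cursor;

/*
 * Advance to the next key/value pair by decoding cells straight from the image.
 * Returns TOY_FALLBACK when the page is no longer clean (or the image looks
 * malformed), in which case the caller would use the regular indexed path.
 */
static int
toy_cursor_next(toy_cursor *c)
{
    toy_page *p = c->page;
    size_t off = c->off;

    /* Cheap check first: only clean pages can be read from the raw image. */
    if (!atomic_load_explicit(&p->clean, memory_order_acquire))
        return TOY_FALLBACK;
    if (off >= p->image_len)
        return TOY_NOTFOUND;

    /* Decode the key cell, making sure the value length byte is also in range. */
    size_t klen = p->image[off++];
    if (off + klen + 1 > p->image_len)
        return TOY_FALLBACK;
    size_t vlen = p->image[off + klen];
    size_t voff = off + klen + 1;
    if (voff + vlen > p->image_len)
        return TOY_FALLBACK;

    /* Construct the key and point the value at the image bytes. */
    memcpy(c->key, p->image + off, klen);
    c->key_len = klen;
    c->value = p->image + voff;
    c->value_len = vlen;
    c->off = voff + vlen;
    return TOY_OK;
}
```

In this sketch the indexing structures are never touched on the fast path, so they would only need to be built the first time TOY_FALLBACK is returned or a search lands on the page. A real implementation would also have to handle a page becoming dirty between the flag check and the cell read, which this toy version ignores.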