-
Type: New Feature
-
Resolution: Won't Do
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
The interface I've thought about is an application callback that would amalgamate a list of values for a key into a single value. For example, an application could insert integers, and the callback could sum the list of values to calculate the final result.
There are many other use cases than summing integers, and by keeping the list in version order, the operations don't actually have to commute: applications could use tombstone values or otherwise combine values, as long as the result is deterministic for a given sequence of values regardless of how they are sliced and amalgamated.
This callback would be easy enough to apply during an LSM merge operation, rather than just taking the most recent value as we do currently.
This does bring up a few questions, though:
- in the case of WiredTiger, we would have to deal with the case where a value already exists in level 0 – this might mean amalgamating during an insert. There wouldn't be any reads to other levels, but it wouldn't be a pure append. That said, in WiredTiger, we are already using a skiplist to order level 0, so this shouldn't be a big overhead.
- it would be possible to either return the most recently amalgamated fragment, or (more useful in general) to amalgamate values across all levels in the tree on every read. The problem with the latter is that the average case for reads would become the worst case: we can't stop as soon as a value is found the way we do now. Is that a problem?