- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Integration
- Fully Compatible
- ALL
- v8.0
- QI 2024-06-10, QI 2024-06-24
For legacy interleaved sections, reference object traversal does not enter into arrays. For paths that produce array elements, the SBE implementation of Path::elementsToMaterialize() doesn't know not to traverse into arrays, and so requests materialization of these elements. But the block-based decompressor never finds these elements and so returns nothing.
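To illustrate the mismatch, here is a minimal toy model. It is not the server code: `Node`, `countScalars`, and `makeReferenceObject` are hypothetical names invented for this sketch, and the reference object `{a: [1, 2, 3]}` is a made-up example. One traversal enters arrays (roughly what the path expects to be materialized) and the other treats arrays as opaque leaves (roughly what the legacy interleaved traversal does).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a BSON element; NOT the server's BSONElement type.
struct Node {
    enum class Kind { Scalar, Object, Array };
    Kind kind;
    std::string name;
    std::vector<Node> children;  // sub-fields (Object) or elements (Array)
};

// Counts the scalar elements reachable under `node`. With enterArrays == false,
// an array is an opaque leaf and none of its elements are visited, which mimics
// the legacy interleaved traversal described above.
std::size_t countScalars(const Node& node, bool enterArrays) {
    if (node.kind == Node::Kind::Scalar) {
        return 1;
    }
    if (node.kind == Node::Kind::Array && !enterArrays) {
        return 0;
    }
    std::size_t total = 0;
    for (const Node& child : node.children) {
        total += countScalars(child, enterArrays);
    }
    return total;
}

// Hypothetical reference object {a: [1, 2, 3]}.
Node makeReferenceObject() {
    Node arr{Node::Kind::Array, "a", {}};
    for (int i = 0; i < 3; ++i) {
        arr.children.push_back(Node{Node::Kind::Scalar, std::to_string(i), {}});
    }
    return Node{Node::Kind::Object, "", {arr}};
}
```

In this model the array-entering traversal counts three elements for the reference object while the non-entering traversal counts none, which corresponds to elements being requested but never found.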
Even if SBE could somehow be made aware of legacy interleaved sections (this would add complexity), without changing the Path abstraction, the block-based decompressor would only be able to return the whole array, and post-processing would need to be added on the SBE side to extract the right elements.
I think that handling legacy interleaved sections in the presence of arrays is not worth the effort and added complexity.
Instead, I think we should avoid applying path-based decompression in SBE whenever min/max contains any arrays. That would keep SBE from requesting array elements that the block-based decompressor cannot find.
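A rough sketch of what such a guard could look like, again on a toy document model rather than real BSON. `Node`, `containsArray`, and `canUsePathBasedDecompression` are hypothetical names for illustration only, and the sketch assumes min and max are available as already-parsed trees.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a BSON element; NOT the server's BSONElement type.
struct Node {
    enum class Kind { Scalar, Object, Array };
    Kind kind;
    std::string name;
    std::vector<Node> children;  // sub-fields (Object) or elements (Array)
};

// Returns true if `node` is an array or contains one at any depth.
bool containsArray(const Node& node) {
    if (node.kind == Node::Kind::Array) {
        return true;
    }
    for (const Node& child : node.children) {
        if (containsArray(child)) {
            return true;
        }
    }
    return false;
}

// Hypothetical guard: only allow path-based decompression when neither
// min nor max contains an array anywhere.
bool canUsePathBasedDecompression(const Node& min, const Node& max) {
    return !containsArray(min) && !containsArray(max);
}
```

The check is deliberately conservative: a single array anywhere in either document disables the optimization, even for paths that never touch that array.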
An alternative solution that would allow us to cover more cases with path-based decompression would be to scan the BSONColumn for interleaved sections. This could have a performance impact for the cases where path-based decompression is not applied.
As part of this fix we should also add an assertion to the path-based decompressor that if Path::elementsToMaterialize() returns (e.g.) 3 elements, then BSONTraversal of the reference object should find those 3 elements.
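The shape of that assertion could be something like the sketch below. `checkAllRequestedElementsFound` is a hypothetical helper and the two counts are assumed to be gathered by the caller. The real decompressor would presumably use the server's own assertion facilities instead of throwing `std::logic_error`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical consistency check: every element the path asked to materialize
// must have been located while traversing the reference object.
void checkAllRequestedElementsFound(std::size_t requested, std::size_t found) {
    if (requested != found) {
        throw std::logic_error("path requested " + std::to_string(requested) +
                               " element(s) but reference object traversal found " +
                               std::to_string(found));
    }
}
```

With a check like this, the case from this ticket (elements requested, none found) would fail loudly instead of silently returning nothing.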
This issue was found by the fuzzer. The base64 encoding of a BSONColumn that triggers this issue is:
8EEAAAASEo+Pj4+Pj4+Pr4+Pj4+Pj4//ChcKAP8A/wD/AP8ABEIACwAAAAr5Ci4KAAASEhISEhIAAAoACgD/AAoACgAKAAoA/wAKABAA6f//AAAA