- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Index Builds, Query Planning
- Query Optimization
How should the MongoDB index format deal with the possibility of an empty array? Consider an example such as the following:
db.c.insert({x: []}); db.c.createIndex({x: 1});
Index version v:2 and all prior index versions generate a key for this document containing the undefined BSON type. This is problematic for two reasons:
- The undefined BSON type has been deprecated for a long time. The internals of the system shouldn't depend on the existence of a deprecated BSON type.
- The index format cannot distinguish between an empty array and a literal undefined value. As a result, certain kinds of predicates cannot be answered from the index keys alone: the query must fetch the document and re-apply the predicate.
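The ambiguity can be illustrated with a toy model of key generation. This is only a sketch of the behavior described above, not the server's actual key generator; the names `tagOf` and `keysV2` are hypothetical, and keys are modeled as tagged strings so that they can be compared directly.

```javascript
// Toy model of v:2 key generation for a single-field index such as {x: 1}.
// An illustrative sketch only; this is not MongoDB's actual key generator.
function tagOf(value) {
  if (value === undefined) return "undefined";
  if (value === null) return "null";
  return typeof value + ":" + JSON.stringify(value);
}

function keysV2(doc, field) {
  // v:2 does not distinguish a missing field from null (see SERVER-12869).
  if (!(field in doc)) return ["null"];
  const value = doc[field];
  if (Array.isArray(value)) {
    // The case this ticket is about: an empty array yields an undefined key.
    if (value.length === 0) return ["undefined"];
    // Non-empty arrays produce one key per element (multikey).
    return value.map(tagOf);
  }
  return [tagOf(value)];
}

const emptyArrayKeys = keysV2({ x: [] }, "x");
const undefinedKeys = keysV2({ x: undefined }, "x");
// Both documents produce the same key, so the key alone cannot tell them apart.
console.log(emptyArrayKeys, undefinedKeys);
```

In this model the two documents are indistinguishable at the key level, which is why a predicate that must separate them needs the fetched document.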
When we introduce the v:3 index format, we should make the new format use an explicit sentinel to represent the empty array case. This is similar to SERVER-12869, a flaw in which v:2 indexes do not distinguish between null and missing values. We need to augment the index format to represent these two special cases outside of the BSON type system. That is, we need special index key values to represent missing and empty array, even though these concepts do not exist in the BSON format itself.
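As a rough sketch of what out-of-band sentinels could look like, the toy model below uses JavaScript symbols to stand in for key values that live outside the BSON type system. The v:3 format is not specified in this ticket, so the sentinel names `MISSING` and `EMPTY_ARRAY` and the function `keysV3` are assumptions made purely for illustration.

```javascript
// Hypothetical v:3 key generation with out-of-band sentinels.
// The sentinel names and this function are assumptions for illustration only.
const MISSING = Symbol("missing");
const EMPTY_ARRAY = Symbol("emptyArray");

function keysV3(doc, field) {
  // A missing field gets its own sentinel instead of being folded into null.
  if (!(field in doc)) return [MISSING];
  const value = doc[field];
  if (Array.isArray(value)) {
    // An empty array gets its own sentinel instead of an undefined key.
    return value.length === 0 ? [EMPTY_ARRAY] : value.slice();
  }
  return [value];
}

// Four documents that v:2 collapses into two keys (null and undefined).
const docs = [{}, { x: null }, { x: undefined }, { x: [] }];
const keys = docs.map((doc) => keysV3(doc, "x")[0]);
const distinct = new Set(keys);
console.log(distinct.size === docs.length); // prints true
```

Because the sentinels are not BSON values, no document field can ever collide with them, which is the property the proposal relies on.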
- related to SERVER-12869 Index null values and missing values differently
- Backlog