SERVER-100683 (Core Server)

Stop using undefined BSON type in indexes to represent empty arrays

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Query Optimization

      How should the MongoDB index format deal with the possibility of an empty array? Consider an example such as the following:

      db.c.insert({x: []});
      db.c.createIndex({x: 1});
      

      Index version v:2 and all prior index versions generate a key for this document containing the undefined BSON type. This is problematic for two reasons:

      • The undefined BSON type has been deprecated for a long time. The internals of the system shouldn't depend on the existence of a deprecated BSON type.
      • The index format cannot distinguish between an empty array and a literal undefined value. For certain kinds of predicates, the index key alone is therefore inconclusive, and the query must fetch the document and re-apply the predicate.
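
The ambiguity in the second point can be modeled in a few lines of JavaScript. This is a toy sketch, not the server's key generator: `v2KeysForField`, `wrap`, and the tagged objects standing in for BSON values are all hypothetical names chosen for illustration, and the model covers only a single-field index.

```javascript
// Toy model of v:2-style key generation for a single-field index.
// Assumption: tagged objects stand in for BSON values; this is NOT server code.
const UNDEFINED = { bsonType: "undefined" };
const NULL = { bsonType: "null" };

// Wrap a plain JavaScript value in a tagged stand-in for a BSON value.
function wrap(v) {
  if (v === undefined) return UNDEFINED;
  if (v === null) return NULL;
  return { bsonType: typeof v, value: v };
}

// Return the list of index keys the toy model generates for doc[field].
function v2KeysForField(doc, field) {
  if (!Object.prototype.hasOwnProperty.call(doc, field)) {
    return [NULL]; // missing field is indexed like null (see SERVER-12869)
  }
  const value = doc[field];
  if (Array.isArray(value)) {
    if (value.length === 0) return [UNDEFINED]; // empty array becomes an undefined key
    return value.map(wrap); // multikey: one key per array element
  }
  return [wrap(value)];
}

const emptyArrayKeys = JSON.stringify(v2KeysForField({ x: [] }, "x"));
const undefinedKeys = JSON.stringify(v2KeysForField({ x: undefined }, "x"));
console.log(emptyArrayKeys === undefinedKeys); // prints true: the keys collide
```

Because both documents map to the same key in this model, a reader of the index cannot tell them apart without looking at the document itself.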

      When we introduce the v:3 index format, we should make the new format use an explicit sentinel to represent the empty array case. This is similar to SERVER-12869, a flaw in which v:2 indexes do not distinguish between null and missing values. We need to augment the index format to represent these two special cases outside of the BSON type system. That is, we need special index key values to represent missing and empty array, even though these concepts do not exist in the BSON format itself.
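
One way to picture the proposal is a key generator that emits dedicated sentinels instead of reusing BSON types. The sketch below is purely hypothetical: no v:3 format is specified in this ticket, and `MISSING`, `EMPTY_ARRAY`, `v3KeysForField`, and `wrapValue` are invented names, with tagged objects standing in for BSON values.

```javascript
// Hypothetical v:3-style key generation with sentinels outside the BSON type system.
// Assumption: the sentinel names and their representation are invented for illustration.
const MISSING = Object.freeze({ sentinel: "missing" });
const EMPTY_ARRAY = Object.freeze({ sentinel: "emptyArray" });

// Wrap a plain JavaScript value in a tagged stand-in for a BSON value.
function wrapValue(v) {
  if (v === undefined) return { bsonType: "undefined" };
  if (v === null) return { bsonType: "null" };
  return { bsonType: typeof v, value: v };
}

// Return the list of index keys the toy model generates for doc[field].
function v3KeysForField(doc, field) {
  if (!Object.prototype.hasOwnProperty.call(doc, field)) {
    return [MISSING]; // missing no longer collapses into null
  }
  const value = doc[field];
  if (Array.isArray(value)) {
    if (value.length === 0) return [EMPTY_ARRAY]; // empty array no longer collapses into undefined
    return value.map(wrapValue); // multikey: one key per array element
  }
  return [wrapValue(value)];
}

// The four special cases now produce four distinct keys.
const docs = [{ x: [] }, { x: undefined }, { x: null }, {}];
const distinct = new Set(docs.map((d) => JSON.stringify(v3KeysForField(d, "x"))));
console.log(distinct.size); // prints 4
```

In this model a predicate on empty arrays, undefined, null, or missing fields could be decided from the key alone, which is the property the ticket asks for.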

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            david.storch@mongodb.com David Storch
            Votes:
            0
            Watchers:
            10
