Core Server / SERVER-94795

Nullish field values cause fields to disappear even in cases where they might be needed

    • Query Optimization
    • v8.0
    • QO 2024-09-16
    • 200

      For example, one of the accepted formats for $arrayToObject expects a document input containing a k and a v field. Something like

      {$arrayToObject: [{k: "f1", v: "$val"}, {k: "f2", v: "$absent"}]}

      might resolve as

      [{k: "f1", v: "a value"}, {k: "f2", v: "sometimes"}]

      and produce (kind of like a projection) something like

      {f1: "a value", f2: "sometimes"}
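
The happy path above can be sketched with a small Python model. This is only an illustration and not MongoDB code: `resolve` and `array_to_object` are hypothetical helpers, and the model assumes that any string starting with `$` is a field path into the input document while everything else is a literal.

```python
# Toy model of the k/v input format of $arrayToObject (not the server implementation).

def resolve(expr, doc):
    """Resolve a "$field" path against doc; treat anything else as a literal."""
    if isinstance(expr, str) and expr.startswith("$"):
        return doc[expr[1:]]
    return expr

def array_to_object(pairs, doc):
    """Build an object from a list of {k, v} pairs after resolving both sides."""
    return {resolve(p["k"], doc): resolve(p["v"], doc) for p in pairs}

doc = {"val": "a value", "absent": "sometimes"}
spec = [{"k": "f1", "v": "$val"}, {"k": "f2", "v": "$absent"}]

result = array_to_object(spec, doc)
print(result)  # {'f1': 'a value', 'f2': 'sometimes'}
```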

      However, if one of the v expressions resolves to a missing value (EOO), the v field is treated as absent from the resolved object, which triggers a uassert. For example, if $val is not present in the document passed into the operator, the input would resolve as

      [{k: "f1", v: EOO}, {k: "f2", v: "sometimes"}]

      The implementation of $arrayToObject validates the preconditions of the $arrayToObject spec, namely that if it is passed an array of objects, each object contains exactly two keys, one k and one v. It does this with two checks:
      1. Check that the object has exactly 2 fields.
      2. Check that the k and v fields aren't missing.

      However, these checks are applied to the resolved objects, so in the case of

      [{k: "f1", v: EOO}, {k: "f2", v: "sometimes"}]

      the first check will fail on

      {k: "f1", v: EOO}

      because v, being of type EOO, is considered logically absent and the object logically only has 1 field: k. The error that the user receives is a uassert with a message indicating that the operator "expects exactly 2 fields but found only 1" – which can be confusing because the user clearly passed in objects with the correct fields.
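
The failure described above can be reproduced in a small Python model. This is a sketch under assumptions, not the server code: `MISSING` stands in for EOO, `resolve_object` and `validate_pair` are hypothetical helpers, the model assumes that a field resolving to a missing value is dropped from the resolved object, and the error strings only approximate the real uassert messages.

```python
# Toy model: validation runs on the RESOLVED object, after missing values vanish.

MISSING = object()  # stand-in for EOO

def resolve(expr, doc):
    """Resolve a "$field" path against doc; an absent field yields MISSING."""
    if isinstance(expr, str) and expr.startswith("$"):
        return doc.get(expr[1:], MISSING)
    return expr

def resolve_object(spec, doc):
    """Resolve every field; fields that resolve to MISSING are not emitted."""
    resolved = {}
    for name, expr in spec.items():
        value = resolve(expr, doc)
        if value is not MISSING:
            resolved[name] = value
    return resolved

def validate_pair(obj):
    """The two checks from the ticket, applied to a resolved object."""
    if len(obj) != 2:
        raise ValueError(f"expects exactly 2 fields but found {len(obj)}")
    if "k" not in obj or "v" not in obj:
        raise ValueError("requires both a k and a v field")

doc = {"absent": "sometimes"}  # note: no "val" field
spec = [{"k": "f1", "v": "$val"}, {"k": "f2", "v": "$absent"}]
resolved = [resolve_object(pair, doc) for pair in spec]

error = None
try:
    for obj in resolved:
        validate_pair(obj)
except ValueError as exc:
    error = str(exc)

print(resolved[0])  # {'k': 'f1'}
print(error)        # expects exactly 2 fields but found 1
```

In this model the user's spec has both k and v in every pair, yet the first check fails because it only ever sees the resolved object.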

      There are several possible ways to address this problem, including the confusing error message.
      1. Keep the current behavior and update the error message to indicate that the v argument resolved to a missing value. Update the documentation to note that a v argument resolving to a missing value is undefined behavior.
      2. Accept logically missing v values (as is the case here), reject physically missing v values (as would be the case if the user forgot the v parameter entirely), and propagate the missing value, e.g. produce {f1: EOO, f2: "sometimes"}.
      3. Accept logically missing v values (as is the case here), reject physically missing v values, and drop the missing fields from the produced object, e.g. produce {f2: "sometimes"}.
      4. Replace logically missing v values (as is the case here) with null and reject physically missing v values.
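
The four options above can be compared side by side in a toy Python model. This is a sketch only: `MISSING` stands in for EOO, the `policy` names (`"error"`, `"propagate"`, `"drop"`, `"null"`) are hypothetical labels for options 1 through 4, and the model assumes the spec is still available at validation time so that a physically missing v (key not written by the user) can be told apart from a logically missing one (key present, value resolves to nothing).

```python
# Toy comparison of the four candidate behaviors (not the server implementation).

MISSING = object()  # stand-in for EOO

def resolve(expr, doc):
    """Resolve a "$field" path against doc; an absent field yields MISSING."""
    if isinstance(expr, str) and expr.startswith("$"):
        return doc.get(expr[1:], MISSING)
    return expr

def array_to_object(pairs, doc, policy):
    out = {}
    for pair in pairs:
        # Physically missing k or v: rejected under every option.
        if "k" not in pair or "v" not in pair:
            raise ValueError("each object must have a k and a v field")
        key = resolve(pair["k"], doc)
        value = resolve(pair["v"], doc)
        if value is MISSING:  # logically missing v
            if policy == "error":        # option 1: clearer error message
                raise ValueError(f"v for key {key!r} resolved to a missing value")
            elif policy == "drop":       # option 3: omit the field
                continue
            elif policy == "null":       # option 4: substitute null
                value = None
            elif policy != "propagate":  # option 2 keeps the MISSING marker
                raise ValueError(f"unknown policy: {policy}")
        out[key] = value
    return out

doc = {"absent": "sometimes"}  # no "val" field
spec = [{"k": "f1", "v": "$val"}, {"k": "f2", "v": "$absent"}]

results = {p: array_to_object(spec, doc, p) for p in ("propagate", "drop", "null")}
print(results["drop"])  # {'f2': 'sometimes'}
print(results["null"])  # {'f1': None, 'f2': 'sometimes'}
```

Under `"propagate"` the result maps f1 to the `MISSING` marker, which mirrors {f1: EOO, f2: "sometimes"} in option 2. A pair written without a v key, such as `{"k": "f1"}`, raises under every policy.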

      The linked PR (subsequently reverted) implements #2 above for the classic executor, but I could not figure out how to make a similar change for the SBE executor: specifically, I could not figure out how to distinguish between physically missing and logically missing fields in an object. We should determine which of the above we want to go with long term and how to implement it correctly everywhere, if needed.

            Assignee:
            Unassigned
            Reporter:
            william.qian@mongodb.com William Qian