Core Server / SERVER-49943

[SBE] Fix edge case bug with 'find({"a.b.c": ..})'

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.9.0
    • Affects Version/s: None
    • Component/s: Querying
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Steps To Reproduce:
      > db.c.find({}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      { "a" : [ { "b" : [ ] }, { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: false})
      { "was" : false, "ok" : 1 }
      > db.c.find({"a.b.c":  {$eq: 5}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      { "a" : [ { "b" : [ ] }, { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.c.find({"a.b.c":  {$in: [5]}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      { "a" : [ { "b" : [ ] }, { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.c.find({"a.b.c":  {$size: 2}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      { "a" : [ { "b" : [ ] }, { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.c.find({"a.b.c":  {$elemMatch: {$eq: 5}}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      { "a" : [ { "b" : [ ] }, { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: true})
      { "was" : false, "ok" : 1 }
      > db.c.find({"a.b.c":  {$eq: 5}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.c.find({"a.b.c":  {$in: [5]}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.c.find({"a.b.c":  {$size: 2}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
      > db.c.find({"a.b.c":  {$elemMatch: {$eq: 5}}}, {_id: 0})
      { "a" : [ { "b" : [ { "c" : [5, 7] } ] } ] }
    • Sprint: Query 2020-11-30

      While doing some manual testing with SBE mode enabled, I encountered an edge case involving dot notation where find() doesn't always match a document when it should.

      See "Steps To Reproduce" for a specific example of the bug I encountered.

      Upon investigation, I found that despite all the recent work (SERVER-49686, SERVER-49723, SERVER-49819) that added fillEmpty() calls in various places in "sbe_stage_builder_filter.cpp", it appears there are still situations where "Nothing" can wreak havoc.

      In the example shown in "Steps To Reproduce", the second document in collection c has an array field 'a' that contains two objects, where the first object has a field 'b' whose value is the empty array, and where the second object has an array field 'b' containing an object with a field 'c'. The command 'find({"a.b.c": 2})' should match the second document, but it doesn't because of an issue with the outermost TraverseStage and the TraverseStage at the second level of nesting.
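
      To make the expected result concrete, here is a minimal sketch of dotted-path matching with implicit array traversal, written as plain JavaScript. It is a simplified model and not the server's matcher: the helper names (isPlainObject, matchesPath) are hypothetical, it only handles an equality-style predicate, and it ignores numeric path components. Under those assumptions it agrees with the classic-engine output in "Steps To Reproduce", where both documents match.

```javascript
// Simplified reference model of dotted-path matching (illustrative only).
const isPlainObject = (v) =>
  v !== null && typeof v === "object" && !Array.isArray(v);

function matchesPath(value, parts, pred) {
  if (parts.length === 0) {
    // Leaf: test the value itself and, if it is an array, each element.
    return pred(value) || (Array.isArray(value) && value.some(pred));
  }
  if (Array.isArray(value)) {
    // Implicit traversal: descend into object elements of the array.
    // An empty array contributes "no match", not an error.
    return value.some((el) => isPlainObject(el) && matchesPath(el, parts, pred));
  }
  if (isPlainObject(value)) {
    return matchesPath(value[parts[0]], parts.slice(1), pred);
  }
  return false; // scalar or missing value in the middle of the path
}

const matches = (doc, path, target) =>
  matchesPath(doc, path.split("."), (v) => v === target);

// The two documents from "Steps To Reproduce".
const doc1 = { a: [{ b: [{ c: [5, 7] }] }] };
const doc2 = { a: [{ b: [] }, { b: [{ c: [5, 7] }] }] };

console.log(matches(doc1, "a.b.c", 5), matches(doc2, "a.b.c", 5)); // true true
```

      The empty "b" array in the second document simply fails to match on its own; it should not prevent the sibling element from matching.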

      Here is the SBE plan we currently generate for 'find({"a.b.c": 2})':

       

      filter {s11} 
      traverse s11 s10 s4 {s11 || s10} {s11} 
      in 
          traverse s10 s9 s5 {s10 || s9} {s10} 
          in 
              project [s9 = fillEmpty (s8, false) || fillEmpty (isArray (s6), false) && fillEmpty (s6 == 2, false)] 
              traverse s8 s7 s6 {s8 || s7} {s8} 
              in 
                  project [s7 = fillEmpty (s6 == 2, false)] 
                  limit 1 
                  coscan 
              from 
                  project [s6 = getField (s5, "c")] 
                  limit 1 
                  coscan 
          from 
              project [s5 = getField (s4, "b")] 
              limit 1 
              coscan 
      from 
          project [s4 = getField (s1, "a")] 
          scan s1 s2 [] @"f1259d4d-36c2-486b-a796-4a0df6ef1832" 

       

      The "from" clause of the TraverseStage at the second level of nesting is 's5 = getField (s4, "b")'. When processing the second document in collection c, this TraverseStage gets invoked twice. For the first invocation, field "b"'s value is the empty array, so the TraverseStage's fold expression is never executed and the TraverseStage produces Nothing. For the second invocation, field "b"'s value is an array containing an object with a field "c" that holds a matching value, so the TraverseStage produces Boolean true (because the innermost TraverseStage will visit field "c", compare 2==2, and produce Boolean true). When the outermost TraverseStage applies its fold expression ('s11 || s10'), it ORs together Nothing and Boolean true, which produces Nothing, and this causes the second document in collection c to not match (even though it should).
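
      The failure described above can be reproduced with a small JavaScript model of the plan. This is a sketch under stated assumptions, not server code: Nothing is modelled as a sentinel, '||' is assumed to return Nothing whenever its left operand is Nothing, the traverse helper seeds its accumulator from the first element and returns Nothing for an empty array, and the early-exit ("final") expression is omitted. It uses the documents from "Steps To Reproduce" and the value 5 from those queries.

```javascript
// Simplified model of the three nested TraverseStages (illustrative only).
const Nothing = Symbol("Nothing");
const fillEmpty = (v, dflt) => (v === Nothing ? dflt : v);

// Assumed semantics of '||': a Nothing left operand yields Nothing.
const or = (lhs, rhs) => (lhs === Nothing ? Nothing : lhs === true ? true : rhs);

const getField = (o, name) =>
  o !== null && typeof o === "object" && !Array.isArray(o) &&
  Object.prototype.hasOwnProperty.call(o, name) ? o[name] : Nothing;

function traverse(input, inner, fold) {
  if (!Array.isArray(input)) return inner(input);
  let acc = Nothing; // stays Nothing when the array is empty
  let first = true;
  for (const el of input) {
    const r = inner(el);
    acc = first ? r : fold(acc, r); // fold only runs from the second element on
    first = false;
  }
  return acc;
}

function evalPlan(doc, target) {
  const leaf = (v) => fillEmpty(v === target, false);
  const levelC = (y) => traverse(getField(y, "c"), leaf, or);
  const levelB = (x) => traverse(getField(x, "b"), levelC, or);
  return traverse(getField(doc, "a"), levelB, or);
}

// The filter only passes documents whose result is Boolean true.
const passesFilter = (r) => r === true;

const docs = [
  { a: [{ b: [{ c: [5, 7] }] }] },
  { a: [{ b: [] }, { b: [{ c: [5, 7] }] }] },
];
const results = docs.map((d) => evalPlan(d, 5));
console.log(results.map(passesFilter)); // [ true, false ]
```

      In this model the second document evaluates to Nothing because the empty "b" array seeds the outer accumulator with Nothing, and the later Boolean true cannot recover from it.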

      Three possible ways to fix this that come to mind:
      1) Change the fold expression passed to TraverseStage to use fillEmpty()
      2) Change how TraverseStage's folding works. If TraverseStage gave us a way to specify an "initial" value, we could avoid producing Nothing in the empty array case. (For an example of folding with an "initial value", take a look at how Haskell's foldr function works: https://wiki.haskell.org/Data.Foldable.foldr ).
      3) Wrap each TraverseStage with a ProjectStage that will convert Nothing to Boolean False.
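
      The three options can be compared side by side in the same kind of JavaScript model. This is only a sketch: the 'initial' parameter on traverse and the 'wrap' hook are hypothetical stand-ins for options 2 and 3, the '||' semantics are assumed to propagate a Nothing left operand, and none of this reflects how the fix was actually implemented.

```javascript
// Illustrative model of the three proposed fixes (not server code).
const Nothing = Symbol("Nothing");
const fillEmpty = (v, dflt) => (v === Nothing ? dflt : v);
const or = (lhs, rhs) => (lhs === Nothing ? Nothing : lhs === true ? true : rhs);

const getField = (o, name) =>
  o !== null && typeof o === "object" && !Array.isArray(o) &&
  Object.prototype.hasOwnProperty.call(o, name) ? o[name] : Nothing;

// 'initial' is a hypothetical extension: when it is not Nothing, the fold
// starts from it, so an empty array yields 'initial' instead of Nothing.
function traverse(input, inner, fold, initial = Nothing) {
  if (!Array.isArray(input)) return inner(input);
  let acc = initial;
  let seeded = initial !== Nothing;
  for (const el of input) {
    const r = inner(el);
    acc = seeded ? fold(acc, r) : r;
    seeded = true;
  }
  return acc;
}

function evalPlan(doc, target, { fold = or, initial = Nothing, wrap = (v) => v } = {}) {
  const leaf = (v) => fillEmpty(v === target, false);
  const levelC = (y) => wrap(traverse(getField(y, "c"), leaf, fold, initial));
  const levelB = (x) => wrap(traverse(getField(x, "b"), levelC, fold, initial));
  return wrap(traverse(getField(doc, "a"), levelB, fold, initial));
}

const doc2 = { a: [{ b: [] }, { b: [{ c: [5, 7] }] }] };

// Current behaviour: the empty "b" array poisons the outer fold.
const current = evalPlan(doc2, 5);
// Option 1: fillEmpty() inside the fold expression.
const option1 = evalPlan(doc2, 5, {
  fold: (acc, r) => or(fillEmpty(acc, false), fillEmpty(r, false)),
});
// Option 2: fold with an initial value of false.
const option2 = evalPlan(doc2, 5, { initial: false });
// Option 3: wrap each traverse result, converting Nothing to false.
const option3 = evalPlan(doc2, 5, { wrap: (v) => fillEmpty(v, false) });

console.log(current === Nothing, option1, option2, option3); // true true true true
```

      One difference visible in this model: with option 1 a traverse over an empty array still produces Nothing (only the fold is protected), whereas options 2 and 3 produce Boolean false in that case.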

            Assignee:
            melodee.li@mongodb.com Melodee Li
            Reporter:
            andrew.paroski@mongodb.com Drew Paroski
            Votes:
            0
            Watchers:
            4
