SERVER-14766 (Core Server)

Indexed queries should not miss documents where neither the queried nor indexed fields change during the life of the query.

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: MMAPv1, Querying
    • None
    • Query
    • ALL

      Example/Repro

      function repro() {
      
          var k2="";
          for(i=0;i<2048;i++){
              k2 = k2+"_"
          }
      
          db.c.drop()
          // insert a "large" document
          db.c.insert({_id:-1, letter:'a',pad:k2})
          // insert 5 records
          for (var i=0; i<5; i++)
              db.c.insert({_id:i, letter:'a'})
          db.c.ensureIndex({letter:1})
      
          // remove the large document to free up space at the beginning
          db.c.remove({_id:-1})
      
          // start query, fetch first batch of 2
          var cursor = db.c.find().sort({letter:1}).batchSize(2)
          print('got', cursor.next()._id)
      
          // server cursor is now pointing to {_id:2} waiting for our getmore
          // Increase the size of {_id:3} so it moves back (to where {_id:-1} was)
          // any document with _id >2 will work in this repro
          db.c.update({_id:3}, {$set: {pad:k2}})
      
          // use our cursor to get the rest; note that {_id:3} is omitted
          while (cursor.hasNext())
              print('got', cursor.next()._id)
      }
      

      Description

      This behavior is only observable in the MMAPv1 storage engine.

      Desired Behavior

      If an indexed query runs while documents are being updated in a way that moves them, those documents can be missing from the results when using the MMAPv1 storage engine. We would like this behavior to change so that all matching documents that exist throughout the lifetime of the query are returned, even if they are updated. In particular, we expect this only when the updates leave the queried and indexed fields unchanged, so that the query matches the document in every updated state.
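      The desired behavior can be phrased as a simple check on the result set. The sketch below is plain JavaScript and does not talk to a server; `missingIds` is a hypothetical helper introduced here for illustration, and the id lists are taken from the repro above.

```javascript
// Hypothetical helper: returns the _id values that were expected
// (matching for the whole lifetime of the query) but were not returned.
function missingIds(expectedIds, returnedIds) {
    return expectedIds.filter(function (id) {
        return returnedIds.indexOf(id) === -1;
    });
}

// Documents 0..4 match {letter: 'a'} for the whole query.
// The repro on MMAPv1 returns 0, 1, 2 and 4.
var missing = missingIds([0, 1, 2, 3, 4], [0, 1, 2, 4]);

// Desired behavior: `missing` is empty.
// Behavior reported in this ticket: `missing` contains 3.
```

      With the behavior requested here, the same check would yield an empty array for any update that leaves `letter` unchanged.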

      Example

      See the repro code above.

      • Add a large document, followed by 5 small documents all with {letter: "a"}
      • Add an index on {letter: 1}
      • Remove the large document
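      • Start a query with batch size 2 that uses the index
      • Update the document {_id: 3} so that it grows and moves into the empty space left by the removed large document
      • Fetch the remaining results with the same cursor; {_id: 3} is missing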

      Technical Details

      In MMAPv1 the document location is stored in the index. While a query walks an index and its cursor moves forward, an update can therefore move a document to a position behind the cursor's current position, and the document is "missed".

      This behavior cannot be reproduced in WiredTiger, inMemory or encrypted storage engines.
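
      The MMAPv1 mechanism described above can be sketched with a toy model in plain JavaScript. This is not the server's implementation: it assumes index entries are ordered by (key, location), uses a hypothetical integer `loc` in place of a real on-disk location, and all function names (`compareEntries`, `makeIndex`, `nextAfter`, `simulate`) are made up for illustration.

```javascript
// Toy model (assumption): index entries are ordered by (key, loc), where
// loc is a hypothetical integer standing in for the record's disk location.
function compareEntries(a, b) {
    if (a.key !== b.key) return a.key < b.key ? -1 : 1;
    return a.loc - b.loc;
}

// Build a sorted list of index entries on {letter: 1}.
function makeIndex(docs) {
    return docs.map(function (d) {
        return {key: d.letter, loc: d.loc, id: d._id};
    }).sort(compareEntries);
}

// Return the first entry strictly after `pos`, or null at the end.
function nextAfter(index, pos) {
    for (var i = 0; i < index.length; i++) {
        if (pos === null || compareEntries(index[i], pos) > 0) return index[i];
    }
    return null;
}

function simulate() {
    // Location 0 plays the role of the space freed by the removed large
    // document; the five small documents sit at locations 100..104.
    var docs = [];
    for (var i = 0; i < 5; i++) docs.push({_id: i, letter: 'a', loc: 100 + i});
    var index = makeIndex(docs);
    var seen = [];
    var pos = null;

    // First batch of 2: the cursor advances over two entries.
    for (var n = 0; n < 2; n++) {
        pos = nextAfter(index, pos);
        seen.push(pos.id);
    }

    // The update grows {_id: 3}, so it moves to location 0 and its
    // index entry is rewritten with the new location.
    docs[3].loc = 0;
    index = makeIndex(docs);

    // Remaining batches: continue from the saved cursor position.
    var entry;
    while ((entry = nextAfter(index, pos)) !== null) {
        pos = entry;
        seen.push(entry.id);
    }
    return seen;
}
```

      In this model `simulate()` returns the ids 0, 1, 2 and 4: the entry for {_id: 3} now sorts before the saved cursor position, so the forward walk never reaches it, even though `letter` never changed.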

            Assignee:
            backlog-server-query Backlog - Query Team (Inactive)
            Reporter:
            alan.spencer Alan Spencer
            Votes:
            7
            Watchers:
            30

              Created:
              Updated:
              Resolved: