Core Server / SERVER-87430

Find query with skip produces incorrect results when sorted on missing field

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Execution
    • ALL
    • QE 2024-04-01

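      A find query with "skip" produces incorrect results when sorted on a missing (non-existent) field: fetching documents one at a time with increasing skip values returns some documents repeatedly and omits others. For example, on MongoDB 4.4+ both find and aggregate produce incorrect results: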

      $ python repro2748.py
      MongoDB version: 4.4.19
      Find docs with a single query:
      [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
      Find docs with the same query with skip+limit:
      [0, 1, 1, 3, 3, 3, 3, 7, 7, 7, 7, 7, 7, 7, 7, 15, 15, 15, 15, 15]
      Find docs with the aggregation with skip+limit:
      [0, 1, 1, 3, 3, 3, 3, 7, 7, 7, 7, 7, 7, 7, 7, 15, 15, 15, 15, 15]
      

      On MongoDB <=4.2 find works correctly but aggregate still produces incorrect results:

      $ python repro2748.py
      MongoDB version: 4.2.24
      Find docs with a single query:
      [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
      Find docs with the same query with skip+limit:
      [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
      Find docs with the aggregation with skip+limit:
      [0, 1, 1, 3, 3, 3, 3, 7, 7, 7, 7, 7, 7, 7, 7, 15, 15, 15, 15, 15]
      

      Repro code:

      from pymongo import MongoClient
      
      client = MongoClient()
      coll = client.test.test
      version = client.server_info()['version']
      print(f'MongoDB version: {version}')
      
      coll.drop()
      coll.insert_many([{"_id": i} for i in range(20)])
      
      print('Find docs with a single query:')
      print([doc["_id"] for doc in coll.find(sort={'missing': 1})])
      
      print('Find docs with the same query with skip+limit:')
      docs = []
      for i in range(20):
          docs.append(coll.find_one(sort={'missing': 1}, skip=i))
      print([doc["_id"] for doc in docs])
      
      print('Find docs with the aggregation with skip+limit:')
      docs = []
      for i in range(20):
          docs.append(list(coll.aggregate([{"$sort": {'missing': 1}}, {"$skip": i}, {"$limit": 1}]))[0])
      print([doc["_id"] for doc in docs])
      

      Note this was originally reported via a MongoEngine issue here: https://github.com/MongoEngine/mongoengine/issues/2748
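
The "Works as Designed" resolution is consistent with MongoDB's documented sort behavior: when documents tie on the sort key (here every document lacks `missing`), their relative order is not guaranteed, and each skip query is sorted independently. The sketch below shows one way to get a deterministic order, by appending the unique `_id` field as a tie-breaker. The helper names `with_tiebreaker` and `ids_one_at_a_time` are hypothetical and not part of PyMongo; `coll` is assumed to be a PyMongo collection like the one in the repro.

```python
def with_tiebreaker(sort_spec, unique_field="_id", direction=1):
    """Return sort_spec as a list of (field, direction) pairs that ends
    with a unique field, so the sort defines a total order."""
    # Accept either a dict ({'missing': 1}) or a list of pairs.
    pairs = list(sort_spec.items()) if isinstance(sort_spec, dict) else list(sort_spec)
    # Only append the tie-breaker if the unique field is not already a sort key.
    if all(field != unique_field for field, _ in pairs):
        pairs.append((unique_field, direction))
    return pairs


def ids_one_at_a_time(coll, sort_spec, count):
    """Fetch `count` documents with one query per document (skip=i),
    mirroring the repro but with a tie-broken sort."""
    sort = with_tiebreaker(sort_spec)
    return [coll.find_one(sort=sort, skip=i)["_id"] for i in range(count)]
```

With the repro's collection, `ids_one_at_a_time(coll, {'missing': 1}, 20)` should return each `_id` exactly once, because the pair (`missing`, `_id`) orders every document unambiguously. The same idea applies to the aggregation form by using `{"$sort": {"missing": 1, "_id": 1}}`.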

            Assignee: Ivan Fefer (ivan.fefer@mongodb.com)
            Reporter: Shane Harvey (shane.harvey@mongodb.com)
            Votes: 0
            Watchers: 9
