Python Driver / PYTHON-4549

Optimize Cursor.to_list

    • Type: Improvement
    • Resolution: Done
    • Priority: Unknown
    • Affects Version/s: None
    • Component/s: None
    • Python Drivers

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Optimize Cursor.to_list.

      Context

      Cursor.to_list is currently implemented like this:

          async def to_list(self) -> list[_DocumentType]:
              return [x async for x in self] 
      

      And:

          def to_list(self) -> list[_DocumentType]:
              return [x for x in self]
      

      This is expensive, especially in the async case, because every document in the cursor requires its own iteration step (and its own await).

      It should be more efficient to build the list one batch at a time, something like this:

          async def to_list(self) -> list[_DocumentType]:
              res = []
              while self.alive:
                  res.extend(await self._next_batch())
              return res
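
To make the difference concrete, here is a minimal, self-contained sketch that runs both strategies against an in-memory stand-in. `FakeAsyncCursor`, its counters, and its `_next_batch` helper are hypothetical and exist only for this illustration; they are not the driver's real classes, and `asyncio.sleep(0)` merely stands in for the point where real code would suspend.

```python
import asyncio
from typing import Any


class FakeAsyncCursor:
    """Hypothetical in-memory stand-in for a cursor (not the driver's class)."""

    def __init__(self, docs: list[dict[str, Any]], batch_size: int) -> None:
        self._docs = docs
        self._batch_size = batch_size
        self._pos = 0
        self.docs_awaited = 0  # documents returned by __anext__, one await each
        self.batch_calls = 0   # batches returned by _next_batch, one await each

    @property
    def alive(self) -> bool:
        # True while there are documents left to return.
        return self._pos < len(self._docs)

    async def _next_batch(self) -> list[dict[str, Any]]:
        # Assumed helper: hand back a whole batch for a single await.
        await asyncio.sleep(0)
        self.batch_calls += 1
        batch = self._docs[self._pos : self._pos + self._batch_size]
        self._pos += len(batch)
        return batch

    def __aiter__(self) -> "FakeAsyncCursor":
        return self

    async def __anext__(self) -> dict[str, Any]:
        if not self.alive:
            raise StopAsyncIteration
        await asyncio.sleep(0)
        self.docs_awaited += 1
        doc = self._docs[self._pos]
        self._pos += 1
        return doc

    async def to_list_per_document(self) -> list[dict[str, Any]]:
        # Shape of the current implementation: one await per document.
        return [x async for x in self]

    async def to_list_per_batch(self) -> list[dict[str, Any]]:
        # Shape of the proposed implementation: one await per batch.
        res: list[dict[str, Any]] = []
        while self.alive:
            res.extend(await self._next_batch())
        return res


if __name__ == "__main__":
    docs = [{"_id": i} for i in range(10)]
    a = FakeAsyncCursor(docs, batch_size=4)
    b = FakeAsyncCursor(docs, batch_size=4)
    assert asyncio.run(a.to_list_per_document()) == asyncio.run(b.to_list_per_batch())
    print(a.docs_awaited, b.batch_calls)
```

Both methods return the same list; the counters only show how many times each strategy had to suspend to produce it.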
      

      Definition of done

      Implement iteration via batches and benchmark the improvement.
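
One possible shape for the benchmark is sketched below. It is only a toy harness under stated assumptions: `make_batches`, `fetch_batch`, `per_document`, `per_batch`, and `benchmark` are hypothetical names, the batches live in memory, and `asyncio.sleep(0)` replaces the network round trip, so the timings it reports say nothing about the real driver. A real benchmark would run against an actual server and cursor.

```python
import asyncio
import time


def make_batches(n_docs: int, batch_size: int) -> list[list[int]]:
    # Split range(n_docs) into consecutive batches of at most batch_size.
    return [
        list(range(i, min(i + batch_size, n_docs)))
        for i in range(0, n_docs, batch_size)
    ]


async def fetch_batch(batch: list[int]) -> list[int]:
    # Stand-in for a round trip: yield to the event loop once per batch.
    await asyncio.sleep(0)
    return batch


async def per_document(batches: list[list[int]]) -> list[int]:
    # Models per-document iteration: the async generator is resumed
    # once for every document it yields.
    async def docs():
        for batch in batches:
            for doc in await fetch_batch(batch):
                yield doc

    return [x async for x in docs()]


async def per_batch(batches: list[list[int]]) -> list[int]:
    # Models the proposed approach: extend the result once per batch.
    res: list[int] = []
    for batch in batches:
        res.extend(await fetch_batch(batch))
    return res


async def benchmark(
    n_docs: int = 100_000, batch_size: int = 1_000, repeats: int = 5
) -> dict[str, float]:
    # Report the best wall-clock time of `repeats` runs for each strategy.
    batches = make_batches(n_docs, batch_size)
    timings: dict[str, float] = {}
    for fn in (per_document, per_batch):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            await fn(batches)
            best = min(best, time.perf_counter() - start)
        timings[fn.__name__] = best
    return timings


if __name__ == "__main__":
    for name, seconds in asyncio.run(benchmark()).items():
        print(f"{name}: {seconds:.6f}s")
```

Taking the minimum over several runs is a common way to reduce noise from the scheduler and other processes; the document count and batch size are arbitrary defaults chosen for the sketch.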

      Note: if we want to make the batch iteration API public, we should open a new ticket.

            Assignee: Noah Stapp (noah.stapp@mongodb.com)
            Reporter: Shane Harvey (shane.harvey@mongodb.com)
            Votes: 0
            Watchers: 3
