  Core Server / SERVER-98657

Parallelize listCollections calls under $listClusterCatalog

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing

      The $_internalListCollections implementation works in the following way:

      1. First, all the databases of the cluster are fetched.
      2. Then, a listCollections command is executed for each database.

      Right now, the listCollections calls run serially, one after the other.
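
      As a rough illustration of the serial flow described above, here is a minimal Python sketch. It is not the server code: `FAKE_CATALOG`, `list_databases`, and `list_collections` are hypothetical in-memory stand-ins for the real cluster calls.

```python
from typing import Dict, Iterator, List

# Hypothetical in-memory stand-in for the cluster: database -> collection names.
FAKE_CATALOG: Dict[str, List[str]] = {
    "app": ["users", "orders"],
    "logs": ["events"],
}


def list_databases() -> List[str]:
    """Stand-in for step 1: fetch every database of the cluster."""
    return sorted(FAKE_CATALOG)


def list_collections(db: str) -> Iterator[dict]:
    """Stand-in for listCollections: entries carry no database name."""
    for coll in FAKE_CATALOG[db]:
        yield {"name": coll, "type": "collection"}


def list_cluster_collections_serial() -> List[str]:
    """Step 2, run serially: one listCollections call per database."""
    namespaces: List[str] = []
    for db in list_databases():
        # The call for the next database starts only after this one is drained.
        for entry in list_collections(db):
            namespaces.append(f"{db}.{entry['name']}")
    return namespaces
```

      In this shape the database name is always at hand, because only one call is active at any time; the cost is that total latency grows with the number of databases.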

      This can be improved by opening one cursor per listCollections call during the first iteration of the aggregation stage, so that all the calls are issued up front; subsequent iterations then only need to consume those cursors.
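
      The "open every cursor on the first iteration, then only consume" idea might look roughly like the Python sketch below. The class `ParallelListCollectionsStage`, the helper `open_list_collections_cursor`, and the in-memory `FAKE_CATALOG` are assumptions made for illustration; a thread pool stands in for whatever concurrency mechanism the server would actually use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List, Optional

# Hypothetical in-memory stand-in for the cluster: database -> collection names.
FAKE_CATALOG: Dict[str, List[str]] = {
    "app": ["users", "orders"],
    "logs": ["events"],
}


def open_list_collections_cursor(db: str) -> Iterator[dict]:
    """Hypothetical stand-in for issuing listCollections and getting a cursor."""
    return iter([{"name": coll} for coll in FAKE_CATALOG[db]])


class ParallelListCollectionsStage:
    """Opens all cursors on the first get_next() call, then drains them."""

    def __init__(self, databases: List[str]) -> None:
        self._databases = databases
        self._cursors: Optional[List[Iterator[dict]]] = None

    def get_next(self) -> Optional[dict]:
        if self._cursors is None:
            # First iteration: issue every listCollections call concurrently.
            workers = max(1, len(self._databases))
            with ThreadPoolExecutor(max_workers=workers) as pool:
                self._cursors = list(
                    pool.map(open_list_collections_cursor, self._databases)
                )
        # Later iterations: only consume cursors that are already open.
        while self._cursors:
            entry = next(self._cursors[0], None)
            if entry is not None:
                return entry
            self._cursors.pop(0)  # This cursor is exhausted; move on.
        return None  # Every cursor has been drained.
```

      Note that the entries returned here are the raw listCollections results, so they carry only the collection name and not the database.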

      This was part of the original implementation plan for $_internalListCollections, but it was dropped because there was no easy way to know which database a given cursor belonged to. Knowing a cursor's database is crucial, since the collections returned by listCollections do not include the database in their names.
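
      One possible way to keep the database associated with its cursor is to store the two together at the moment the cursor is opened. The Python sketch below is only an illustration of that idea under assumed names (`open_tagged_cursors`, `qualified_namespaces`, `open_list_collections_cursor`, and the in-memory `FAKE_CATALOG` are all hypothetical); it does not reflect how the server represents cursors.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-in for the cluster: database -> collection names.
FAKE_CATALOG: Dict[str, List[str]] = {
    "app": ["users", "orders"],
    "logs": ["events"],
}


def open_list_collections_cursor(db: str) -> Iterator[dict]:
    """Hypothetical stand-in for issuing listCollections and getting a cursor."""
    return iter([{"name": coll} for coll in FAKE_CATALOG[db]])


def open_tagged_cursors(databases: List[str]) -> List[Tuple[str, Iterator[dict]]]:
    """Open one cursor per database and remember which database owns it."""
    if not databases:
        return []
    with ThreadPoolExecutor(max_workers=len(databases)) as pool:
        # The future -> database map survives out-of-order completion.
        owner = {
            pool.submit(open_list_collections_cursor, db): db for db in databases
        }
        return [(owner[fut], fut.result()) for fut in as_completed(owner)]


def qualified_namespaces(databases: List[str]) -> List[str]:
    """Prefix each collection name with the database of the cursor it came from."""
    return [
        f"{db}.{entry['name']}"
        for db, cursor in open_tagged_cursors(databases)
        for entry in cursor
    ]
```

      Because cursors may become ready in any order, the order of the resulting namespaces is not guaranteed in this sketch; only the database-to-cursor pairing is.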

      The goal of this ticket is to investigate and implement a way to parallelize all the listCollections calls executed by the $_internalListCollections aggregation stage.

            Assignee: Unassigned
            Reporter: Silvia Surroca (silvia.surroca@mongodb.com)
            Votes: 0
            Watchers: 4
