Core Server / SERVER-58720

DropDatabaseCoordinator must not re-execute destructive logic after removing CSRS metadata

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.0.3, 5.1.0-rc0
    • Affects Version/s: 5.0.2
    • Component/s: Sharding
    • None
    • Fully Compatible
    • ALL
    • v5.0
    • Sprint: Sharding EMEA 2021-07-26, Sharding EMEA 2021-08-09
    • 171

      The removal of a database entry from the config server marks the logical end of a drop database operation, but not the end of the coordinator's lifetime.

      If a stepdown happens before the coordinator is released, the coordinator is resumed on the next step-up and re-executes all the dropDatabase logic, even though the database may have been recreated on a different primary shard in the meantime.

      As a result, a dropDatabase followed by a createCollection (re-creating the db) may result in data loss.

      This cannot be dismissed as the user recreating the collection while dropDatabase was still running, because the following interleaving can happen:

      • The user issues dropDatabase
      • Stepdown happens right after removing CSRS metadata
      • The user receives an error
      • The user retries dropDatabase, which succeeds because the router doesn't find the database on the CSRS
      • The user recreates a collection on the same database, with a different primary shard, and starts using it
      • Step-up happens, the coordinator is resumed, and it removes legitimate data from the recreated database
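
      The interleaving above can be reproduced with a small toy model. This is only an illustrative sketch, not the server code or the actual fix: the `Cluster` class, the `metadata_removed` flag, the `guarded` switch, and the `simulate` helper are all hypothetical names introduced here. It assumes the coordinator's state survives a stepdown and that a guarded coordinator refuses to redo destructive steps once it has recorded that the CSRS metadata was removed.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model: config metadata plus per-shard data for each database."""
    config: dict = field(default_factory=dict)  # db name -> primary shard
    shards: dict = field(default_factory=dict)  # shard -> {db name -> docs}

    def create_collection(self, db, shard, docs):
        # Creating a collection implicitly (re)creates the database entry.
        self.config[db] = shard
        self.shards.setdefault(shard, {})[db] = list(docs)


@dataclass
class DropDatabaseCoordinator:
    """Hypothetical coordinator whose state survives stepdowns."""
    db: str
    metadata_removed: bool = False  # set once the config entry is gone
    released: bool = False

    def run(self, cluster, guarded, stepdown_after_metadata_removal=False):
        if guarded and self.metadata_removed:
            # Logical end already reached: skip every destructive step.
            self.released = True
            return
        # Destructive phase: drop the database's data on every shard.
        for dbs in cluster.shards.values():
            dbs.pop(self.db, None)
        cluster.config.pop(self.db, None)
        self.metadata_removed = True
        if stepdown_after_metadata_removal:
            raise RuntimeError("stepdown before the coordinator was released")
        self.released = True


def simulate(guarded):
    cluster = Cluster()
    cluster.create_collection("app", "shard0", ["old"])
    coordinator = DropDatabaseCoordinator("app")
    try:
        # The user issues dropDatabase; stepdown hits after metadata removal.
        coordinator.run(cluster, guarded, stepdown_after_metadata_removal=True)
    except RuntimeError:
        pass  # the user receives an error
    # The retry succeeds trivially: the router finds no config entry.
    assert "app" not in cluster.config
    # The user recreates the database on a different primary shard.
    cluster.create_collection("app", "shard1", ["new"])
    # Step-up: the unreleased coordinator is resumed.
    coordinator.run(cluster, guarded)
    return cluster.shards["shard1"].get("app")
```

      In this model, `simulate(guarded=False)` returns `None` because the resumed coordinator wipes the recreated database, while `simulate(guarded=True)` returns `["new"]` because the resumed coordinator only finishes its bookkeeping.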

            Assignee:
            simon.gratzer@mongodb.com Simon Gratzer (Inactive)
            Reporter:
            pierlauro.sciarelli@mongodb.com Pierlauro Sciarelli
            Votes:
            0
            Watchers:
            5