  1. Core Server
  2. SERVER-16357

Chunk migration pre-commit write concern should be configurable

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Sharding

      Currently, the w:majority wait that occurs before a chunk migration enters the critical section is not configurable. However, there are some circumstances where w:majority is not the most appropriate write concern (specifically, with arbiters, hidden members, delayed members, and/or priority: 0 members), even with the changes to w:majority in 2.8. In these cases an experienced and knowledgeable admin may want to use an alternate pre-commit write concern: weaker (understanding the potentially serious consequences and risks), stronger (e.g. to reduce problems with secondary reads after migrations), or a tagged write concern.

      Since the exact write concern used by secondaryThrottle can be configured, it would be similarly useful to configure the migration pre-commit write concern. However, given the serious risks associated with weakened pre-commit write concern, there should be correspondingly serious startup warnings (or similar) in this case.

      For example, consider a 5 member set with 2 data bearers and an arbiter locally, and 2 remote data bearing members (connected by a slower WAN link). The remote nodes exist to quickly service queries from clients in the remote DC (which are tolerant of inconsistent reads), so they are hidden and priority: 0, i.e. they are conceptually "peripheral" to the "core" of the replset (the local DC). The problem is that in this set w:majority (w:3) cannot be serviced by the "core" of the replset alone. This means that migrations are affected by the health of the "peripheral" nodes, and may fail if those nodes are down. The intention is that it should be possible to add "peripheral" nodes to a replset in a way that does not affect its operation. A hard-coded w:majority for migration pre-commit prevents this. The user might want w:2 to be used at migration pre-commit, the same as it would be in the absence of the two peripheral nodes.
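
      The arithmetic in this example can be checked with a small sketch. It is a simplified model, not server code: it assumes w:majority is a majority of the voting members (which matches the w:3 figure above), and the member names, the Member class, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str            # hypothetical member name, for illustration only
    site: str            # "local" (core DC) or "remote" (peripheral DC)
    data_bearing: bool   # False for an arbiter
    votes: int = 1


def majority_w(members):
    """Assumed model: w:majority is a majority of the voting members."""
    voters = sum(1 for m in members if m.votes > 0)
    return voters // 2 + 1


def satisfiable_by(members, site, w):
    """Can the data-bearing members at `site` alone acknowledge w writes?"""
    acks = sum(1 for m in members if m.site == site and m.data_bearing)
    return acks >= w


# The 5 member set described in the ticket.
REPLSET = [
    Member("local-a", "local", True),
    Member("local-b", "local", True),
    Member("local-arb", "local", False),   # arbiter: votes, holds no data
    Member("remote-a", "remote", True),    # hidden, priority: 0
    Member("remote-b", "remote", True),    # hidden, priority: 0
]

if __name__ == "__main__":
    w = majority_w(REPLSET)
    print("w:majority resolves to w:%d" % w)
    print("local DC alone can satisfy it:", satisfiable_by(REPLSET, "local", w))
    print("local DC alone can satisfy w:2:", satisfiable_by(REPLSET, "local", 2))
```

      Under this model the local DC has only two data bearers, so it can satisfy w:2 but never w:3, which is why the migration depends on at least one remote node.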

      SERVER-14403 helps, but does not solve the whole problem. If the 2 remote members were made non-voting, then w:majority would be w:2, which could be satisfied by the local data bearers. However, such a set would be more fragile than the set where all members are voting. The user may legitimately want the replset to continue to have a primary after the loss of a local data bearer and arbiter, even though migrations could never safely complete in such circumstances. If the two remote nodes are non-voting, then the set will not have a primary in this case.
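
      The trade-off can be illustrated by counting votes. This is a rough sketch under the assumption that a primary needs a strict majority of all configured votes. It ignores priorities and other election rules, and the member names and helper functions are hypothetical.

```python
def majority(n):
    """Strict majority of n votes."""
    return n // 2 + 1


def can_elect_primary(votes_by_member, down):
    """Assumed model: a primary needs a majority of all configured votes
    among the members that are still up."""
    total = sum(votes_by_member.values())
    up = sum(v for name, v in votes_by_member.items() if name not in down)
    return up >= majority(total)


# Option A: all five members vote (w:majority is a majority of 5 voters).
ALL_VOTING = {
    "local-a": 1, "local-b": 1, "local-arb": 1,
    "remote-a": 1, "remote-b": 1,
}

# Option B: the two remote members are made non-voting (3 voters remain).
REMOTES_NON_VOTING = dict(ALL_VOTING, **{"remote-a": 0, "remote-b": 0})

# Failure scenario from the ticket: one local data bearer and the arbiter are lost.
LOST = {"local-b", "local-arb"}

if __name__ == "__main__":
    for label, cfg in (("all voting", ALL_VOTING),
                       ("remotes non-voting", REMOTES_NON_VOTING)):
        voters = sum(1 for v in cfg.values() if v > 0)
        print(label,
              "| w:majority =", majority(voters),
              "| primary after loss:", can_elect_primary(cfg, LOST))
```

      With all members voting, three of five votes survive and the remaining local data bearer can stay primary. With the remotes non-voting, only one of three votes survives and no primary is possible.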

      Another example is that a user affected by secondary reads after migrations might prefer to specify a pre-commit write concern equal to the total number of data bearers. Although this will not remove all of the hazards of sharded secondary reads, it would at least remove those associated with replication lag.

      It can also be the case that the default write concern configured for a replica set is actually stronger than w:majority, which means that migrations would actually have weaker replication guarantees than normal writes.

            Assignee:
            schwerin@mongodb.com Andy Schwerin
            Reporter:
            kevin.pulo@mongodb.com Kevin Pulo
            Votes:
            0
            Watchers:
            12
