  1. Core Server
  2. SERVER-38094

Send prepareTransaction with write concern that waits for majority commit point but not committed snapshot

    • Fully Compatible
    • Sharding 2018-11-19
    • 60

      As of SERVER-35811, the stable timestamp cannot advance beyond the opTime of the prepare oplog entry of the oldest transaction that has not yet committed or aborted. Because majority write concern waits for the committed snapshot to advance, and the committed snapshot is tied to the stable timestamp, a prepareTransaction command cannot satisfy majority write concern until all earlier prepared transactions have committed or aborted. This is a problem for cross-shard transactions: the coordinator shard sends prepareTransaction to all participant shards with majority write concern and does not make a commit decision until it has heard from each of them (or times out and aborts). If two cross-shard transactions share two of the same shards and their prepare oplog entries are in different orders on those shards, neither coordinator receives all the responses it needs to make a decision, and the transactions deadlock until at least one times out.
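      The deadlock can be reproduced in a toy model. The sketch below is not MongoDB code; `ackable` and `run_coordinators` are hypothetical names, timeouts are ignored, and the only rule it encodes is the one described above: a prepare is acknowledged only once every earlier prepare on the same shard has committed or aborted.

```python
def ackable(shard_oplog, txn, resolved):
    """A prepare can be acknowledged only if every earlier prepare
    on this shard has already been committed or aborted."""
    idx = shard_oplog.index(txn)
    return all(earlier in resolved for earlier in shard_oplog[:idx])


def run_coordinators(oplogs):
    """oplogs maps shard name -> list of transaction ids in prepare order.
    Returns the set of transactions whose coordinator reached a decision."""
    txns = {t for log in oplogs.values() for t in log}
    resolved = set()
    progress = True
    while progress:
        progress = False
        for txn in sorted(txns - resolved):
            participants = [log for log in oplogs.values() if txn in log]
            # The coordinator decides only after every participant has acked.
            if all(ackable(log, txn, resolved) for log in participants):
                resolved.add(txn)
                progress = True
    return resolved


same_order = {"shardA": ["T1", "T2"], "shardB": ["T1", "T2"]}
opposite_order = {"shardA": ["T1", "T2"], "shardB": ["T2", "T1"]}

print(sorted(run_coordinators(same_order)))
print(sorted(run_coordinators(opposite_order)))
```

      With the same prepare order on both shards, both transactions reach a decision. With opposite orders, each transaction waits on the other and neither is resolved, which in this model corresponds to waiting for a timeout.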

      To work around this, each transaction coordinator can send prepareTransaction with a new write concern that waits for the prepare entry to be majority committed but does not wait for it to be in the committed snapshot. The write concern can then be satisfied without waiting for the stable timestamp to advance, while the entry still cannot be rolled back.
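      The difference between the two wait conditions can be sketched as follows. This is an illustrative model under stated assumptions, not the server implementation: `ShardState` and both predicate names are hypothetical, timestamps are plain integers, and the stable timestamp is modeled as the majority commit point capped at the oldest unresolved prepare.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShardState:
    # Highest oplog timestamp known to be replicated to a majority.
    majority_commit_point: int
    # Prepare timestamp of the oldest transaction that has neither
    # committed nor aborted, or None if there is no such transaction.
    oldest_unresolved_prepare: Optional[int]

    def stable_timestamp(self) -> int:
        # Assumption: the stable timestamp may not advance beyond
        # the oldest unresolved prepare.
        if self.oldest_unresolved_prepare is None:
            return self.majority_commit_point
        return min(self.majority_commit_point, self.oldest_unresolved_prepare)


def in_committed_snapshot(state: ShardState, entry_ts: int) -> bool:
    """Existing majority write concern: wait for the committed snapshot."""
    return entry_ts <= state.stable_timestamp()


def majority_committed(state: ShardState, entry_ts: int) -> bool:
    """Proposed write concern: wait only for the majority commit point."""
    return entry_ts <= state.majority_commit_point


# A later prepare at timestamp 15 sits behind an unresolved prepare at 10.
state = ShardState(majority_commit_point=20, oldest_unresolved_prepare=10)
print(in_committed_snapshot(state, 15), majority_committed(state, 15))
```

      In this model the later prepare entry is already majority committed, so it cannot be rolled back, yet it is not visible in the committed snapshot because the stable timestamp is held at the earlier prepare.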

            Assignee:
            jack.mulrow@mongodb.com Jack Mulrow
            Reporter:
            jack.mulrow@mongodb.com Jack Mulrow