Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-8899

upgrading ShardChunkManager without write lock on setShardVersion can cause errors on writes

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 2.5.0
    • Affects Version/s: 2.4.0-rc2
    • Component/s: Sharding
    • None
    • ALL

      When performing a write on a mongod with sharding enabled, we check the shard version of the message (handlePossibleShardedMessage()) before getting a write context (Client::Context) (see instance.cpp). Getting a write context also performs a version check. If the version changes between these checks, an error is thrown instead of the write being queued for writeback, and so the write fails.

      see http://buildlogs.mongodb.org/OS%20X%2010.5%2064-bit%20DEBUG/builds/1862/test/sharding/mrShardedOutput.js?mode=raw

       m30001| Wed Mar  6 00:14:13.584 [conn4] moveChunk migrate commit accepted by TO-shard: { active: true, ns: "test.foo", from: "localhost:30001", min: { a: 303.7011115811765 }, max: { a: 597.0397985074669 }, shardKeyPattern: { a: 1.0 }, state: "done", counts: { cloned: 759, clonedBytes: 816684, catchup: 0, steady: 0 }, ok: 1.0 }
       m30001| Wed Mar  6 00:14:13.585 [conn4] moveChunk updating self version to: 3|1||5136d0911027824dad63777b through { a: 597.0397985074669 } -> { a: 998.8787742331624 } for collection 'test.foo'
       m30000| Wed Mar  6 00:14:13.586 [conn13] info PageFaultRetryableSection will not yield, already locked upon reaching
       m30999| Wed Mar  6 00:14:13.587 [WriteBackListener-localhost:30001] new version change detected, 1 writebacks processed previously
       m30999| Wed Mar  6 00:14:13.587 [WriteBackListener-localhost:30001] writeback failed because of stale config, retrying attempts: 1
       m30999| Wed Mar  6 00:14:13.587 [WriteBackListener-localhost:30001] writeback error : { singleShard: "localhost:30001", err: "cannot queue a writeback operation to the writeback queue", code: 9517, n: 0, connectionId: 7, ok: 1.0 }
       m30001| Wed Mar  6 00:14:13.589 [conn4] about to log metadata event: { _id: "bs-mm2.local-2013-03-06T05:14:13-5136d0a5549d76650d9e1858", server: "bs-mm2.local", clientAddr: "127.0.0.1:64877", time: new Date(1362546853589), what: "moveChunk.commit", ns: "test.foo", details: { min: { a: 303.7011115811765 }, max: { a: 597.0397985074669 }, from: "shard0001", to: "shard0000" } }
       m30001| Wed Mar  6 00:14:13.589 [conn4] MigrateFromStatus::done About to acquire global write lock to exit critical section
       m30001| Wed Mar  6 00:14:13.589 [conn4] MigrateFromStatus::done Global lock acquired
       m30001| Wed Mar  6 00:14:13.589 [conn4] forking for cleanup of chunk data
       m30001| Wed Mar  6 00:14:13.589 [conn4] MigrateFromStatus::done About to acquire global write lock to exit critical section
       m30001| Wed Mar  6 00:14:13.590 [conn4] MigrateFromStatus::done Global lock acquired
       m30001| Wed Mar  6 00:14:13.590 [cleanupOldData-5136d0a5549d76650d9e1859]  (start) waiting to cleanup test.foo from { a: 303.7011115811765 } -> { a: 597.0397985074669 }, # cursors remaining: 0
       m30001| Wed Mar  6 00:14:13.591 [conn4] distributed lock 'test.foo/bs-mm2.local:30001:1362546833:2078045840' unlocked. 
       m30001| Wed Mar  6 00:14:13.591 [conn4] about to log metadata event: { _id: "bs-mm2.local-2013-03-06T05:14:13-5136d0a5549d76650d9e185a", server: "bs-mm2.local", clientAddr: "127.0.0.1:64877", time: new Date(1362546853591), what: "moveChunk.from", ns: "test.foo", details: { min: { a: 303.7011115811765 }, max: { a: 597.0397985074669 }, step1 of 6: 0, step2 of 6: 5, step3 of 6: 113, step4 of 6: 1044, step5 of 6: 64, step6 of 6: 0 } }
       m30001| Wed Mar  6 00:14:13.592 [conn4] command admin.$cmd command: { moveChunk: "test.foo", from: "localhost:30001", to: "localhost:30000", fromShard: "shard0001", toShard: "shard0000", min: { a: 303.7011115811765 }, max: { a: 597.0397985074669 }, maxChunkSizeBytes: 1048576, shardId: "test.foo-a_303.7011115811765", configdb: "localhost:30000", secondaryThrottle: true, waitForDelete: false } ntoreturn:1 keyUpdates:0 numYields: 5 locks(micros) W:85 r:21604 w:87 reslen:37 1230ms
       m30999| Wed Mar  6 00:14:13.592 [Balancer] moveChunk result: { ok: 1.0 }
       m30001| Wed Mar  6 00:14:13.611 [cleanupOldData-5136d0a5549d76650d9e1859] waiting to remove documents for test.foo from { a: 303.7011115811765 } -> { a: 597.0397985074669 }
       m30001| Wed Mar  6 00:14:13.611 [cleanupOldData-5136d0a5549d76650d9e1859] moveChunk starting delete for: test.foo from { a: 303.7011115811765 } -> { a: 597.0397985074669 }
       m30999| Wed Mar  6 00:14:13.624 [WriteBackListener-localhost:30001] ChunkManager: time to load chunks for test.foo: 34ms sequenceNumber: 8 version: 3|1||5136d0911027824dad63777b based on: 2|5||5136d0911027824dad63777b
       m30999| Wed Mar  6 00:14:13.625 [WriteBackListener-localhost:30001]     setShardVersion  shard0000 localhost:30000  test.foo  { setShardVersion: "test.foo", configdb: "localhost:30000", version: Timestamp 3000|0, versionEpoch: ObjectId('5136d0911027824dad63777b'), serverID: ObjectId('5136d08f1027824dad637779'), shard: "shard0000", shardHost: "localhost:30000" } 0x106072fc0 8
       m30999| Wed Mar  6 00:14:13.628 [conn1]     setShardVersion  shard0000 localhost:30000  test.foo  { setShardVersion: "test.foo", configdb: "localhost:30000", version: Timestamp 3000|0, versionEpoch: ObjectId('5136d0911027824dad63777b'), serverID: ObjectId('5136d08f1027824dad637779'), shard: "shard0000", shardHost: "localhost:30000" } 0x106072620 8
      
      ========> Saved total of 3000 documents
      
       m30000| Wed Mar  6 00:14:13.691 [conn6] insert test.foo keyUpdates:0 exception: [test.foo] shard version not ok in Client::Context: version mismatch detected for test.foo, stored major version 3 does not match received 2 ( ns : test.foo, received : 2|0||5136d0911027824dad63777b, wanted : 3|0||5136d0911027824dad63777b, send ) code:13388 locks(micros) w:64256 67ms
       m30999| Wed Mar  6 00:14:13.693 [WriteBackListener-localhost:30001]       setShardVersion success: { oldVersion: Timestamp 2000|0, oldVersionEpoch: ObjectId('5136d0911027824dad63777b'), ok: 1.0 }
      

      The last two lines show the result of a setShardVersion and insert hitting m30000 at the same time (shard0000).

      Possible solution - read lock during ShardingState::trySetVersion()?

        1. failpoints.patch
          2 kB
        2. race_ssv_and_insert.js
          4 kB

            Assignee:
            greg_10gen Greg Studer
            Reporter:
            greg_10gen Greg Studer
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: