  1. Core Server
  2. SERVER-6984

Initial sync can fail, or break future replication, when updates shrink or grow docs in capped collections

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.2.0
    • Component/s: Storage
    • Storage Execution
    • Minor Change
    • ALL

      Right now we allow updates on docs in capped collections, as long as the docs don't grow past the size of their initial allocation. However, during initial sync the data is cloned, and the initial allocation size is lost on the secondary. So if inserts or updates that change a document's size occur during the cloning process, you can get an error saying "objects in a capped ns cannot grow" when the ops are replayed.

      For instance, this happens if the following sequence of ops happens during initial sync:

      • db.foo.insert( { a : 1, b : "big " } )

      • db.foo.update( { a : 1 }, { $unset : { b : 1 } } )

      If the smaller version of the doc is cloned during the initial sync, you will get an error message at the end of the initial sync when it goes to apply the ops:

      Sun Sep  9 19:52:48 [repl writer worker 1] ERROR: exception: failing update: objects in a capped ns cannot grow on: { ts: Timestamp 1347234740000|1, h: -4246821095103890152, op: "i", ns: "test.zzz", o: { _id: ObjectId('504d2bb46bbaa186f1cc7566'), a: 1.0, b: "big" } }
      Sun Sep  9 19:52:48 [repl writer worker 1]   Fatal Assertion 16361
      0x109c45b1b 0x10a09bde7 0x10a072239 0x109df1bc8 0x109e2c915 0x7fff8c2c5782 0x7fff8c2b21c1 
       0   mongod                              0x0000000109c45b1b _ZN5mongo15printStackTraceERSo + 43
       1   mongod                              0x000000010a09bde7 _ZN5mongo13fassertFailedEi + 151
       2   mongod                              0x000000010a072239 _ZN5mongo7replset21multiInitialSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE + 393
       3   mongod                              0x0000000109df1bc8 _ZN5mongo10threadpool6Worker4loopEv + 138
       4   mongod                              0x0000000109e2c915 thread_proxy + 229
       5   libsystem_c.dylib                   0x00007fff8c2c5782 _pthread_start + 327
       6   libsystem_c.dylib                   0x00007fff8c2b21c1 thread_start + 13
      Sun Sep  9 19:52:48 [repl writer worker 1] 
      
      ***aborting after fassert() failure
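
The failure above can be illustrated with a toy model. This is a minimal sketch, not MongoDB code: `CappedCollection`, `sizeOf`, and `applyOp` are hypothetical names, JSON length stands in for BSON size, and the model assumes (as the "failing update" on an `op: "i"` entry in the log suggests) that oplog entries are applied as upserts during initial sync.

```javascript
// Toy model only: each record remembers the bytes reserved for it at insert
// time, and an in-place update may not exceed that reservation.
class CappedCollection {
  constructor() {
    this.records = new Map(); // _id -> { doc, allocated }
  }

  // Stand-in for BSON size (assumption): length of the JSON encoding.
  static sizeOf(doc) {
    return JSON.stringify(doc).length;
  }

  insert(doc) {
    this.records.set(doc._id, { doc, allocated: CappedCollection.sizeOf(doc) });
  }

  // Replace the stored document in place; growing past the slot is an error.
  update(id, newDoc) {
    const rec = this.records.get(id);
    if (CappedCollection.sizeOf(newDoc) > rec.allocated) {
      throw new Error("objects in a capped ns cannot grow");
    }
    rec.doc = newDoc;
  }

  // Model assumption: every oplog entry carries the full document and is
  // applied as an upsert.
  applyOp(entry) {
    const id = entry.o._id;
    if (this.records.has(id)) this.update(id, entry.o);
    else this.insert(entry.o);
  }
}

const primary = new CappedCollection();
const oplog = [];

// The two ops from the report, both happening while the clone is in progress.
primary.insert({ _id: 1, a: 1, b: "big " });
oplog.push({ op: "i", o: { _id: 1, a: 1, b: "big " } });
primary.update(1, { _id: 1, a: 1 }); // $unset b: the doc shrinks in place
oplog.push({ op: "u", o: { _id: 1, a: 1 } });

// The cloner copies the current (small) doc, so the secondary's slot is
// sized for the small version and the original allocation is lost.
const secondary = new CappedCollection();
for (const { doc } of primary.records.values()) secondary.insert(doc);

// Replaying the oplog hits the insert of the larger doc first and fails.
let replayError = null;
try {
  for (const entry of oplog) secondary.applyOp(entry);
} catch (e) {
  replayError = e.message;
}
console.log(replayError); // prints "objects in a capped ns cannot grow"
```

In this model the primary keeps the larger allocation from the original insert, while the secondary only ever sees the shrunk document, which is why the replay cannot fit the earlier, larger version.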
      

      Even if initial sync manages to succeed, it's possible that future updates which grow a document on the primary will break replication, because there is no space for the doc to grow on the secondary. I've verified that this can occur, and the error message is similar to above.
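
The post-sync case can be sketched the same way. This is again a toy model under stated assumptions, not MongoDB's storage code: `docSize`, `makeSlot`, and `updateInPlace` are hypothetical helpers, and JSON length stands in for BSON size. It shows an update that fits in the primary's original allocation but not in the smaller slot the secondary got from cloning the shrunk document.

```javascript
// Stand-in for BSON size (assumption): length of the JSON encoding.
const docSize = (doc) => JSON.stringify(doc).length;

// A capped-collection slot: the stored doc plus the bytes reserved at insert.
const makeSlot = (doc) => ({ doc, allocated: docSize(doc) });

// In-place update that refuses to grow past the reserved bytes.
function updateInPlace(slot, newDoc) {
  if (docSize(newDoc) > slot.allocated) {
    throw new Error("objects in a capped ns cannot grow");
  }
  slot.doc = newDoc;
}

// Primary: inserted large, then shrunk. The slot keeps its original size.
const primarySlot = makeSlot({ _id: 1, a: 1, b: "big " });
updateInPlace(primarySlot, { _id: 1, a: 1 });

// Initial sync succeeds by cloning the shrunk doc into a smaller slot.
const secondarySlot = makeSlot(primarySlot.doc);

// Later, an update grows the doc again, still within the primary's slot.
const grown = { _id: 1, a: 1, b: "x" };
updateInPlace(primarySlot, grown); // fits on the primary

let secondaryError = null;
try {
  updateInPlace(secondarySlot, grown); // no room on the secondary
} catch (e) {
  secondaryError = e.message;
}
console.log(secondaryError); // prints "objects in a capped ns cannot grow"
```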

            Assignee:
            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            matulef Kevin Matulef
            Votes:
            11
            Watchers:
            23
