Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-18887

WiredTiger instances taking 2 hours to startup and transition to Secondary state

    • Type: Icon: Bug Bug
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 3.0.0, 3.0.1, 3.0.2, 3.0.3
    • Component/s: Performance, WiredTiger
    • None
    • Environment:
      Windows 2012 R2 VM running in Azure, 7GB RAM
    • Windows
    • Hide

      Restart the Windows Server MongoDB is running on. On restart the system will take about 2 hours to get to the 'Secondary' state during which times it appears to be scanning every collection and index file (~2000 for us).

      Show
      Restart the Windows Server MongoDB is running on. On restart the system will take about 2 hours to get to the 'Secondary' state during which times it appears to be scanning every collection and index file (~2000 for us).
    • Platform 7 08/10/15

      WiredTiger instances are taking a long time to startup to the point where it becomes a Secondary or Primary. The log output up to where the long delay occurs before the transition to the Secondary/Primary state is shown below:

      2015-06-09T00:41:34.920+0000 I CONTROL ***** SERVER RESTARTED *****
      2015-06-09T00:41:35.206+0000 I CONTROL Trying to start Windows service 'MongoDB'
      2015-06-09T00:41:35.224+0000 I STORAGE Service running
      2015-06-09T00:41:35.254+0000 W - [initandlisten] Detected unclean shutdown - m:/mongodb/data\mongod.lock is not empty.
      2015-06-09T00:41:35.254+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
      2015-06-09T00:41:35.255+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),

      During this time MongoDB appears to be reading every file in our database, we guess its some kind of file integrity check. I'm guessing this might be normal following an unclean shutdown in which case we need help with the shutdown issue . We've noticed a few Jira Issues relating to abnormal shutdowns due to the Windows Service manager and timeouts but they appear to be getting marked as fixed but as of yet our system keeps getting this problem. I have tried shutting down the service from a command prompt as well as the service manager and normal OS restarts but every one results in a long startup time. I'm not certain they all have produced the 'unclean shutdown' log message though.

      We are running a 5 node replica set, 2 MMAP, 2 WT and 1 arbiter. Our database size is ~280GB (MMAP) / 180GB (WT). We were planning to transition to WiredTiger but this and a few other problems are showstoppers for us at the moment.

        1. datadircontents.txt
          291 kB
          rogerdawson7
        2. dircontents.txt
          291 kB
          rogerdawson7
        3. journaldircontents.txt
          0.4 kB
          rogerdawson7
        4. mongowtstartup.log
          2.88 MB
          rogerdawson7

            Assignee:
            mark.benvenuto@mongodb.com Mark Benvenuto
            Reporter:
            roger.dawson@ga-digital.com Roger Dawson
            Votes:
            0 Vote for this issue
            Watchers:
            12 Start watching this issue

              Created:
              Updated:
              Resolved: