Core Server / SERVER-18477

mongod 3.0.3 fails to start due to corrupted files after shutdown

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 3.0.2, 3.0.3
    • Component/s: Storage, WiredTiger
    • ALL
      1. Update with "yum install mongodb-org"
      2. Run "mongod --shutdown"
      3. Reboot the server
      4. Try to restart mongod; it fails
      5. Remove the lock file
      6. Try to repair mongod; it fails
      7. Sit down and cry

      To perform an update and a reboot requested by RackSpace, I ran "mongod --shutdown" to gracefully shut down a single-node replica set. After the reboot, the instance would not come up again; it fails with:

      2015-05-14T18:19:15.853+0000 I CONTROL  ***** SERVER RESTARTED *****
      2015-05-14T18:19:15.880+0000 I STORAGE  [initandlisten] exception in initAndListen: 98 Unable to create/open lock file: /var/lib/mongodb/mongod.lock errno:13 Permission denied Is a mongod instance already running?, terminating
      2015-05-14T18:19:15.880+0000 I CONTROL  [initandlisten] dbexit:  rc: 100
      2015-05-14T18:20:01.866+0000 I CONTROL  ***** SERVER RESTARTED *****
      2015-05-14T18:20:01.893+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
      2015-05-14T18:20:01.900+0000 E STORAGE  [initandlisten] WiredTiger (0) [1431627601:900293][13197:0x7f1412436c80], file:WiredTiger.wt, connection: read checksum error [4096B @ 12288, 3854199834 != 2096887615]
      2015-05-14T18:20:01.900+0000 E STORAGE  [initandlisten] WiredTiger (0) [1431627601:900344][13197:0x7f1412436c80], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
      2015-05-14T18:20:01.900+0000 E STORAGE  [initandlisten] WiredTiger (-31804) [1431627601:900360][13197:0x7f1412436c80], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2015-05-14T18:20:01.900+0000 I -        [initandlisten] Fatal Assertion 28558
      2015-05-14T18:20:01.911+0000 I CONTROL  [initandlisten]
       0xf51af9 0xef1831 0xed63b1 0xd7b3da 0x1380909 0x1380ac5 0x1380f64 0x12d595e 0x12d5df8 0x12d31b3 0x12d6b26 0x12eecd1 0x1316f6b 0x137fd83 0x134de4b 0x1314657 0xd65dcb 0xd63dc8 0xa81edd 0x8087f2 0x7d5414 0x7f14109efaf5 0x8065b9
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B51AF9"},{"b":"400000","o":"AF1831"},{"b":"400000","o":"AD63B1"},{"b":"400000","o":"97B3DA"},{"b":"400000","o":"F80909"},{"b":"400000","o":"F80AC5"},{"b":"400000","o":"F80F64"},{"b":"400000","o":"ED595E"},{"b":"400000","o":"ED5DF8"},{"b":"400000","o":"ED31B3"},{"b":"400000","o":"ED6B26"},{"b":"400000","o":"EEECD1"},{"b":"400000","o":"F16F6B"},{"b":"400000","o":"F7FD83"},{"b":"400000","o":"F4DE4B"},{"b":"400000","o":"F14657"},{"b":"400000","o":"965DCB"},{"b":"400000","o":"963DC8"},{"b":"400000","o":"681EDD"},{"b":"400000","o":"4087F2"},{"b":"400000","o":"3D5414"},{"b":"7F14109CE000","o":"21AF5"},{"b":"400000","o":"4065B9"}],"processInfo":{ "mongodbVersion" : "3.0.3", "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105", "uname" : { "sysname" : "Linux", "release" : "3.10.0-229.1.2.el7.x86_64", "version" : "#1 SMP Fri Mar 27 03:04:26 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "4A59CC17954D13BF1713A509D071A50E6BD1B3FF" }, { "b" : "7FFF1E4FE000", "elfType" : 3, "buildId" : "64DE62EAA6D0191EAD9358297D64406988D7ED66" }, { "b" : "7F141200E000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "12F30315D4F4A2FE58B1977405C8B5515861E66B" }, { "b" : "7F1411DA1000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "B54FE20525AE27B81127E04A2B006FD758E42E55" }, { "b" : "7F14119BA000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "D3ED02D380B3CDCF52EC6E23DD35CDF03B6E046A" }, { "b" : "7F14117B2000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "7376A07360DC57189A8F92B20AA4AA1CAEA80551" }, { "b" : "7F14115AE000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "4DFEE4EA9AE8FDD4C71BD4CCC0727222F19DF810" }, { "b" : "7F14112A7000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "405EACD649720B8668FFBBA197CBF030A7EF6296" }, { "b" : "7F1410FA5000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : 
"A1AA62B29765BE03A36BF927B047EEEF8696EEC6" }, { "b" : "7F1410D8F000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "5D3D7256AE68BCFF41E312A24825ED80ECA88A73" }, { "b" : "7F14109CE000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "C31FFE7942BFD77B2FCA8F9BD5709D387A86D3BC" }, { "b" : "7F141222A000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9866E1D2BA61EBB4CE4F009FACDAACC24EF3B804" }, { "b" : "7F1410782000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "34672D541C8C9C5C1C25CB4F3F332CC9D3E604AD" }, { "b" : "7F141049F000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "45CB7F6CD322F5B55FF8B635F7EC1578631CCAEA" }, { "b" : "7F141029B000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "3A1166709F88740C49E060731832E3FAD2DFB66B" }, { "b" : "7F1410069000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "23A2D854538903E2B84EF0882046DD95522C8B59" }, { "b" : "7F140FE53000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "E45643F27F3B3E960F3691AFC6EC27A98EF7B46B" }, { "b" : "7F140FC44000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "F4A3D5E7E23F871751CA8F250421F8CF83447AD2" }, { "b" : "7F140FA40000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7F140F826000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "AC596E865AF0D14B10F7B707F47D2031AD6D68DC" }, { "b" : "7F140F601000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, { "b" : "7F140F3A0000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "298B19C64B19995F2AA4DA7B852E90BA5302F630" }, { "b" : "7F140F17B000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "218D03D1F6CF1A099A4D467B5E8ECF4F2BF45750" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf51af9]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef1831]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xed63b1]
       mongod(+0x97B3DA) [0xd7b3da]
       mongod(__wt_eventv+0x489) [0x1380909]
       mongod(__wt_err+0x95) [0x1380ac5]
       mongod(__wt_panic+0x24) [0x1380f64]
       mongod(__wt_block_extlist_read+0x6E) [0x12d595e]
       mongod(__wt_block_extlist_read_avail+0x28) [0x12d5df8]
       mongod(__wt_block_checkpoint_load+0x193) [0x12d31b3]
       mongod(+0xED6B26) [0x12d6b26]
       mongod(__wt_btree_open+0xAB1) [0x12eecd1]
       mongod(__wt_conn_btree_get+0x19B) [0x1316f6b]
       mongod(__wt_session_get_btree+0x343) [0x137fd83]
       mongod(__wt_metadata_open+0x2B) [0x134de4b]
       mongod(wiredtiger_open+0xCD7) [0x1314657]
       mongod(_ZN5mongo18WiredTigerKVEngineC1ERKSsS2_bb+0x2EB) [0xd65dcb]
       mongod(+0x963DC8) [0xd63dc8]
       mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa81edd]
       mongod(_ZN5mongo13initAndListenEi+0x422) [0x8087f2]
       mongod(main+0x134) [0x7d5414]
       libc.so.6(__libc_start_main+0xF5) [0x7f14109efaf5]
       mongod(+0x4065B9) [0x8065b9]
      -----  END BACKTRACE  -----
      2015-05-14T18:20:01.911+0000 I -        [initandlisten]
      
      ***aborting after fassert() failure
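
The JSON backtrace and the bracketed addresses in the log above appear to describe the same frames: adding each entry's "b" value to its "o" value reproduces the address printed for that frame. Below is a minimal Python sketch of that arithmetic, assuming "b" is a load base and "o" an offset from it (the log itself does not label them). The helper name absolute_address is hypothetical.

```python
def absolute_address(base_hex: str, offset_hex: str) -> str:
    """Add a hex base and a hex offset; return a lowercase 0x-prefixed string."""
    return hex(int(base_hex, 16) + int(offset_hex, 16))


# Two entries copied from the "backtrace" array in the log above.
frames = [
    {"b": "400000", "o": "B51AF9"},       # first frame, inside the mongod binary
    {"b": "7F14109CE000", "o": "21AF5"},  # frame inside libc.so.6
]

# Assumption: absolute address = base ("b") + offset ("o").
addresses = [absolute_address(f["b"], f["o"]) for f in frames]
print(addresses)
```

If this reading is right, the two results should match the first raw address in the log and the one printed next to __libc_start_main.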
      

      Then I tried "mongod --repair" with the pertinent storage engine and dbpath parameters, and it also failed:

      2015-05-14T18:23:35.103+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),statistics=(fast),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
      2015-05-14T18:23:35.112+0000 E STORAGE  [initandlisten] WiredTiger (0) [1431627815:112079][13295:0x7f9055a2ec80], file:WiredTiger.wt, connection: read checksum error [4096B @ 12288, 3854199834 != 2096887615]
      2015-05-14T18:23:35.112+0000 E STORAGE  [initandlisten] WiredTiger (0) [1431627815:112150][13295:0x7f9055a2ec80], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
      2015-05-14T18:23:35.112+0000 E STORAGE  [initandlisten] WiredTiger (-31804) [1431627815:112164][13295:0x7f9055a2ec80], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2015-05-14T18:23:35.112+0000 I -        [initandlisten] Fatal Assertion 28558
      2015-05-14T18:23:35.122+0000 I CONTROL  [initandlisten]
       0xf51af9 0xef1831 0xed63b1 0xd7b3da 0x1380909 0x1380ac5 0x1380f64 0x12d595e 0x12d5df8 0x12d31b3 0x12d6b26 0x12eecd1 0x1316f6b 0x137fd83 0x134de4b 0x1314657 0xd65dcb 0xd63dc8 0xa81edd 0x8087f2 0x7d5414 0x7f9053fe7af5 0x8065b9
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B51AF9"},{"b":"400000","o":"AF1831"},{"b":"400000","o":"AD63B1"},{"b":"400000","o":"97B3DA"},{"b":"400000","o":"F80909"},{"b":"400000","o":"F80AC5"},{"b":"400000","o":"F80F64"},{"b":"400000","o":"ED595E"},{"b":"400000","o":"ED5DF8"},{"b":"400000","o":"ED31B3"},{"b":"400000","o":"ED6B26"},{"b":"400000","o":"EEECD1"},{"b":"400000","o":"F16F6B"},{"b":"400000","o":"F7FD83"},{"b":"400000","o":"F4DE4B"},{"b":"400000","o":"F14657"},{"b":"400000","o":"965DCB"},{"b":"400000","o":"963DC8"},{"b":"400000","o":"681EDD"},{"b":"400000","o":"4087F2"},{"b":"400000","o":"3D5414"},{"b":"7F9053FC6000","o":"21AF5"},{"b":"400000","o":"4065B9"}],"processInfo":{ "mongodbVersion" : "3.0.3", "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105", "uname" : { "sysname" : "Linux", "release" : "3.10.0-229.1.2.el7.x86_64", "version" : "#1 SMP Fri Mar 27 03:04:26 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "4A59CC17954D13BF1713A509D071A50E6BD1B3FF" }, { "b" : "7FFFF66FE000", "elfType" : 3, "buildId" : "64DE62EAA6D0191EAD9358297D64406988D7ED66" }, { "b" : "7F9055606000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "12F30315D4F4A2FE58B1977405C8B5515861E66B" }, { "b" : "7F9055399000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "B54FE20525AE27B81127E04A2B006FD758E42E55" }, { "b" : "7F9054FB2000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "D3ED02D380B3CDCF52EC6E23DD35CDF03B6E046A" }, { "b" : "7F9054DAA000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "7376A07360DC57189A8F92B20AA4AA1CAEA80551" }, { "b" : "7F9054BA6000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "4DFEE4EA9AE8FDD4C71BD4CCC0727222F19DF810" }, { "b" : "7F905489F000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "405EACD649720B8668FFBBA197CBF030A7EF6296" }, { "b" : "7F905459D000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : 
"A1AA62B29765BE03A36BF927B047EEEF8696EEC6" }, { "b" : "7F9054387000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "5D3D7256AE68BCFF41E312A24825ED80ECA88A73" }, { "b" : "7F9053FC6000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "C31FFE7942BFD77B2FCA8F9BD5709D387A86D3BC" }, { "b" : "7F9055822000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9866E1D2BA61EBB4CE4F009FACDAACC24EF3B804" }, { "b" : "7F9053D7A000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "34672D541C8C9C5C1C25CB4F3F332CC9D3E604AD" }, { "b" : "7F9053A97000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "45CB7F6CD322F5B55FF8B635F7EC1578631CCAEA" }, { "b" : "7F9053893000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "3A1166709F88740C49E060731832E3FAD2DFB66B" }, { "b" : "7F9053661000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "23A2D854538903E2B84EF0882046DD95522C8B59" }, { "b" : "7F905344B000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "E45643F27F3B3E960F3691AFC6EC27A98EF7B46B" }, { "b" : "7F905323C000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "F4A3D5E7E23F871751CA8F250421F8CF83447AD2" }, { "b" : "7F9053038000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7F9052E1E000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "AC596E865AF0D14B10F7B707F47D2031AD6D68DC" }, { "b" : "7F9052BF9000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, { "b" : "7F9052998000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "298B19C64B19995F2AA4DA7B852E90BA5302F630" }, { "b" : "7F9052773000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "218D03D1F6CF1A099A4D467B5E8ECF4F2BF45750" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf51af9]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef1831]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xed63b1]
       mongod(+0x97B3DA) [0xd7b3da]
       mongod(__wt_eventv+0x489) [0x1380909]
       mongod(__wt_err+0x95) [0x1380ac5]
       mongod(__wt_panic+0x24) [0x1380f64]
       mongod(__wt_block_extlist_read+0x6E) [0x12d595e]
       mongod(__wt_block_extlist_read_avail+0x28) [0x12d5df8]
       mongod(__wt_block_checkpoint_load+0x193) [0x12d31b3]
       mongod(+0xED6B26) [0x12d6b26]
       mongod(__wt_btree_open+0xAB1) [0x12eecd1]
       mongod(__wt_conn_btree_get+0x19B) [0x1316f6b]
       mongod(__wt_session_get_btree+0x343) [0x137fd83]
       mongod(__wt_metadata_open+0x2B) [0x134de4b]
       mongod(wiredtiger_open+0xCD7) [0x1314657]
       mongod(_ZN5mongo18WiredTigerKVEngineC1ERKSsS2_bb+0x2EB) [0xd65dcb]
       mongod(+0x963DC8) [0xd63dc8]
       mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa81edd]
       mongod(_ZN5mongo13initAndListenEi+0x422) [0x8087f2]
       mongod(main+0x134) [0x7d5414]
       libc.so.6(__libc_start_main+0xF5) [0x7f9053fe7af5]
       mongod(+0x4065B9) [0x8065b9]
      -----  END BACKTRACE  -----
      2015-05-14T18:23:35.123+0000 I -        [initandlisten]
      
      ***aborting after fassert() failure
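
Both the normal start and the repair attempt report a read checksum error on WiredTiger.wt. The sketch below is a small Python check that the two log lines point at the same block; it is not part of any MongoDB tooling. The regular expression is written against the line format seen in these logs only, and the field names first and second are deliberately neutral because the log does not say which value is the stored checksum and which is the computed one.

```python
import re

# Pattern for the "read checksum error" lines shown in the logs above.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>[^,\s]+), connection: read checksum error "
    r"\[(?P<size>\d+)B @ (?P<offset>\d+), (?P<first>\d+) != (?P<second>\d+)\]"
)


def parse_checksum_error(line: str):
    """Return (file, block size, offset, first, second), or None if no match."""
    m = CHECKSUM_RE.search(line)
    if m is None:
        return None
    return (m["file"], int(m["size"]), int(m["offset"]),
            int(m["first"]), int(m["second"]))


# Lines copied from the startup log and the --repair log.
startup_line = (
    "2015-05-14T18:20:01.900+0000 E STORAGE  [initandlisten] WiredTiger (0) "
    "[1431627601:900293][13197:0x7f1412436c80], file:WiredTiger.wt, connection: "
    "read checksum error [4096B @ 12288, 3854199834 != 2096887615]"
)
repair_line = (
    "2015-05-14T18:23:35.112+0000 E STORAGE  [initandlisten] WiredTiger (0) "
    "[1431627815:112079][13295:0x7f9055a2ec80], file:WiredTiger.wt, connection: "
    "read checksum error [4096B @ 12288, 3854199834 != 2096887615]"
)

startup_error = parse_checksum_error(startup_line)
repair_error = parse_checksum_error(repair_line)
same_block = startup_error == repair_error
print(startup_error, same_block)
```

Identical tuples would indicate that the repair run stops at the same 4096-byte block at offset 12288 as the normal start, before it gets far enough to repair anything.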
      

      How can I fix the corrupted files? I ask because I've seen other issues where you request that the files be attached to the issue so you can fix them, but I have several servers running mongo, so that could be a problem. Besides, RackSpace is requesting a reboot of most of my servers tomorrow night (or they will do it themselves).

      I know this issue has been seen when the mongod process was stopped unexpectedly, but in my case it shouldn't have happened, since I used "mongod --shutdown", right?

            Assignee: Ramon Fernandez Marina (ramon.fernandez@mongodb.com)
            Reporter: Juan Manuel Diego G (jdiego@digital-legends.com)
            Votes: 0
            Watchers: 8
