- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- None
- Affects Version/s: 3.2.15
- Component/s: None
- Environment: QEMU/KVM Virtual Machine
  Ubuntu 16.04.5
- ALL
This looks like the error in https://jira.mongodb.org/browse/SERVER-26103
I am running a Juju controller that had two other hosts in replication with it. During a mishap the other two hosts were wiped, and after rebooting and attempting to restore the database, the database I am running now shows the following errors:
Feb 25 18:34:27 hqosjuju systemd[1]: Started juju state database.
Feb 25 18:34:27 hqosjuju mongod[1410]: 2019-02-25T18:34:27.614+0000 W CONTROL [main] No SSL certificate validation can be performed since no CA file has been provided; please specify an sslCAFile parameter
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] MongoDB starting : pid=1410 port=37017 dbpath=/var/lib/juju/db 64-bit host=hqosjuju
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] db version v3.2.15
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] git version: e11e3c1b9c9ce3f7b4a79493e16f5e4504e01140
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] allocator: tcmalloc
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] modules: none
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] build environment:
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] distarch: x86_64
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] target_arch: x86_64
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] options: { net: { ipv6: true, port: 37017, ssl:Unknown macro: { PEMKeyFile}}, replication: { oplogSizeMB: 1024, replSet: "juju" }, security: { authorization: "enabled", keyFile: "/var/lib/juju/shared-secret" }, storage: { dbPath: "/var/lib/juju/db", engine: "wiredTiger", journal:
Unknown macro: { enabled}, wiredTiger: { engineConfig: { cacheSizeGB:
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] WiredTiger (0) [1551119667:660526][1410:0x7f4db7054bc0], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 540672: block header checksum of 1881944125 doesn't match expected checksum of 28637535
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] WiredTiger (0) [1551119667:660613][1410:0x7f4db7054bc0], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] WiredTiger (-31804) [1551119667:660627][1410:0x7f4db7054bc0], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] Fatal Assertion 28558
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten]***aborting after fassert() failure
Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] Got signal: 6 (Aborted).0x12a7701 0x12a6559 0x12a6e81 0x7f4db3ff5390 0x7f4db3c4f428 0x7f4db3c5102a 0x12209f2 0x1000efa 0x6f6234 0x6f6450 0x6f66a8 0x1356faf 0x13574fb 0x1353aed 0x13586c7 0x13721cb 0x13ab4d3 0x14357db 0x1435d1d 0x1435fdc 0x13b9bd1 0x14324f8 0x13f5c0e 0x13f5ceb 0x13a76e9 0xfe4fad 0xfdd474 0xeceb7e 0x73bd14 0x6f73a2 0x7f4db3c3a830 0x736e99
----- BEGIN BACKTRACE -----
{"backtrace":[Unknown macro: {"b"},{"b":"400000","o":"EA6559"},{"b":"400000","o":"EA6E81"},{"b":"7F4DB3FE4000","o":"11390"},{"b":"7F4DB3C1A000","o":"35428","s":"gsignal"},{"b":"7F4DB3C1A000","o":"3702A","s":"abort"},{"b":"400000","o":"E209F2","s":"ZN5mongo13fassertFailedEi"},{"b":"400000","o":"C00EFA"},{"b":"400000","o":"2F6234","s":"_wt_eventv"},{"b":"400000","o":"2F6450","s"
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x12a7701]
mongod(+0xEA6559) [0x12a6559]
mongod(+0xEA6E81) [0x12a6e81]
libpthread.so.0(+0x11390) [0x7f4db3ff5390]
libc.so.6(gsignal+0x38) [0x7f4db3c4f428]
libc.so.6(abort+0x16A) [0x7f4db3c5102a]
mongod(_ZN5mongo13fassertFailedEi+0xA2) [0x12209f2]
mongod(+0xC00EFA) [0x1000efa]
mongod(__wt_eventv+0x3D7) [0x6f6234]
mongod(__wt_err+0x9D) [0x6f6450]
mongod(__wt_panic+0x24) [0x6f66a8]
mongod(__wt_block_extlist_read+0x8F) [0x1356faf]
mongod(__wt_block_extlist_read_avail+0x2B) [0x13574fb]
mongod(__wt_block_checkpoint_load+0x26D) [0x1353aed]
mongod(+0xF586C7) [0x13586c7]
mongod(__wt_btree_open+0xB3B) [0x13721cb]
mongod(__wt_conn_btree_open+0x163) [0x13ab4d3]
mongod(__wt_session_get_btree+0xFB) [0x14357db]
mongod(__wt_session_get_btree+0x63D) [0x1435d1d]
mongod(__wt_session_get_btree_ckpt+0x14C) [0x1435fdc]
mongod(__wt_curfile_open+0x161) [0x13b9bd1]
mongod(+0x10324F8) [0x14324f8]
mongod(__wt_metadata_cursor_open+0x6E) [0x13f5c0e]
mongod(__wt_metadata_cursor+0x4B) [0x13f5ceb]
mongod(wiredtiger_open+0x1659) [0x13a76e9]
mongod(ZN5mongo18WiredTigerKVEngineC1ERKNSt7_cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_S8_mbbb+0xA6D) [0xfe4fad]
mongod(+0xBDD474) [0xfdd474]
mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x3FE) [0xeceb7e]
mongod(+0x33BD14) [0x73bd14]
mongod(main+0x732) [0x6f73a2]
libc.so.6(__libc_start_main+0xF0) [0x7f4db3c3a830]
mongod(_start+0x29) [0x736e99]
----- END BACKTRACE -----
Feb 25 18:34:27 hqosjuju systemd[1]: juju-db.service: Main process exited, code=dumped, status=6/ABRT
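The line that matters most in the log above is the WiredTiger checksum error on WiredTiger.wt. As a rough sketch for pulling those details out of a long journal dump, the Python below matches that one message and returns its fields. The regular expression and the function name `parse_checksum_error` are hypothetical and were written only against the log line in this report; other WiredTiger versions may word the message differently.

```python
import re
from typing import Optional

# Hypothetical pattern, based only on the checksum message quoted in this report.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>\S+), connection: read checksum error for "
    r"(?P<size>\d+)B block at offset (?P<offset>\d+): "
    r"block header checksum of (?P<found>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line: str) -> Optional[dict]:
    """Return the file name, block size, offset and checksums, or None if no match."""
    match = CHECKSUM_RE.search(line)
    if match is None:
        return None
    info = {"file": match.group("file")}
    for key in ("size", "offset", "found", "expected"):
        info[key] = int(match.group(key))
    return info


# The checksum line from the log in this report, copied verbatim.
LINE = (
    "Feb 25 18:34:27 hqosjuju mongod.37017[1410]: [initandlisten] WiredTiger (0) "
    "[1551119667:660526][1410:0x7f4db7054bc0], file:WiredTiger.wt, connection: "
    "read checksum error for 4096B block at offset 540672: block header checksum "
    "of 1881944125 doesn't match expected checksum of 28637535"
)

if __name__ == "__main__":
    print(parse_checksum_error(LINE))
```

Run over every line of the journal output, this would show whether the damage is limited to one block of WiredTiger.wt or whether other files report checksum errors as well.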
The database was running on a VM that was restarted incorrectly, which caused some disk corruption; the corruption has since been resolved. I currently have a backup of the corrupt VM, a backup of the database that was taken before the VM was corrupted, and a semi-working restored database on a new VM.
I was able to get the database working on a new VM, but because of an issue in Juju (not MongoDB related) the X.509 certificate is incorrect, so I cannot use that server. I want to restore to the original VM (with the right certificates and the disk corruption resolved) but cannot start mongod without hitting the errors above. It looks like the WiredTiger.wt file is corrupt. I have seen multiple forum posts and Jira issues where you repaired this kind of corruption but gave no insight into how, so I am posting the files here.
If there is a way to restore a database without starting a database, I would love to see documentation on it, as so far I can find none. I have both the BSON files from a dump and a restore with all of the .wt files (a ton of them). I have provided all of the WiredTiger files, as most previous posts requested.