-
Type: Bug
-
Resolution: Done
-
Priority: Critical - P2
-
Affects Version/s: 3.0.1
-
Component/s: WiredTiger
-
Fully Compatible
-
ALL
ISSUE SUMMARY
An interaction between the validation operation with the WiredTiger storage engine and the underlying system power cycling leads to a situation where MongoDB refuses to start because of data corruption.
USER IMPACT
mongod may refuse to start due to data corruption caused by the power cyle.
WORKAROUNDS
Omit use of validate.
AFFECTED VERSIONS
3.0.1
FIX VERSION
The fix is included in the 3.0.2 production release.
Original description
Reproduce as follows. WARNING: this code will crash your machine.
echo === create db rm -rf db; mkdir -p db mongod --dbpath db --storageEngine wiredTiger --logpath db.log --fork --syncdelay 0 sleep 1; sync # work around SERVER-17571 mongo --quiet --eval 'db.c.insert({x: 0})' echo === shut down killall -w mongod echo === validate/update loop mongod --dbpath db --storageEngine wiredTiger --logpath db.log --fork --syncdelay 0 mongo --quiet --eval ' for (var i=0; i<10; i++) { db.c.validate() db.c.update({}, {$inc: {x: 1}}) } ' # crash sudo bash -c 'echo 1 >/proc/sys/kernel/sysrq' sudo bash -c 'echo b >/proc/sysrq-trigger'
On subsequent startup recovery fails and mongod terminates:
2015-03-13T12:24:21.255-0400 E STORAGE [initandlisten] WiredTiger (0) [1426263861:255456][2012:0x7f8fbc47cb80], file:collection-2-8220618549132206910.wt, session.open_cursor: read checksum error [4096B @ 8192, 3367698403 != 1510178857] 2015-03-13T12:24:21.255-0400 E STORAGE [initandlisten] WiredTiger (0) [1426263861:255578][2012:0x7f8fbc47cb80], file:collection-2-8220618549132206910.wt, session.open_cursor: collection-2-8220618549132206910.wt: encountered an illegal file format or internal value 2015-03-13T12:24:21.255-0400 E STORAGE [initandlisten] WiredTiger (-31804) [1426263861:255598][2012:0x7f8fbc47cb80], file:collection-2-8220618549132206910.wt, session.open_cursor: the process must exit and restart: WT_PANIC: WiredTiger library panic 2015-03-13T12:24:21.255-0400 I - [initandlisten] Fatal Assertion 28558 2015-03-13T12:24:21.269-0400 I CONTROL [initandlisten] 0xf4fd39 0xef9f81 0xeddb71 0xd78fda 0x13807f0 0x1380ab5 0x1380f54 0x12d9112 0x12f0ade 0x12ee878 0x12ef752 0x131779b 0x137fe68 0x137ff7a 0x1324cf1 0x137e26a 0x1335e33 0x137e0b1 0x137e4c5 0xd73f86 0xd6d4cb 0xd67264 0xd67bd3 0xd67ed9 0xd61d32 0xce21a6 0xce52dc 0xd60ba6 0xa6f8bd 0x7e20c0 0x7e7704 0x7f8fbb07afe0 0x7e02c9 ----- BEGIN BACKTRACE ----- {"backtrace":[{"b":"400000","o":"B4FD39"},{"b":"400000","o":"AF9F81"},{"b":"400000","o":"ADDB71"},{"b":"400000","o":"978FDA"},{"b":"400000","o":"F807F0"},{"b":"400000","o":"F80AB5"},{"b":"400000","o":"F80F54"},{"b":"400000","o":"ED9112"},{"b":"400000","o":"EF0ADE"},{"b":"400000","o":"EEE878"},{"b":"400000","o":"EEF752"},{"b":"400000","o":"F1779B"},{"b":"400000","o":"F7FE68"},{"b":"400000","o":"F7FF7A"},{"b":"400000","o":"F24CF1"},{"b":"400000","o":"F7E26A"},{"b":"400000","o":"F35E33"},{"b":"400000","o":"F7E0B1"},{"b":"400000","o":"F7E4C5"},{"b":"400000","o":"973F86"},{"b":"400000","o":"96D4CB"},{"b":"400000","o":"967264"},{"b":"400000","o":"967BD3"},{"b":"400000","o":"967ED9"},{"b":"400000","o":"961D32"},{"b":"400000","o":"8E21A6"},{"b":"400000","o":"8E52DC"},{"b":"400000","o":"960BA6"},{"b":"400000","o":"66F8BD"},{"b":"400000","o":"3E20C0"},{"b":"400000","o":"3E7704"},{"b":"7F8FBB05B000","o":"1FFE0"},{"b":"400000","o":"3E02C9"}],"processInfo":{ "mongodbVersion" : "3.0.1-rc0", "gitVersion" : "3f999d764cdd307f779b27e3a28230b21bd43e25", "uname" : { "sysname" : "Linux", "release" : "3.17.4-301.fc21.x86_64", "version" : "#1 SMP Thu Nov 27 19:09:10 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFF0FBFE000", "elfType" : 3 }, { "b" : "7F8FBC052000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7F8FBBE4A000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7F8FBBC46000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7F8FBB937000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F8FBB62F000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7F8FBB418000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F8FBB05B000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7F8FBC26E000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }} mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf4fd39] mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef9f81] mongod(_ZN5mongo13fassertFailedEi+0x61) [0xeddb71] mongod(+0x978FDA) [0xd78fda] mongod(+0xF807F0) [0x13807f0] mongod(__wt_err+0x95) [0x1380ab5] mongod(__wt_panic+0x24) [0x1380f54] mongod(__wt_bm_read+0x72) [0x12d9112] mongod(__wt_bt_read+0x7E) [0x12f0ade] mongod(__wt_btree_tree_open+0x58) [0x12ee878] mongod(__wt_btree_open+0xD72) [0x12ef752] mongod(__wt_conn_btree_get+0x19B) [0x131779b] mongod(__wt_session_get_btree+0x2D8) [0x137fe68] mongod(__wt_session_get_btree_ckpt+0xCA) [0x137ff7a] mongod(__wt_curfile_open+0xE1) [0x1324cf1] mongod(__wt_open_cursor+0x26A) [0x137e26a] mongod(__wt_curtable_open+0x2E3) [0x1335e33] mongod(__wt_open_cursor+0xB1) [0x137e0b1] mongod(+0xF7E4C5) [0x137e4c5] mongod(_ZN5mongo17WiredTigerSession9getCursorERKSsmb+0x156) [0xd73f86] mongod(_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE+0x4B) [0xd6d4cb] mongod(_ZN5mongo21WiredTigerRecordStore8IteratorC2ERKS0_PNS_16OperationContextERKNS_8RecordIdERKNS_20CollectionScanParams9DirectionEb+0x64) [0xd67264] mongod(_ZNK5mongo21WiredTigerRecordStore11getIteratorEPNS_16OperationContextERKNS_8RecordIdERKNS_20CollectionScanParams9DirectionE+0x43) [0xd67bd3] mongod(_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextERKNS_10StringDataES5_bllPNS_28CappedDocumentDeleteCallbackEPNS_20WiredTigerSizeStorerE+0x2B9) [0xd67ed9] mongod(_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextERKNS_10StringDataES5_RKNS_17CollectionOptionsE+0x132) [0xd61d32] mongod(_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKSsb+0x276) [0xce21a6] mongod(_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE+0x69C) [0xce52dc] mongod(+0x960BA6) [0xd60ba6] mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa6f8bd] mongod(_ZN5mongo13initAndListenEi+0x2F0) [0x7e20c0] mongod(main+0x134) [0x7e7704] libc.so.6(__libc_start_main+0xF0) [0x7f8fbb07afe0] mongod(+0x3E02C9) [0x7e02c9]
Note that this affects 3.0.1-rc0 and not 3.0.0, so it appears to be a regression. This scenario seems to require the validate(), so I suspect it could be related to the new functionality in SERVER-17481 relating to validate - either in WT or in the integration layer.
- is related to
-
SERVER-17481 WiredTigerRecordStore::validate should call WT_SESSION::verify
- Closed
- related to
-
SERVER-17152 WiredTiger file corrupted during power cycle test
- Closed