-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Compaction
-
Storage Engines
-
8
-
Asparagus-StorEng - 2023-10-31, c(3x10^8)-StorEng - 2023-11-14
A block corruption was found when running compact.js with the simulate_crash_concurrency_replication test suite. This requires further investigation as to what caused the corruption.
Error
[SimulateCrash:job0] | 2023-10-12T07:53:14.758+00:00 E WT 22435 [initandlisten] "WiredTiger error message","attr":{"error":0,"message":{"ts_sec":1697097194,"ts_usec":758649,"thread":"18170:0x7f55c07bb240","session_dhandle_name":"file:collection-16-4528025945388683402.wt","session_name":"WT_CURSOR.next","category":"WT_VERB_DEFAULT","category_id":10,"verbose_level":"ERROR","verbose_level_id":-3,"msg":"__bm_corrupt_dump:82:{0: 2237505536, 1052672, 0xc1eff022}: (chunk 1 of 1028): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "}}
Backtrace
Thread 1 (Thread 0x7ff22b698240 (LWP 32183)): #0 0x00007ff23e00d93f in raise () from /lib64/libc.so.6 #1 0x00007ff23dff7c95 in abort () from /lib64/libc.so.6 #2 0x00007ff240565007 in mongo::(anonymous namespace)::callAbort () at src/mongo/util/assert_util.cpp:71 #3 0x00007ff240566215 in mongo::fassertFailedWithLocation (msgid=<optimized out>, msgid@entry=50853, file=<optimized out>, file@entry=0x7ff233586bf0 "src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp", line=<optimized out>, line@entry=768) at src/mongo/util/assert_util.cpp:200 #4 0x00007ff23363a61d in mongo::fassertWithLocation (line=768, file=0x7ff233586bf0 "src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp", testOK=false, msgid=50853) at src/mongo/util/assert_util.h:357 #5 mongo::(anonymous namespace)::mdb_handle_error_with_startup_suppression (handler=0x55f03107d680, session=<optimized out>, errorCode=-31804, message=0x7ffcddfd81e0 "{\"ts_sec\":1697096028,\"ts_usec\":828609,\"thread\":\"32183:0x7ff22b698240\",\"session_dhandle_name\":\"file:collection-14-8646122989813992675.wt\",\"session_name\":\"WT_CURSOR.next\",\"category\":\"WT_VERB_DEFAULT\",\"c"...) at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:768 #6 0x00007ff22d5dd8f7 in __eventv (session=session@entry=0x55f031668780, is_json=<optimized out>, error=error@entry=-31804, func=func@entry=0x7ff22d3e5f80 <__PRETTY_FUNCTION__.0> "__wt_block_read_off", line=line@entry=253, category=category@entry=WT_VERB_DEFAULT, level=WT_VERBOSE_ERROR, fmt=0x7ff22d3dd1c8 "the process must exit and restart", ap=0x7ffcddfd9260) at src/third_party/wiredtiger/src/support/err.c:450 #7 0x00007ff22d5de515 in __wt_panic_func (session=session@entry=0x55f031668780, error=error@entry=-31802, func=func@entry=0x7ff22d3e5f80 <__PRETTY_FUNCTION__.0> "__wt_block_read_off", line=line@entry=253, category=category@entry=WT_VERB_DEFAULT, fmt=fmt@entry=0x7ff22d3e41f2 "%s: fatal read error") at src/third_party/wiredtiger/src/support/err.c:552 #8 0x00007ff22d426344 in __wt_block_read_off (session=session@entry=0x55f031668780, block=0x55f0363fb000, buf=buf@entry=0x7ffcddfd95a0, objectid=0, offset=2355208192, size=1052672, checksum=1223156613) at src/third_party/wiredtiger/src/block/block_read.c:253 #9 0x00007ff22d425fcc in __wt_bm_read (bm=0x55f031618a80, session=0x55f031668780, buf=0x7ffcddfd95a0, addr=<optimized out>, addr_size=<optimized out>) at src/third_party/wiredtiger/src/block/block_read.c:45 #10 0x00007ff22d42f780 in __wt_blkcache_read (session=session@entry=0x55f031668780, buf=buf@entry=0x7ffcddfd95a0, addr=0x7ffcddfd9651 "\343\b\245\331\300\301\344H\347\303E\362\177", addr_size=11) at src/third_party/wiredtiger/src/block_cache/block_io.c:111 #11 0x00007ff22d465031 in __page_read (session=session@entry=0x55f031668780, ref=ref@entry=0x55f03d62e040, flags=flags@entry=1684) at src/third_party/wiredtiger/src/btree/bt_read.c:183 #12 0x00007ff22d46668e in __wt_page_in_func (session=session@entry=0x55f031668780, ref=ref@entry=0x55f03d62e040, flags=flags@entry=1684) at src/third_party/wiredtiger/src/btree/bt_read.c:314 #13 0x00007ff22d48f778 in __wt_page_swap_func (session=session@entry=0x55f031668780, held=held@entry=0x0, want=want@entry=0x55f03d62e040, flags=1684) at src/third_party/wiredtiger/src/include/btree_inline.h:2201 #14 0x00007ff22d48fb81 in __wt_page_swap_func (flags=1684, want=<optimized out>, held=0x0, session=0x55f031668780) at src/third_party/wiredtiger/src/btree/bt_walk.c:467 #15 __tree_walk_internal (session=session@entry=0x55f031668780, refp=0x55f036430ae0, walkcntp=walkcntp@entry=0x0, skip_func=skip_func@entry=0x7ff22d43af10 <__wt_btcur_skip_page>, func_cookie=func_cookie@entry=0x0, flags=1552, flags@entry=1040) at src/third_party/wiredtiger/src/btree/bt_walk.c:467 #16 0x00007ff22d490210 in __wt_tree_walk_custom_skip (session=session@entry=0x55f031668780, refp=<optimized out>, skip_func=skip_func@entry=0x7ff22d43af10 <__wt_btcur_skip_page>, func_cookie=func_cookie@entry=0x0, flags=flags@entry=1040) at src/third_party/wiredtiger/src/btree/bt_walk.c:560 #17 0x00007ff22d43c0b4 in __wt_btcur_next (cbt=cbt@entry=0x55f036430900, truncating=truncating@entry=false) at src/third_party/wiredtiger/src/btree/bt_curnext.c:925 #18 0x00007ff22d4db98a in __curfile_next (cursor=0x55f036430900) at src/third_party/wiredtiger/src/cursor/cur_file.c:188 #19 0x00007ff23360994d in operator() (__closure=<optimized out>, __closure=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:2051 #20 operator() (__closure=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:2051 #21 mongo::wiredTigerPrepareConflictRetry<mongo::WiredTigerRecordStoreCursorBase::next()::<lambda()> > (f=..., opCtx=0x55f0310758c0) at src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h:76 #22 mongo::WiredTigerRecordStoreCursorBase::next (this=0x55f0367fe480) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:2050 #23 0x00007ff233615b2d in mongo::UnyieldableCollectionIterator::getNext (this=<optimized out>) at src/mongo/db/storage/collection_truncate_markers.h:416 #24 0x00007ff23870a3c3 in mongo::CollectionTruncateMarkers::createMarkersByScanning(mongo::OperationContext*, mongo::CollectionTruncateMarkers::CollectionIterator&, mongo::NamespaceString const&, long, std::function<mongo::CollectionTruncateMarkers::RecordIdAndWallTime (mongo::Record const&)>) (opCtx=opCtx@entry=0x55f0310758c0, collectionIterator=..., ns="local.oplog.rs", minBytesPerMarker=minBytesPerMarker@entry=17043521, getRecordIdAndWallTime=...) at src/mongo/db/storage/collection_truncate_markers.cpp:195 #25 0x00007ff23870cb25 in mongo::CollectionTruncateMarkers::createFromCollectionIterator(mongo::OperationContext*, mongo::CollectionTruncateMarkers::CollectionIterator&, mongo::NamespaceString const&, long, std::function<mongo::CollectionTruncateMarkers::RecordIdAndWallTime (mongo::Record const&)>, boost::optional<long>) (opCtx=opCtx@entry=0x55f0310758c0, collectionIterator=..., ns="local.oplog.rs", minBytesPerMarker=minBytesPerMarker@entry=17043521, getRecordIdAndWallTime=..., numberOfMarkersToKeepForOplog=...) at src/mongo/db/storage/collection_truncate_markers.cpp:456 #26 0x00007ff23360fa11 in mongo::WiredTigerRecordStore::OplogTruncateMarkers::createOplogTruncateMarkers (opCtx=opCtx@entry=0x55f0310758c0, rs=rs@entry=0x55f0367fe000, ns="local.oplog.rs") at /opt/mongodbtoolchain/revisions/69f4f0673ffcb290ce2307560a4883ecf2ad138c/stow/gcc-v4.35T/include/c++/11.3.0/bits/std_function.h:211 #27 0x00007ff23360fef8 in mongo::WiredTigerRecordStore::postConstructorInit (this=0x55f0367fe000, opCtx=0x55f0310758c0, ns=...) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:750 #28 0x00007ff2335d8f3a in mongo::WiredTigerKVEngine::getRecordStore (this=0x55f03107d600, opCtx=0x55f0310758c0, nss="local.oplog.rs", ident=..., options=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:1604 #29 0x00007ff22fedf8ce in mongo::StorageEngineImpl::_initCollection (this=this@entry=0x55f031618d80, opCtx=<optimized out>, opCtx@entry=0x55f0310758c0, catalogId=Traceback (most recent call last): File "buildscripts/gdb/mongo_printers.py", line 284, in to_string long_id = int(self.val['_data']['longId']['id']) gdb.error: There is no member or method named longId. , nss="local.oplog.rs", forRepair=<optimized out>, minVisibleTs=..., minVisibleTs@entry=..., minValidTs=...) at src/mongo/base/string_data.h:65 #30 0x00007ff22fee842d in mongo::StorageEngineImpl::loadCatalog (this=this@entry=0x55f031618d80, opCtx=opCtx@entry=0x55f0310758c0, stableTs=..., lastShutdownState=lastShutdownState@entry=mongo::StorageEngine::LastShutdownState::kUnclean) at src/mongo/db/storage/storage_engine_impl.cpp:427 #31 0x00007ff22feea151 in mongo::StorageEngineImpl::StorageEngineImpl (this=this@entry=0x55f031618d80, opCtx=opCtx@entry=0x55f0310758c0, engine=std::unique_ptr<mongo::KVEngine> = {...}, options=...) at src/mongo/db/storage/storage_engine_impl.cpp:164 #32 0x00007ff22ff18c72 in std::make_unique<mongo::StorageEngineImpl, mongo::OperationContext*&, std::unique_ptr<mongo::WiredTigerKVEngine, std::default_delete<mongo::WiredTigerKVEngine> >, mongo::StorageEngineOptions&> () at /opt/mongodbtoolchain/revisions/69f4f0673ffcb290ce2307560a4883ecf2ad138c/stow/gcc-v4.35T/include/c++/11.3.0/bits/unique_ptr.h:962 #33 mongo::(anonymous namespace)::WiredTigerFactory::create (this=<optimized out>, opCtx=0x55f0310758c0, params=..., lockFile=0x55f031505618) at src/mongo/db/storage/wiredtiger/wiredtiger_init.cpp:167 #34 0x00007ff2362573de in mongo::initializeStorageEngine (opCtx=0x55f0310758c0, initFlags=initFlags@entry=0) at src/mongo/db/storage/storage_engine_init.cpp:229 #35 0x00007ff23f937b92 in mongo::(anonymous namespace)::_initAndListen (serviceContext=serviceContext@entry=0x55f03130cb40, listenPort=<optimized out>) at /opt/mongodbtoolchain/revisions/69f4f0673ffcb290ce2307560a4883ecf2ad138c/stow/gcc-v4.35T/include/c++/11.3.0/bits/unique_ptr.h:173 #36 0x00007ff23f93a8be in mongo::(anonymous namespace)::initAndListen (listenPort=<optimized out>, service=0x55f03130cb40) at src/mongo/db/mongod_main.cpp:1057 #37 mongo::mongod_main (argc=9, argv=0x7ffcddfdc018) at src/mongo/db/mongod_main.cpp:1885 #38 0x000055f02ea4ee95 in main (argc=<optimized out>, argv=<optimized out>) at src/mongo/db/mongod.cpp:45
The Workload
The compact.js workload has 4 states:
- Insert large documents into a collection
- Remove 1/3 of the documents
- Compact
- Query some db stats as a sanity check
The Test
This test sends a SIGSTOP signal to the server, copies the data files and sends a SIGCONT signal. On the side we startup the server on these copied data files which simulates an unclean shutdown and we run validation.
- is caused by
-
SERVER-82781 Simulate crash test hook may leave behind part of file when copying data
- Closed
-
SERVER-60728 Improved MDB crash recovery testing
- Closed
- related to
-
SERVER-80120 Add the new compaction "freeSpaceTargetMB" option to the compact API
- Closed
-
WT-11820 Create a test that takes a backup during background compaction
- Closed
-
WT-11851 Create a test where WiredTiger crashes during background compaction
- Closed