- Type: Build Failure
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: DHandles
- Storage Engines
- 8
- 2023-06-27 Lord of the Sprints
- v7.0, v6.0
This ticket was created to investigate the problem seen in BF-29047. It seems that we are hitting a heap-use-after-free problem:
#0 0x00007fa178bec93f in raise () from /lib64/libc.so.6
#1 0x00007fa178bd6c95 in abort () from /lib64/libc.so.6
#2 0x000055b5929649c7 in __sanitizer::Abort() () at /data/mci/4c5523d6b930f0c1f82f5452d6add3b6/toolchain-builder/tmp/build-llvm-v4.sh-FAX/llvm-project-llvmorg/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:151
#3 0x000055b592962ef1 in __sanitizer::Die() () at /data/mci/4c5523d6b930f0c1f82f5452d6add3b6/toolchain-builder/tmp/build-llvm-v4.sh-FAX/llvm-project-llvmorg/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:58
#4 0x000055b59294a414 in ~ScopedInErrorReport () at /data/mci/4c5523d6b930f0c1f82f5452d6add3b6/toolchain-builder/tmp/build-llvm-v4.sh-FAX/llvm-project-llvmorg/compiler-rt/lib/asan/asan_report.cpp:190
#5 0x000055b59294bfda in ReportGenericError () at /data/mci/4c5523d6b930f0c1f82f5452d6add3b6/toolchain-builder/tmp/build-llvm-v4.sh-FAX/llvm-project-llvmorg/compiler-rt/lib/asan/asan_report.cpp:478
#6 0x000055b59294cbdb in __asan_report_store8 () at /data/mci/4c5523d6b930f0c1f82f5452d6add3b6/toolchain-builder/tmp/build-llvm-v4.sh-FAX/llvm-project-llvmorg/compiler-rt/lib/asan/asan_rtl.cpp:126
#7 0x000055b5975e6ab5 in __wt_calloc (session=<optimized out>, number=<optimized out>, size=<optimized out>, retp=0x61c0005263f8) at src/third_party/wiredtiger/src/os_common/os_alloc.c:64
#8 0x000055b597297ee0 in __wt_block_extlist_init (session=0x7fa16b843fe0, el=0x61c0005263f8, name=0x55b58f79b260 <str> "live", extname=0x55b58f79b740 <str> "ckpt_avail", track_size=<optimized out>) at src/third_party/wiredtiger/src/block/block_ext.c:1299
#9 0x000055b59728286c in __block_extlist_setup (session=0x7fa16b843fe0, ci=0x61c000526148, name=<optimized out>) at src/third_party/wiredtiger/src/block/block_ckpt.c:24
#10 __wt_block_ckpt_init (session=0x7fa16b843fe0, ci=0x61c000526148, name=<optimized out>) at src/third_party/wiredtiger/src/block/block_ckpt.c:55
#11 __wt_block_checkpoint_load (session=<optimized out>, block=<optimized out>, addr=<optimized out>, addr_size=<optimized out>, root_addr=<optimized out>, root_addr_sizep=0x7fa142817f10, checkpoint=<optimized out>) at src/third_party/wiredtiger/src/block/block_ckpt.c:104
#12 0x000055b59727799e in __bm_checkpoint_load (bm=0x6130005ff500, session=0x7fa16b843fe0, addr=0x0, addr_size=140331492231487, root_addr=0x0, root_addr_sizep=0x7fa1428126f0, checkpoint=false) at src/third_party/wiredtiger/src/block_cache/block_mgr.c:171
#13 0x000055b597267aba in __wt_btree_open (session=<optimized out>, op_cfg=0x603000000000) at src/third_party/wiredtiger/src/btree/bt_handle.c:144
#14 0x000055b59725eca4 in __wt_conn_dhandle_open (session=<optimized out>, cfg=<optimized out>, flags=<optimized out>) at src/third_party/wiredtiger/src/conn/conn_dhandle.c:553
#15 0x000055b597825205 in __wt_session_get_dhandle (session=<optimized out>, uri=<optimized out>, checkpoint=<optimized out>, cfg=<optimized out>, flags=<optimized out>) at src/third_party/wiredtiger/src/session/session_dhandle.c:897
#16 0x000055b5978255d7 in __wt_session_get_dhandle (session=<optimized out>, uri=<optimized out>, checkpoint=<optimized out>, cfg=<optimized out>, flags=<optimized out>) at src/third_party/wiredtiger/src/session/session_dhandle.c:890
#17 0x000055b597823afb in __wt_session_get_btree_ckpt (session=<optimized out>, uri=<optimized out>, cfg=<optimized out>, flags=<optimized out>, hs_dhandlep=<optimized out>, ckpt_snapshot=<optimized out>) at src/third_party/wiredtiger/src/session/session_dhandle.c:406
#18 0x000055b597453d35 in __curstat_file_init (session=0x7fa16b843fe0, uri=0x634002b2080b "file:collection-21-9003058746915695187.wt", cfg=0x6030044880a0, cst=0x62200019c900) at src/third_party/wiredtiger/src/cursor/cur_stat.c:411
#19 __wt_curstat_init (session=<optimized out>, uri=<optimized out>, curjoin=<optimized out>, cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:580
#20 0x000055b59771b24d in __wt_curstat_colgroup_init (session=<optimized out>, uri=<optimized out>, cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/schema/schema_stat.c:27
#21 0x000055b597453964 in __wt_curstat_init (session=<optimized out>, uri=0x6310005b4800 "statistics:colgroup:collection-21-9003058746915695187", curjoin=<optimized out>, cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:578
#22 0x000055b597455034 in __wt_curstat_open (session=<optimized out>, uri=<optimized out>, other=<optimized out>, cfg=<optimized out>, cursorp=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:737
#23 0x000055b59771bc06 in __wt_curstat_table_init (session=<optimized out>, uri=<optimized out>, cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/schema/schema_stat.c:159
#24 0x000055b597453ebb in __wt_curstat_init (session=<optimized out>, uri=0x606000adfd80 "statistics:table:collection-21-9003058746915695187", curjoin=<optimized out>, cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:586
#25 0x000055b597455034 in __wt_curstat_open (session=<optimized out>, uri=<optimized out>, other=<optimized out>, cfg=<optimized out>, cursorp=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:737
#26 0x000055b597752b4a in __session_open_cursor_int (session=<optimized out>, uri=<optimized out>, owner=<optimized out>, other=<optimized out>, cfg=<optimized out>, hash_value=<optimized out>, cursorp=<optimized out>) at src/third_party/wiredtiger/src/session/session_api.c:600
#27 0x000055b59775630b in __session_open_cursor (wt_session=0x7fa16b843fe0, uri=0x606000adfd80 "statistics:table:collection-21-9003058746915695187", to_dup=0x0, config=0x603004488160 "statistics=(fast)", cursorp=<optimized out>) at src/third_party/wiredtiger/src/session/session_api.c:737
#28 0x000055b5971cbdd5 in mongo::WiredTigerUtil::getStatisticsValue (session=0x7fa16b843fe0, uri=..., config=..., statisticsKey=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:501
#29 0x000055b5971daffc in mongo::WiredTigerUtil::getIdentReuseSize (s=<optimized out>, uri=...) at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:580
#30 0x000055b599cfd58e in mongo::(anonymous namespace)::_appendRecordStore (opCtx=0x6160000b7980, collection=..., verbose=<optimized out>, scale=<optimized out>, numericOnly=false, result=0x7fa142823480) at src/mongo/db/stats/storage_stats.cpp:179
#31 mongo::appendCollectionStorageStats (opCtx=0x6160000b7980, nss=..., storageStatsSpec=..., result=<optimized out>, filterObj=...) at src/mongo/db/stats/storage_stats.cpp:399
#32 0x000055b597c262d3 in mongo::(anonymous namespace)::CmdCollStats::runWithRequestParser (this=<optimized out>, opCtx=<optimized out>, cmdObj=..., requestParser=..., result=...) at src/mongo/db/commands/dbcommands.cpp:461
#33 0x000055b597c24ec0 in mongo::BasicCommandWithRequestParser<mongo::(anonymous namespace)::CmdCollStats>::runWithReplyBuilder (this=<optimized out>, opCtx=<optimized out>, dbName=..., cmdObj=..., replyBuilder=<optimized out>) at src/mongo/db/commands.h:1082
#34 0x000055b59d16db74 in mongo::BasicCommandWithReplyBuilderInterface::Invocation::run (this=<optimized out>, opCtx=0x6160000b7980, result=<optimized out>) at src/mongo/db/commands.cpp:915
#35 0x000055b59d14290c in mongo::CommandHelpers::runCommandInvocation (opCtx=0x6160000b7980, request=..., invocation=0x611000ca0ac0, response=0x607001070750) at src/mongo/db/commands.cpp:186
#36 mongo::CommandHelpers::runCommandInvocation(std::shared_ptr<mongo::RequestExecutionContext>, std::shared_ptr<mongo::CommandInvocation>, bool)::$_1::operator()() const (this=<optimized out>) at src/mongo/db/commands.cpp:171
#37 mongo::makeReadyFutureWith<mongo::CommandHelpers::runCommandInvocation(std::shared_ptr<mongo::RequestExecutionContext>, std::shared_ptr<mongo::CommandInvocation>, bool)::$_1>(mongo::CommandHelpers::runCommandInvocation(std::shared_ptr<mongo::RequestExecutionContext>, std::shared_ptr<mongo::CommandInvocation>, bool)::$_1&&) (func=...) at src/mongo/util/future.h:1348
#38 mongo::CommandHelpers::runCommandInvocation (rec=..., invocation=..., useDedicatedThread=<optimized out>) at src/mongo/db/commands.cpp:170
#39 0x000055b592abd0ec in mongo::(anonymous namespace)::runCommandInvocation (rec=std::shared_ptr<mongo::RequestExecutionContext> (empty) = {...}, invocation=std::shared_ptr<mongo::CommandInvocation> (empty) = {...}) at src/mongo/db/service_entry_point_common.cpp:159
#40 0x000055b592aade1f in mongo::(anonymous namespace)::InvokeCommand::run()::$_2::operator()() const (this=<optimized out>) at src/mongo/db/service_entry_point_common.cpp:883
The test that produced this failure was recently re-enabled by MongoDB. The following comment in ticket BF-29036 describes how the test is configured:
Disable cursor caching in WiredTiger, and set the cache size to "1" in MongoDB. This forces all resources to be released when done. Lower the time after all references to a file in WiredTiger have been released before it is closed. Lower the interval at which WiredTiger checks for file handles to close. Lower the number of files open before WiredTiger starts looking for cursors to close. At least 1 file should always be open, so cursor sweeps will always run when scheduled.
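As a rough illustration of the settings described in that comment, the sketch below builds a WiredTiger connection configuration string that makes the sweep server as aggressive as possible. The option names (`cache_cursors`, and `close_idle_time`, `close_scan_interval`, `close_handle_minimum` under `file_manager`) are `wiredtiger_open` configuration options. The helper name `build_sweep_config` and the values of 1 are hypothetical, and the exact mapping from the MongoDB test suite parameters to these options is an assumption, not taken from BF-29036.

```python
def build_sweep_config(idle_time_s=1, scan_interval_s=1,
                       handle_minimum=1, cache_cursors=False):
    """Build a wiredtiger_open-style config string with aggressive sweep settings.

    Hypothetical helper for illustration only; the default values are assumptions.
    """
    # close_scan_interval must be at least 1 second in WiredTiger.
    if scan_interval_s < 1:
        raise ValueError("scan_interval_s must be >= 1")
    if idle_time_s < 0 or handle_minimum < 0:
        raise ValueError("idle_time_s and handle_minimum must be >= 0")

    # Settings that control when the sweep server closes idle file handles.
    file_manager = ",".join([
        f"close_handle_minimum={handle_minimum}",
        f"close_idle_time={idle_time_s}",
        f"close_scan_interval={scan_interval_s}",
    ])
    cursors = "true" if cache_cursors else "false"
    return f"cache_cursors={cursors},file_manager=({file_manager})"


if __name__ == "__main__":
    # The resulting string could be passed as the config argument of wiredtiger_open.
    print(build_sweep_config())
```

With settings like these, handles become eligible for closing almost immediately after their last reference is released, which makes any race between the sweep server and a concurrent dhandle open far more likely to surface.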
Given the behaviour of the test, it seems that the sweep server is exercised much more aggressively in WiredTiger by this test. It is worth investigating whether this BF failure is related to the sweep server, and how the sweep server interacts with the opening of a dhandle.
- is depended on by
  - SERVER-78135 enable concurrency_simultaneous_replication_wiredtiger_cursor_sweeps on macos-arm64 (Closed)
- is duplicated by
  - WT-11055 Sanitizer failures in block_close from sweep, accessing block already freed by drop (Closed)
  - WT-9569 Sweep server closes handle during checkpoint, asserts in block manager (Closed)
  - WT-11133 Correctly drop dhandles to avoid use-after-free error (Closed)
- is related to
  - WT-10734 Move cache of block handles to block manager layer (Closed)
- related to
  - WT-11818 Potential corruption of block list (Closed)