- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Environment: We found this bug in WiredTiger version 11.3.0
- Storage Engines
- StorEng - 2024-12-10
Bug Report: Wiredtiger-bug-0
Hi, WiredTiger developers and maintainers. We found a crash consistency issue while testing the WiredTiger storage engine with a crash consistency bug detection tool that we are developing.
To recover the information in the WiredTiger.wt file after a crash, WiredTiger creates a WiredTiger.turtle.set file to retain the most recent checkpoint information. However, if WiredTiger crashes right after opening WiredTiger.turtle.set, the next attempt to open the database panics with the error "fatal turtle file read error: WT_TRY_SALVAGE: database corruption detected".
Expected Behavior
If such a crash occurs, the leftover WiredTiger.turtle.set file should be erased, and no error should occur when the database is reopened.
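To make the expected behavior concrete, here is a minimal sketch of how a leftover temporary file could be handled at startup. It is not WiredTiger code: the function name recover_temp_file and the rule "an empty temporary file means the crash happened between open and write" are our assumptions for illustration, and a real fix would also need to validate the contents of a non-empty file.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of WiredTiger): decide what to do with a
 * leftover temporary file such as "WiredTiger.turtle.set" found at startup.
 * Returns 0 on success and -1 on error.
 */
static int
recover_temp_file(const char *tmp_path, const char *final_path)
{
    struct stat sb;

    /* No leftover temporary file: nothing to recover. */
    if (stat(tmp_path, &sb) != 0)
        return (errno == ENOENT ? 0 : -1);

    /*
     * Assumption: a zero-length file was opened but never written, so it
     * carries no checkpoint information and can be discarded.
     */
    if (sb.st_size == 0)
        return (unlink(tmp_path));

    /* The final file already exists: keep it and drop the temporary copy. */
    if (stat(final_path, &sb) == 0)
        return (unlink(tmp_path));
    if (errno != ENOENT)
        return (-1);

    /* Only the (non-empty) temporary file exists: promote it. */
    return (rename(tmp_path, final_path));
}
```

With this sketch, calling recover_temp_file("WiredTiger.turtle.set", "WiredTiger.turtle") on the state produced by the crash described above would remove the empty file instead of renaming it, so the later read of WiredTiger.turtle would not see an empty file.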
Steps to reproduce
- Git clone our forked version of WiredTiger, where we developed a customized workload: https://github.com/efeslab/wiredtiger/tree/jiexiao
- Use GDB to run the test file test/csuite/random_abort; we modified the main.c file in random_abort to reproduce the bug
- Set a breakpoint at the open function
(gdb) break os_fhandle.c:281
(gdb) r -h /home/cc/test_suite/wiredtiger/build/test/csuite/random_abort -s workload -T 1 -o 5000
- Run the program and continue until WiredTiger.turtle.set file is opened
Breakpoint 1, __wt_open (session=session@entry=0x7ffff78ce010, name=name@entry=0x7ffff7e45827 "WiredTiger.turtle.set", file_type=file_type@entry=WT_FS_OPEN_FILE_TYPE_REGULAR, flags=flags@entry=36, fhp=fhp@entry=0x7fffffffa888) at /home/cc/test_suite/wiredtiger/src/os_common/os_fhandle.c:281
281 WT_ERR(file_system->fs_open_file(
- Quit the program after open() and open the database again
- In our modified random_abort/main.c file, we call ./test_random_abort -h /home/cc/test_suite/wiredtiger/build/test/csuite/random_abort -T 1 -s checker
- The program exits with an error status containing the following error message:
[1731105005:146307][918326:0x7f13a28a0000], wiredtiger_open: [WT_VERB_METADATA][NOTICE]: WiredTiger.turtle not found, WiredTiger.turtle.set renamed to WiredTiger.turtle
[1731105005:147775][918326:0x7f13a28a0000], connection: [WT_VERB_DEFAULT][ERROR]: __wti_turtle_read, 672: WiredTiger.turtle: fatal turtle file read error: WT_TRY_SALVAGE: database corruption detected
[1731105005:147785][918326:0x7f13a28a0000], connection: [WT_VERB_DEFAULT][ERROR]: __wti_turtle_read, 672: the process must exit and restart: WT_PANIC: WiredTiger library panic
[1731105005:147789][918326:0x7f13a28a0000], connection: [WT_VERB_DEFAULT][ERROR]: __wt_abort, 28: aborting WiredTiger library
Aborted (core dumped)
Linux Distribution
Ubuntu 22.04.4 LTS
File Systems
ext4
Other System Details
WiredTiger 11.3.0