- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 3.0.0-rc8
- Component/s: WiredTiger
- None
- Fully Compatible
- ALL
- 0
If a crash occurs after a table is created but before it has been checkpointed, irrecoverable data loss may occur. After reboot, the file may be 4 KB in length yet contain invalid data (for example, all 0s):
$ hexdump collection-2-1933547346719198530.wt
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000
However, the log may still contain references to that file, which causes the following irrecoverable error during recovery:
2015-02-06T13:53:10.673-0500 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
2015-02-06T13:53:10.673-0500 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=5G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2015-02-06T13:53:12.063-0500 E STORAGE [initandlisten] WiredTiger (-31802) [1423248792:63060][1543:0x7f317f08bbc0], file:collection-2-1933547346719198530.wt: collection-2-1933547346719198530.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2015-02-06T13:53:12.063-0500 E STORAGE [initandlisten] WiredTiger (-31802) [1423248792:63221][1543:0x7f317f08bbc0], file:collection-2-1933547346719198530.wt: Operation failed during recovery: WT_ERROR: non-specific WiredTiger error
2015-02-06T13:53:12.077-0500 I - [initandlisten] Assertion: 28595:-31802: WT_ERROR: non-specific WiredTiger error
2015-02-06T13:53:12.080-0500 I STORAGE [initandlisten] exception in initAndListen: 28595 -31802: WT_ERROR: non-specific WiredTiger error, terminating
strace shows the apparent reason: when creating the file we write the 4 KB block but do not fsync it before closing:
1560 open("db/collection-2-2119903794654001319.wt", O_RDWR|O_CREAT|O_EXCL|O_NOATIME, 0666) = 21
1560 pwrite(21, "A\330\1\0\1\0\0\0\330\10#\267\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 0) = 4096
1560 close(21) = 0
This can result in the problematic state if a crash occurs after log entries for that file are written but before the file data is flushed to disk.
I believe that fdatasync'ing the file after this first write and before any journal entries referencing the file are written should fix this issue.
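The proposed ordering can be sketched as follows. This is only an illustration of the open / pwrite / fdatasync / close sequence, not WiredTiger's actual code: `create_file_durably` and `BLOCK_SIZE` are hypothetical names, the block contents are a placeholder, and the O_NOATIME flag from the strace output is left out for portability. It assumes a Linux system where `os.fdatasync` is available.

```python
import os

BLOCK_SIZE = 4096  # size of the initial block seen in the strace output


def create_file_durably(path: str, first_block: bytes) -> None:
    """Create a new file, write its first block, and flush it before returning.

    Hypothetical helper: the caller is assumed to write journal entries
    that reference `path` only after this function has returned.
    """
    # O_EXCL makes creation fail if the file already exists, as in the strace.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o666)
    try:
        written = os.pwrite(fd, first_block, 0)
        if written != len(first_block):
            raise OSError(f"short write: {written} of {len(first_block)} bytes")
        # The step missing from the strace: force the file data to stable
        # storage before the descriptor is closed.
        os.fdatasync(fd)
    finally:
        os.close(fd)
```

With this ordering, a crash after the function returns should no longer leave a 4 KB file of zeros that the journal already refers to, provided the underlying storage honors the flush.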
is related to
- SERVER-17152 WiredTiger file corrupted during power cycle test (Closed)
related to
- SERVER-17451 WiredTiger unable to start if crash leaves 0-length journal file (Closed)