Type: Bug
Resolution: Gone away
Priority: Major - P3
Affects Version/s: None
Component/s: None
One of the MongoDB corruption tests, one of the new salvage/repair tests, runs under ASAN and reports a memory leak. The log contains messages like:
[js_test:wt_repair_corrupt_metadata] 2018-11-06T14:35:00.699+0000 d20021| 2018-11-06T14:35:00.699+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=1024M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
[1541514901:406868][58538:0x7f6559b76a80], txn-recover: __wt_txn_recover, 740: Recovery failed: WT_NOTFOUND: item not found
[1541514901:407502][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 384: cache server: exiting with 1 pages in memory and 0 pages evicted
[1541514901:407575][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 389: cache server: exiting with 51 image bytes in memory
[1541514901:407621][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 393: cache server: exiting with 315 bytes in memory
[1541514901:423924][58538:0x7f6559b76a80], txn-recover: __wt_txn_recover, 740: Recovery failed: WT_NOTFOUND: item not found
[1541514901:424643][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 384: cache server: exiting with 1 pages in memory and 0 pages evicted
[1541514901:424699][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 389: cache server: exiting with 51 image bytes in memory
[1541514901:424735][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 393: cache server: exiting with 315 bytes in memory
[1541514901:441336][58538:0x7f6559b76a80], txn-recover: __wt_txn_recover, 740: Recovery failed: WT_NOTFOUND: item not found
[1541514901:441995][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 384: cache server: exiting with 1 pages in memory and 0 pages evicted
[1541514901:442054][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 389: cache server: exiting with 51 image bytes in memory
[1541514901:442090][58538:0x7f6559b76a80], connection: __wt_cache_destroy, 393: cache server: exiting with 315 bytes in memory
Failed to start up WiredTiger under any compatibility version.
Reason: -31804: WT_PANIC: WiredTiger library panic
Attempting to salvage WiredTiger metadata
Although the open with salvage succeeds and the system is correctly repaired and able to run, the initial error and the cache destroy messages indicate a memory leak, which ASAN then reports.
The cause of the error is that the WiredTiger.turtle file has an invalid checkpoint_lsn=(1,2). When we call __wt_log_scan to recover the metadata file on the first pass of recovery, it detects the bad LSN and returns WT_NOTFOUND.
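To make the failure mode concrete, here is a minimal sketch of how a scan that starts from a recorded checkpoint LSN can end in WT_NOTFOUND. It is not WiredTiger's code: the `toy_lsn` struct, the helpers `toy_parse_checkpoint_lsn` and `toy_log_scan`, and the model of the log as an array of known record start positions are all assumptions made for illustration. Only the `checkpoint_lsn=(1,2)` value and the numeric value of WT_NOTFOUND (-31803) are taken from WiredTiger.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same numeric value as WT_NOTFOUND in wiredtiger.h. */
#define TOY_WT_NOTFOUND (-31803)

/* Hypothetical stand-in for a log sequence number: (file, offset). */
struct toy_lsn {
    unsigned file;
    unsigned offset;
};

/*
 * Parse "checkpoint_lsn=(F,O)" out of a turtle-style config string.
 * Returns 0 on success, -1 if the key is missing or malformed.
 */
int
toy_parse_checkpoint_lsn(const char *config, struct toy_lsn *lsnp)
{
    const char *p;

    if ((p = strstr(config, "checkpoint_lsn=(")) == NULL)
        return (-1);
    if (sscanf(p, "checkpoint_lsn=(%u,%u)", &lsnp->file, &lsnp->offset) != 2)
        return (-1);
    return (0);
}

/*
 * Toy log scan: succeed only if the starting LSN is exactly the start of
 * a known record; otherwise report "not found", as the ticket describes.
 */
int
toy_log_scan(const struct toy_lsn *records, size_t nrecords, const struct toy_lsn *start)
{
    size_t i;

    for (i = 0; i < nrecords; i++)
        if (records[i].file == start->file && records[i].offset == start->offset)
            return (0);
    return (TOY_WT_NOTFOUND);
}
```

With hypothetical records at (1,0), (1,128) and (1,256), a start LSN of (1,2) matches no record, so the toy scan returns the not-found code rather than a panic-class error.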
There are a number of things to do here:
1. Add a bad-lsn case to the test/csuite/wt4156_metadata_salvage test.
2. Find and fix the leak. This is not a panic error. We haven't actually recovered anything yet, so it isn't clear where this memory is being used. It may be the metadata cursor in use: it appears the error paths don't close it.
3. Consider whether __wt_txn_recover:636, where we check for an error return of ENOENT to set WT_CONN_DATA_CORRUPTION, should also check for WT_NOTFOUND to cover this case.
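Items 2 and 3 can be sketched together. The following is a small self-contained model, not the WiredTiger source: `toy_conn`, `toy_cursor`, `toy_recover` and `TOY_CONN_DATA_CORRUPTION` are hypothetical stand-ins for the connection, the metadata cursor, `__wt_txn_recover` and `WT_CONN_DATA_CORRUPTION`. It also assumes the leak really is an unclosed metadata cursor, which the ticket only suspects.

```c
#include <errno.h>
#include <stdlib.h>

/* Same numeric value as WT_NOTFOUND in wiredtiger.h. */
#define TOY_WT_NOTFOUND (-31803)
/* Hypothetical flag bit standing in for WT_CONN_DATA_CORRUPTION. */
#define TOY_CONN_DATA_CORRUPTION 0x1u

struct toy_cursor {
    char *buf; /* heap memory that leaks if the cursor is never closed */
};

struct toy_conn {
    unsigned flags;
    int open_cursors; /* cursors opened but not yet closed */
};

struct toy_cursor *
toy_cursor_open(struct toy_conn *conn)
{
    struct toy_cursor *c;

    if ((c = calloc(1, sizeof(*c))) == NULL)
        return (NULL);
    if ((c->buf = malloc(64)) == NULL) {
        free(c);
        return (NULL);
    }
    conn->open_cursors++;
    return (c);
}

void
toy_cursor_close(struct toy_conn *conn, struct toy_cursor *c)
{
    if (c == NULL)
        return;
    free(c->buf);
    free(c);
    conn->open_cursors--;
}

/*
 * Toy recovery: scan_ret models the result of scanning the log from the
 * checkpoint LSN. Every exit after the cursor is opened goes through the
 * err label, so the cursor is closed on failure as well as on success.
 */
int
toy_recover(struct toy_conn *conn, int scan_ret)
{
    struct toy_cursor *metac;
    int ret;

    if ((metac = toy_cursor_open(conn)) == NULL)
        return (ENOMEM);

    ret = scan_ret;
    if (ret != 0) {
        /* Treat a missing log file and a bad LSN alike as corruption. */
        if (ret == ENOENT || ret == TOY_WT_NOTFOUND)
            conn->flags |= TOY_CONN_DATA_CORRUPTION;
        goto err;
    }
    /* A real implementation would replay log records here. */

err:
    toy_cursor_close(conn, metac);
    return (ret);
}
```

Routing every exit through a single cleanup label means a new early return cannot skip the close, and the corruption check sits at the one place where the scan result is inspected.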
related to:
- SERVER-57147 Remove outdated TODO comment referring to WT-4459 (Closed)
- WT-6694 Memory leak issues when closing with PANIC set (Closed)