-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: 3.0.0-rc10
-
Component/s: WiredTiger
-
Fully Compatible
-
ALL
WiredTiger's compact code will return an 'Operation timed out' error if the compact operation exceeds 'max_time'. This timeout can seemingly occur under two conditions:
1 - The WT compact operation takes longer than 1200 seconds (20 minutes)
2 - The system clock goes backward during the compact operation
When the timeout occurs, the error propagates up and leads to an invariant failure:
Breakpoint 1, __session_compact_check_timeout (session=0x35b4c40, begin=...) at src/third_party/wiredtiger/src/session/session_compact.c:138 (gdb) l 133 return (0); 134 135 WT_RET(__wt_epoch(session, &end)); 136 if (session->compact->max_time < 137 WT_TIMEDIFF(end, begin) / WT_BILLION) 138 WT_RET(ETIMEDOUT); 139 return (0); 140 } 141 142 /* ==> WT_TIMEDIFF(end, begin) / WT_BILLION) calculation when the clock goes backward (gdb) p (1000000000 * (uint64_t)((end).tv_sec - (begin).tv_sec) + (uint64_t)(end).tv_nsec - (uint64_t)(begin).tv_nsec) / 1000000000 $4 = 18446744010 (gdb) p session->compact->max_time $5 = 1200 (gdb) c Invariant failure: ret resulted in status UnknownError 110: Connection timed out at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 730 (gdb) bt #0 0x00007ffff7bcc20b in raise (sig=5) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37 #1 0x00000000018f7874 in mongo::breakpoint () at src/mongo/util/debugger.cpp:63 #2 0x00000000018ed771 in mongo::invariantOKFailed (expr=0x2062531 "ret", status=..., file=0x2062420 "src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp", line=730) at src/mongo/util/assert_util.cpp:157 #3 0x000000000176f0f4 in mongo::WiredTigerRecordStore::compact (this=0x4a2cd00, txn=0x7ffff10f47d0, adaptor=0x0, options=0x7ffff10f2e40, stats=0x7ffff10f2ca0) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:730 #4 0x0000000001282bc6 in mongo::Collection::compact (this=0x5177800, txn=0x7ffff10f47d0, compactOptions=0x7ffff10f2e40) at src/mongo/db/catalog/collection_compact.cpp:118 #5 0x00000000012c916c in mongo::CompactCmd::run (this=0x28c0d40 <mongo::compactCmd>, txn=0x7ffff10f47d0, db=..., cmdObj=..., errmsg=..., result=..., fromRepl=false) at src/mongo/db/commands/compact.cpp:168 #6 0x000000000135b619 in mongo::_execCommand (txn=0x7ffff10f47d0, c=0x28c0d40 <mongo::compactCmd>, dbname=..., cmdObj=..., queryOptions=0, errmsg=..., result=..., fromRepl=false) at src/mongo/db/dbcommands.cpp:1316 #7 0x000000000135c5e7 in mongo::Command::execCommand (txn=0x7ffff10f47d0, c=0x28c0d40 <mongo::compactCmd>, queryOptions=0, cmdns=0x3449414 "db1.$cmd", cmdObj=..., result=..., fromRepl=false) at src/mongo/db/dbcommands.cpp:1532 #8 0x000000000135ced8 in mongo::_runCommands (txn=0x7ffff10f47d0, ns=0x3449414 "db1.$cmd", _cmdobj=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/dbcommands.cpp:1604 #9 0x0000000001563e58 in mongo::runCommands (txn=0x7ffff10f47d0, ns=0x3449414 "db1.$cmd", jsobj=..., curop=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/query/find.cpp:137 #10 0x0000000001565e8a in mongo::runQuery (txn=0x7ffff10f47d0, m=..., q=..., nss=..., curop=..., result=..., fromDBDirectClient=false) at src/mongo/db/query/find.cpp:606 #11 0x000000000146a7d0 in mongo::receivedQuery (txn=0x7ffff10f47d0, c=..., dbresponse=..., m=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:220 #12 0x000000000146b97d in mongo::assembleResponse (txn=0x7ffff10f47d0, m=..., dbresponse=..., remote=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:403 #13 0x000000000115d604 in mongo::MyMessageHandler::process (this=0x310a400, m=..., port=0x344f3b0, le=0x50668a0) at src/mongo/db/db.cpp:206 #14 0x0000000001915cb6 in mongo::PortMessageServer::handleIncomingMsg (arg=0x344f3b0) at src/mongo/util/net/message_server_port.cpp:229 #15 0x00007ffff7bc4182 in start_thread (arg=0x7ffff10f5700) at pthread_create.c:312 #16 0x00007ffff6cc500d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111