I think the "sh20014" here is supposed to be "s20014". The log may also be incomplete: we see a lock write being replicated on the config server secondaries, but not the original write to the config server primary.
Example from the logs of "sh20014" performing the same checks that "s20014" previously got stuck on:
[js_test:multi_coll_drop] 2016-04-06T02:54:18.252-0500 sh20014| 2016-04-06T02:53:59.679-0500 D ASIO [conn1] startCommand: RemoteCommand 1014 -- target:mongovm16:20013 db:config expDate:2016-04-06T02:54:29.679-0500 cmd:{ find: "locks", filter: { _id: "multidrop.coll" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1459929237000|1, t: 8 } }, limit: 1, maxTimeMS: 30000 }
[js_test:multi_coll_drop] 2016-04-06T02:54:18.255-0500 sh20014| 2016-04-06T02:53:59.679-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Starting asynchronous command 1014 on host mongovm16:20013
[js_test:multi_coll_drop] 2016-04-06T02:54:18.263-0500 sh20014| 2016-04-06T02:53:59.680-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Request 1014 finished with response: { waitedMS: 0, cursor: { firstBatch: [ { _id: "multidrop.coll", state: 2, ts: ObjectId('5704c06465c17830b843f1cb'), who: "mongovm16:20010:1459929128:185613966:conn5", process: "mongovm16:20010:1459929128:185613966", when: new Date(1459929188727), why: "splitting chunk [{ _id: -62.0 }, { _id: MaxKey }) in multidrop.coll" } ], id: 0, ns: "config.locks" }, ok: 1.0 }
[js_test:multi_coll_drop] 2016-04-06T02:54:18.269-0500 sh20014| 2016-04-06T02:53:59.680-0500 D ASIO [conn1] startCommand: RemoteCommand 1016 -- target:mongovm16:20013 db:config expDate:2016-04-06T02:54:29.680-0500 cmd:{ find: "lockpings", filter: { _id: "mongovm16:20010:1459929128:185613966" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1459929237000|1, t: 8 } }, limit: 1, maxTimeMS: 30000 }
[js_test:multi_coll_drop] 2016-04-06T02:54:18.270-0500 sh20014| 2016-04-06T02:53:59.681-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Starting asynchronous command 1016 on host mongovm16:20013
[js_test:multi_coll_drop] 2016-04-06T02:54:18.275-0500 sh20014| 2016-04-06T02:53:59.681-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Request 1016 finished with response: { waitedMS: 0, cursor: { firstBatch: [ { _id: "mongovm16:20010:1459929128:185613966", ping: new Date(1459929228990) } ], id: 0, ns: "config.lockpings" }, ok: 1.0 }
[js_test:multi_coll_drop] 2016-04-06T02:54:18.279-0500 sh20014| 2016-04-06T02:53:59.681-0500 D ASIO [conn1] startCommand: RemoteCommand 1018 -- target:mongovm16:20013 db:admin expDate:2016-04-06T02:54:29.681-0500 cmd:{ serverStatus: 1, maxTimeMS: 30000 }
[js_test:multi_coll_drop] 2016-04-06T02:54:18.292-0500 sh20014| 2016-04-06T02:53:59.682-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Starting asynchronous command 1018 on host mongovm16:20013
Logs are attached. They come from the new PPC machine, in case that makes a difference.
Evergreen links:
sharding_csrs_continuous_config_stepdown_WT failed on enterprise-rhel-71-ppc64le
Related to SERVER-22992: wait_for_pid() function in shell_utils_launcher.cpp doesn't wait for program output to finish being consumed (Closed)