- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Sharding
- Cluster Scalability
- Fully Compatible
- ALL
- 0
- 2
kill_pinned_cursor.js targets its getMores fairly loosely in two places. If another getMore is running on the system at the same time (e.g. from the periodic sharded index consistency checker), the $currentOp can return more than one matching getMore, which makes the test fail at either of two points.
Similarly, if some other internal getMore with pinned cursors is running on the system when the parallel shell is started, the sleep that waits for the parallel shell to start up can short-circuit. Combined with the loose targeting above, this can cause the killFunc to kill the internal getMore instead of the test getMore. The test getMore is then never interrupted, and because it is waiting on a failpoint that is only switched off after the getMore returns, the test deadlocks and times out.
Based on this, kill_pinned_cursor.js should be updated to use $currentOp more precisely, so that it finds only the test getMore. For example, the original find could include a dummy query predicate that increments for each test, such as "foo1": {$exists: false}, and the $currentOp could then look only for getMores with that query predicate in the originating command.
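A minimal sketch of that suggestion follows. It is not taken from the actual test. The helper names (nextMarkerField, makeFindFilter, makeCurrentOpPipeline) are hypothetical, and it assumes that "increments for each test" means the predicate's field name changes (foo1, foo2, ...). It also assumes that $currentOp reports the originating find of a getMore under cursor.originatingCommand, which may differ between server versions. The sketch only builds the filter and pipeline as plain objects, so it can be checked without a running cluster.

```javascript
// Hypothetical helpers; the names are illustrative and not from the real test.
let testCounter = 0;

// Returns a marker field name that is unique per test case: "foo1", "foo2", ...
function nextMarkerField() {
    testCounter += 1;
    return "foo" + testCounter;
}

// Filter for the original find. The marker field is assumed never to exist in
// the test documents, so {$exists: false} does not change the result set.
function makeFindFilter(markerField) {
    return {[markerField]: {$exists: false}};
}

// Pipeline that matches only getMores whose originating find carried the
// marker predicate. The "cursor.originatingCommand" path is an assumption
// about the $currentOp output shape.
function makeCurrentOpPipeline(markerField) {
    return [
        {$currentOp: {}},
        {
            $match: {
                "command.getMore": {$exists: true},
                ["cursor.originatingCommand.filter." + markerField]: {$exists: true},
            },
        },
    ];
}
```

In the shell, the test might then run the find with makeFindFilter(field) and pass makeCurrentOpPipeline(field) to an aggregate on the admin database. An internal getMore, such as one from the index consistency checker, would not carry the marker field in its originating command and so should not match.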
- is related to: SERVER-48502 Tighten $currentOp and pinned cursor checks in kill_pinned_cursor.js (Closed)