This was discovered while running the sharding suite with the continuous primary stepdown thread enabled. The applyOps command uses DBDirectClient. As a result, if a stepdown happens just as the operation is about to start and the thread is interrupted, DBDirectClient ends up returning error 13106 instead of the interruption error (11601).
Here are some excerpts from the verbose logs:
[js_test:balance_repl] 2015-12-18T18:02:13.249+0000 c20514| 2015-12-18T18:02:12.936+0000 D - [conn32] User Assertion: 11601:operation was interrupted
[js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY [conn32] assertion 11601 operation was interrupted ns:config.chunks query:{ query: { ns: "test.foo" }, orderby: { lastmod: -1 } }
[js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY [conn32] ntoskip:0 ntoreturn:1
[js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY [conn32] query config.chunks query: { query: { ns: "test.foo" }, orderby: { lastmod: -1 } } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 exception: operation was interrupted code:11601 numYields:0 reslen:71 locks:{ Global: { acquireCount: { r: 3, W: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms
[js_test:balance_repl] 2015-12-18T18:02:13.251+0000 c20514| 2015-12-18T18:02:12.936+0000 D - [conn32] User Assertion: 13106:nextSafe(): { $err: "operation was interrupted", code: 11601 }
[js_test:balance_repl] 2015-12-18T18:02:13.252+0000 c20514| 2015-12-18T18:02:12.936+0000 D COMMAND [conn32] assertion while executing command 'applyOps' on database 'config' with arguments '{ applyOps: [ { op: "u", b: true, ns: "config.chunks", o: { _id: "test.foo-_id_600.0", lastmod: Timestamp 1000|15, lastmodEpoch: ObjectId('56744a23fc2e02a76c6d8248'), ns: "test.foo", min: { _id: 600.0 }, max: { _id: 700.0 }, shard: "test-rs0" }, o2: { _id: "test.foo-_id_600.0" } }, { op: "u", b: true, ns: "config.chunks", o: { _id: "test.foo-_id_700.0", lastmod: Timestamp 1000|16, lastmodEpoch: ObjectId('56744a23fc2e02a76c6d8248'), ns: "test.foo", min: { _id: 700.0 }, max: { _id: MaxKey }, shard: "test-rs0" }, o2: { _id: "test.foo-_id_700.0" } } ], preCondition: [ { ns: "config.chunks", q: { query: { ns: "test.foo" }, orderby: { lastmod: -1 } }, res: { lastmod: Timestamp 1000|14 } } ], maxTimeMS: 30000 }' and metadata '{ $replData: 1 }': 13106 nextSafe(): { $err: "operation was interrupted", code: 11601 }
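The masking visible in the log lines above can be illustrated with a small model. This is only a sketch in Python, not MongoDB source code: `UserAssertion`, `run_query_interrupted`, `next_safe`, and `embedded_code` are hypothetical names made up for illustration. It assumes, based on the log messages alone, that the interrupted inner query surfaces as an error document and that a nextSafe()-style check turns that document into a generic 13106 assertion.

```python
# Simplified model of the error masking seen in the logs; NOT MongoDB source code.
# All class and function names here are hypothetical.
import re

INTERRUPTED = 11601       # "operation was interrupted" (code taken from the log)
NEXT_SAFE_FAILED = 13106  # "nextSafe(): ..." (code taken from the log)


class UserAssertion(Exception):
    """Hypothetical stand-in for a user assertion carrying a numeric code."""

    def __init__(self, code, reason):
        super().__init__("%d %s" % (code, reason))
        self.code = code
        self.reason = reason


def run_query_interrupted():
    """Model the inner query: the interruption comes back as an error document."""
    return {"$err": "operation was interrupted", "code": INTERRUPTED}


def next_safe(doc):
    """Model a nextSafe()-style check: any $err document becomes a generic 13106."""
    if "$err" in doc:
        reason = 'nextSafe(): { $err: "%s", code: %d }' % (doc["$err"], doc["code"])
        raise UserAssertion(NEXT_SAFE_FAILED, reason)
    return doc


def embedded_code(exc):
    """Recover the inner code from the message text, if present (illustrative only)."""
    match = re.search(r"code: (\d+)", exc.reason)
    return int(match.group(1)) if match else None


try:
    next_safe(run_query_interrupted())
except UserAssertion as exc:
    caught = exc

print(caught.code)            # outer code, which is all the applyOps caller sees
print(embedded_code(caught))  # original interruption code, buried in the message
```

In this model the caller only sees the outer code, so logic that checks for an interruption code would not recognize the failure. The original code survives only inside the message string, which matches the shape of the final log line.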
Putting this ticket in the sharding bucket, because sharding is the main consumer of applyOps.
- is depended on by: SERVER-21050 Add a failover workload to cause CSRS config server primary failovers (Closed)