Type: Bug
Resolution: Done
Priority: Critical - P2
Fix Version/s: 2.6.0-rc1
Affects Version/s: 2.5.5
Component/s: Write Ops
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Confidence Status:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

Currently, write commands perform each of their write operations in a nested curop. A nested curop occludes its parent in the currentOp list, so users see only the write op that is currently executing, never the parent write command. Killing that op flags only that one op with an error; the write command treats it like any other write error and continues applying the rest of its ops.

There are two problems to be solved here.

First, the write command driver needs to specially trap the "interrupted" exceptions thrown from an individual write op and pass them up to the command handler, rather than recording them as ordinary write errors. In addition, when a write op completes, the driver needs to call checkForInterrupt before it unwraps the nested curop, in case the killed flag was set but the write op never checked it. Otherwise, the killed flag would be discarded along with the nested curop and the kill would go unnoticed.
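The first fix can be illustrated with a minimal sketch. This is not the server's code: `CurOp`, `Interrupted`, `check_for_interrupt`, and `run_write_command` are hypothetical Python stand-ins chosen for illustration, and the model assumes a killed op is represented by a simple boolean flag on its nested curop.

```python
class Interrupted(Exception):
    """Hypothetical stand-in for the server's "interrupted" exception."""


class CurOp:
    """Hypothetical, simplified operation record: an id, a kill flag, a parent."""

    _next_id = 1

    def __init__(self, parent=None):
        self.op_id = CurOp._next_id
        CurOp._next_id += 1
        self.parent = parent
        self.killed = False

    def check_for_interrupt(self):
        # Raise if a user has killed this op.
        if self.killed:
            raise Interrupted(f"operation {self.op_id} was killed")


def run_write_command(write_ops, command_op):
    """Apply write ops in order and return (results, errors).

    Ordinary write errors are recorded per op and the batch continues.
    Interrupted is never recorded: it propagates to the caller, which
    plays the role of the command handler.
    """
    results = []
    errors = []
    for index, write_op in enumerate(write_ops):
        nested = CurOp(parent=command_op)
        try:
            results.append(write_op(nested))
            # The op may have finished without ever looking at the flag,
            # so check it before the nested curop is discarded.
            nested.check_for_interrupt()
        except Interrupted:
            raise  # abort the whole write command
        except Exception as exc:
            errors.append((index, str(exc)))
    return results, errors
```

In this sketch, a write op that raises an ordinary exception only adds an entry to `errors`, while a write op whose nested curop is killed stops the remaining ops from running, even if the op itself never checked the flag.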

Second, as the write command proceeds through its write ops, the currentOp display cycles through op ids quickly, which makes it very difficult for a user to kill the operation. I propose modifying the currentOp info() function so that it traces from a nested op up to its parent and builds a tree that displays the parent op, with its op id, alongside any nested ops. A user could then identify the parent op id and pass it to killOp to abort a long-running write command.
Unfortunately, this changes the currentOp display in a way that may break scripts that look for certain fields to identify specific operations.
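As a rough sketch of the proposed display change, the snippet below walks from a nested op up to its root parent and reports the parent as the outer document. The `OpRecord` type, the `info` helper, and the `nested` field name are assumptions made for illustration only; the issue does not specify the actual output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpRecord:
    """Hypothetical op record: an id, a description, and an optional parent."""

    op_id: int
    desc: str
    parent: Optional["OpRecord"] = None


def info(op):
    """Describe an op as a tree rooted at its outermost parent."""
    # Collect the chain innermost-first by following parent links.
    chain = []
    node = op
    while node is not None:
        chain.append(node)
        node = node.parent

    # Wrap each level inside its parent, so the root ends up outermost.
    doc = None
    for node in chain:
        entry = {"opid": node.op_id, "desc": node.desc}
        if doc is not None:
            entry["nested"] = doc  # "nested" is an assumed field name
        doc = entry
    return doc


command = OpRecord(100, "insert command")
write = OpRecord(101, "insert document", parent=command)
tree = info(write)  # the parent's op id (100) is now visible at the top level
```

With a display shaped like this, the stable op id of the write command sits at the top level while the short-lived nested op ids change underneath it, which is also why scripts that expect the old flat layout could break.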

is related to: SERVER-12762 Batch inserts should be counted and profiled individually (Closed)

related to: SERVER-11432 Audit usage of checkForInterrupt(false) (Closed)

Assignee: Eric Milkie

Reporter: Eric Milkie

Participants: Eric Milkie, Githook User

Votes: 0

Watchers: 8

Created: Feb 04 2014 08:30:21 PM UTC

Updated: Jul 11 2016 05:19:26 PM UTC

Resolved: Mar 04 2014 01:57:53 PM UTC
