Type: Bug
Resolution: Done
Priority: Critical - P2
None
Affects Version/s: 2.4.10
Component/s: MapReduce
ALL
Because of a race condition in my code, I can delete a set of docs over which I am in the middle of running a map/reduce job (precise details below; I don't believe they are relevant). The map/reduce query is indexed.
My expectation is that the map/reduce job would just "ignore" records that were deleted.
Instead, something odder seems to happen: the jobs run far longer than they should, and we see performance degradation.
For example, here's a db.currentOp:
{ "opid" : "replica_set1:406504931", "active" : true, "secs_running" : 560, "op" : "query", "ns" : "doc_metadata.metadata", "query" : { "$msg" : "query not recording (too large)" }, "client_s" : "10.10.90.42:41453", "desc" : "conn881377", "threadId" : "0x6790e940", "connectionId" : 881377, "locks" : { "^doc_metadata" : "R" }, "waitingForLock" : true, "msg" : "m/r: (1/3) emit phase M/R: (1/3) Emit Progress: 2899/1 289900%", "progress" : { "done" : 2899, "total" : 1 },
Check out the Emit Progress: "done" is 2899 against a "total" of 1, i.e. 289900%.
There were a few of these; they all ran for 20 minutes or so (the number of docs being deleted was small, in the few-thousand range) before eventually cleaning themselves up.
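For reference, this is roughly how I have been watching those stuck jobs. It is a minimal mongo shell sketch, nothing clever: it just scans db.currentOp() for operations whose msg starts with the "m/r" phase string shown above.

// Print every in-flight map/reduce operation and its emit-phase progress.
db.currentOp().inprog.forEach(function (op) {
    if (op.msg && op.msg.indexOf("m/r") === 0) {
        print(op.opid + "  " + op.msg +
              "  done=" + op.progress.done + " total=" + op.progress.total);
    }
});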
Bonus worry: I have a similar case in which I run a map/reduce over the entire collection (several tens of millions of documents), to which documents are continually being added and removed. Should I worry, or is this an edge case that only shows up when a high percentage of the query set is removed?
(Details; a rough shell sketch of this interleaving follows below:
Thread 1: 1a) update a bunch of docs to have field:DELETE_ME
Thread 1: 2a) run a map/reduce job to count some of their attributes prior to deletion
Thread 2: 1b) update a bunch more docs to have field:DELETE_ME
Thread 2: 2b) run a map/reduce job to count some of their attributes prior to deletion
Thread 1: 3a) Remove all docs with field:DELETE_ME
Thread 2: 3b) Remove all docs with field:DELETE_ME
)
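Concretely, the interleaving looks roughly like the shell sketch below. The namespace (doc_metadata.metadata) comes from the currentOp output above; the batch selector and the attr field are placeholders, not the real schema.

// Hypothetical reproduction sketch; names other than "field" are made up.
var coll = db.getSiblingDB("doc_metadata").metadata;

// 1a / 1b) mark a batch of docs for deletion (multi-update)
coll.update({ batch: 42 }, { $set: { field: "DELETE_ME" } }, false, true);

// 2a / 2b) map/reduce over the marked docs to count an attribute before deleting them
coll.mapReduce(
    function () { emit(this.attr, 1); },                    // map: one emit per doc
    function (key, values) { return Array.sum(values); },   // reduce: sum the counts
    { query: { field: "DELETE_ME" }, out: { inline: 1 } }
);

// 3a / 3b) the other thread removes the same marked docs while the emit phase is still running
coll.remove({ field: "DELETE_ME" });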