-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.6.5, 2.6.7, 3.0.0-rc7
-
Component/s: Concurrency, Write Ops
-
None
-
Fully Compatible
-
ALL
-
-
(copied to CRM)
With a cold start start certain 'update' commands fail when they are expected to succeed.
Scenario can be triggered by cold-starting mongod and having multiple threads/processes update the same document.
Have reproduced this on both OSX and Ubuntu using the attached scripts:
- reset.sh : starts an empty mongod and inserts 10 documents containing just an _id field
- run-js.sh: cold starts mongod and sets log level 1 (via prep.sh script). Then it starts 10 subprocesses, each one updating the same document via $addToSet.
The run-js.sh script may need to be run a few times. Each time it targets a single document and tries to blast it with multiple concurrent updates. It will cycle through all 10 documents at a rate of one a second. The fact that it typically doesn't occur during the first run might indicate the problem is exposed more easily when dealing with larger objects.
Under certain circumstances some of these updates fail; the update however should have succeeded and can be manually applied later on.
The common characteristic seems to be:
- nMatched:0 and nModified:0
- numYields:1
The nscanned metrics in each case show the document was found. However, the operation yields (numYields:1) after which the result is nMatched:0 and nModified:0, ie the update has not been applied.
These particular operations also seem to hold the 'w' lock for a lot less time than those which succeed. This is probably to be expected given the behavior but is another key indicator of the problem.
I thought this might have been due to document moves, ie this document is found but the operation yields before the update is applied; during this time another update comes in and successfully updates the document, moving it in the process, such that when the yielded operation resumes it cannot find the document. This does not appear to be borne out by the available evidence, i.e., I have seen many circumstances where there are no document moves taking place and yet the problem persists.
Note too: it is not sufficient for the update operation to yield - there are occasions where the operation does yield and the update still completes successfully.
I have also tried to force the document into memory before the update by issuing a 'findOne()' and then the 'update()' but I still see the problem.
FYI I also have a slightly more complex repro written in Java which can switch between the standard Update command, the Bulk Update API and 'findAndModify()' - both Update methods suffer this problem but 'findAndModify' does not appear to. Interestingly the 'findAndModify()' does not yield - so perhaps that's a clue. (I can upload this Java version too if required).
Log file snippet showing problem:
2015-02-04T12:01:22.971+0000 [conn9] update foo.bar query: { _id: 1, c.sb.name: { $ne: "Update::1::18::a158ac61-35d9-45c8-9412-f2733f41585c" } } update: { $addToSet: { c.sb: { name: "Update::1::18::a158ac61-35d9-45c8-9412-f2733f41585c" } } } nscanned:1 nscannedObjects:1 nMatched:0 nModified:0 keyUpdates:0 numYields:1 locks(micros) w:2099 14ms
Counter example showing an update which yields and yet succeeds:
2015-02-04T12:00:03.260+0000 [conn4] update foo.bar query: { _id: 6, c.sb.name: { $ne: "Update::6::221::d2b6ab90-cd2a-4dd5-b246-c06276d33768" } } update: { $addToSet: { c.sb: { name: "Update::6::221::d2b6ab90-cd2a-4dd5-b246-c06276d33768" } } } nscanned:1 nscannedObjects:1 nMatched:1 nModified:1 keyUpdates:0 numYields:1 locks(micros) w:4004 8ms
I have tested this on MongoDB 2.6.5, 2.6.7 and 3.0.0-rc7 - all are affected.
I have tested my java version on 2.4.12 and it does not appear to be affected, indicating this is a regression.
- duplicates
-
SERVER-14703 Snapshot queries can miss records if there are concurrent updates
- Closed
- related to
-
SERVER-12018 Multi-update can update same doc more than once
- Closed
-
SERVER-14766 Indexed queries should not miss documents where neither the queried nor indexed fields change during the life of the query.
- Closed
-
SERVER-13992 UpdateTicket should handle record moves
- Closed
-
SERVER-700 Overwrite index key rather than delete/reinsert if new value would end up in the same place
- Closed