- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Aggregation Framework, Querying
- None
- Fully Compatible
- ALL
- QuInt 8 08/28/15
This is a deadlock that occurs when a shard is trying to receive a chunk while also acting as the merge node for an aggregation sort. Three threads in a single process are involved.
Thread 1 is servicing a getmore request for an aggregation pipeline. A relevant portion of its stack trace is below:
#7 0x000000000159bf52 in mongo::MessagingPort::recv (this=0x6f5fe40, toSend=..., response=...) at src/mongo/util/net/message_port.cpp:227
#8 0x000000000159c2e5 in mongo::MessagingPort::call (this=<optimized out>, toSend=..., response=...) at src/mongo/util/net/message_port.cpp:222
#9 0x0000000000d7921f in mongo::DBClientConnection::call (this=0x3c77c20, toSend=..., response=..., assertOk=<optimized out>, actualServer=<optimized out>) at src/mongo/client/dbclient.cpp:1478
#10 0x0000000000d8b7dc in mongo::DBClientCursor::requestMore (this=this@entry=0x3bcc648) at src/mongo/client/dbclientcursor.cpp:210
#11 0x0000000000d8bcc7 in mongo::DBClientCursor::more (this=0x3bcc648) at src/mongo/client/dbclientcursor.cpp:341
#12 0x00000000010adf73 in mongo::sorter::MergeIterator<mongo::Value, mongo::Document, mongo::DocumentSourceSort::Comparator>::next() ()
#13 0x00000000010aa611 in mongo::DocumentSourceSort::getNext (this=<optimized out>) at src/mongo/db/pipeline/document_source_sort.cpp:71
#14 0x0000000000f59740 in mongo::PipelineProxyStage::getNextBson (this=this@entry=0x3c75840) at src/mongo/db/exec/pipeline_proxy.cpp:131
#15 0x0000000000f5a34f in mongo::PipelineProxyStage::isEOF (this=0x3c75840) at src/mongo/db/exec/pipeline_proxy.cpp:87
#16 0x0000000001107e84 in mongo::shouldSaveCursorGetMore (finalState=<optimized out>, exec=exec@entry=0x3c75920, isTailable=<optimized out>) at src/mongo/db/query/find.cpp:122
#17 0x0000000001108e6d in mongo::getMore (txn=txn@entry=0x7fffe7a02bd0, ns=ns@entry=0x3c7b5e4 "aggShard.server9444", ntoreturn=ntoreturn@entry=0, cursorid=cursorid@entry=35834991707, exhaust=exhaust@entry=0x7fffe7a01fbd, isCursorAuthorized=isCursorAuthorized@entry=0x7fffe7a01fbe) at src/mongo/db/query/find.cpp:414
We can see that thread 1 is blocked because shouldSaveCursorGetMore has asked its pipeline to answer the predicate isEOF, the PipelineProxyStage has turned that into a getNext(), and that in turn causes the sorter to fetch data from one of its remote cursors. At this point, thread 1 holds the database lock for the source of the aggregation in MODE_IS. Notably, the "remote cursor" is actually in the same process; the stack frames near the top are engaging the network layer to communicate with this very process on another thread.
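The hazardous shape in thread 1 is holding the database lock across a synchronous network round trip. Below is a minimal sketch of that shape in standard C++; it does not use MongoDB's actual lock manager or client classes, and fetchNextFromRemote() is a hypothetical stand-in for the DBClientCursor::more()/requestMore() call seen in the trace.

#include <shared_mutex>

struct Document {};

std::shared_mutex dbLock;  // stands in for the per-database lock; a shared_lock plays the role of MODE_IS

// Hypothetical stand-in for a synchronous getmore to a "remote" cursor that may
// in fact live in this very process, serviced by another thread.
Document fetchNextFromRemote() { return Document{}; }

Document getNextSorted() {
    std::shared_lock<std::shared_mutex> lk(dbLock);  // acquired for the whole getmore
    // While the lock is held, the sorter blocks on the network; the "remote" side
    // must take its own lock on the same database before it can answer.
    return fetchNextFromRemote();
}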
At this point, thread 2 starts the migration driver run on the recipient of a new chunk. The other shard in this cluster (not pictured) has started donating a chunk to this node, coincidentally on the same database the aggregation is running against. Thread 2 blocks trying to acquire the database lock in MODE_X, because thread 1 holds it in MODE_IS. Its stack trace is below.
#13 0x0000000000f00cf7 in mongo::Lock::DBLock::DBLock (this=0x7fffd7602c10, locker=0x3c7b9e0, db=..., mode=<optimized out>) at src/mongo/db/concurrency/d_concurrency.cpp:110
#14 0x00000000012970f1 in mongo::MigrationDestinationManager::_migrateDriver (this=this@entry=0x1ffbbc0 <mongo::(anonymous namespace)::migrateDestManager>, txn=txn@entry=0x7fffd7602e10, ns=..., min=..., max=..., shardKeyPattern=..., fromShard=..., epoch=..., writeConcern=...) at src/mongo/db/s/migration_destination_manager.cpp:434
Now, thread 3 starts trying to process the getmore request issued to it by thread 1. It attempts to acquire the same database lock as threads 1 and 2 in MODE_IS. However, because thread 2 is already waiting for MODE_X, no new thread can be granted MODE_IS. We are now deadlocked. Here's thread 3's stack:
#13 0x0000000000f00cf7 in mongo::Lock::DBLock::DBLock (this=0x648b430, locker=0x3bf1e00, db=..., mode=<optimized out>) at src/mongo/db/concurrency/d_concurrency.cpp:110
#14 0x0000000001108e12 in make_unique<mongo::Lock::DBLock, mongo::Locker*, mongo::StringData, mongo::LockMode> () at src/third_party/boost-1.56.0/boost/smart_ptr/make_unique_object.hpp:28
#15 mongo::getMore (txn=txn@entry=0x7ffff2ffbbd0, ns=ns@entry=0x5cf1084 "aggShard.server9444", ntoreturn=ntoreturn@entry=0, cursorid=cursorid@entry=36214372988, exhaust=exhaust@entry=0x7ffff2ffafbd, isCursorAuthorized=isCursorAuthorized@entry=0x7ffff2ffafbe) at src/mongo/db/query/find.cpp:404
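To make the lock interaction concrete, here is a toy model of the grant rule that produces the hang. It is not the server's LockManager; it only encodes the two facts described above: MODE_IS conflicts with MODE_X, and a new request must also wait behind any already-queued conflicting request, so a pending MODE_X keeps further MODE_IS requests from being granted.

#include <cassert>
#include <vector>

enum class Mode { IS, X };

// MODE_IS is compatible with MODE_IS; MODE_X conflicts with everything.
bool conflicts(Mode a, Mode b) {
    return a == Mode::X || b == Mode::X;
}

struct LockHead {
    std::vector<Mode> granted;  // requests currently holding the lock
    std::vector<Mode> pending;  // conflicting requests queued in FIFO order
};

bool canGrant(const LockHead& lock, Mode requested) {
    for (Mode m : lock.granted)
        if (conflicts(requested, m)) return false;
    // Fairness: also wait behind any queued conflicting request, so a pending
    // MODE_X blocks all new MODE_IS grants.
    for (Mode m : lock.pending)
        if (conflicts(requested, m)) return false;
    return true;
}

int main() {
    LockHead dbLock;

    // Thread 1: granted MODE_IS for the aggregation getmore, then blocks on the network.
    assert(canGrant(dbLock, Mode::IS));
    dbLock.granted.push_back(Mode::IS);

    // Thread 2: the migration destination requests MODE_X and must wait.
    assert(!canGrant(dbLock, Mode::X));
    dbLock.pending.push_back(Mode::X);

    // Thread 3: the getmore thread 1 is waiting on requests MODE_IS, but the pending
    // MODE_X blocks it. Thread 1 never finishes, so it never releases MODE_IS: deadlock.
    assert(!canGrant(dbLock, Mode::IS));
}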
Relevant build failures:
https://evergreen.mongodb.com/task/mongodb_mongo_master_solaris_64_bit_aggregation_aa82208d53a4b642441adbadd4bd0595c4fa40ae_15_08_11_21_38_16
is related to:
- SERVER-18203 Aggregation distributed deadlock with sharding (Closed)
- SERVER-17839 Remove PlanStage::isEOF (Backlog)
- SERVER-18841 Make awaitData cursors block rather than busy wait in the OP_GET_MORE path (Closed)