-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Storage Execution
-
ALL
We have several commands that operate on more than one collection and therefore have to hold more than one collection lock at the same time. It's important that these commands acquire collection locks in the same order; otherwise the server can deadlock.
For example, cloneCollectionAsCapped takes collection locks in the order "fromNs" and then "toNs":
AutoGetCollection autoColl(opCtx, fromNs, MODE_X);
Lock::CollectionLock collLock(opCtx, toNs, MODE_X);
This means that if two cloneCollectionAsCapped commands run concurrently with the same two namespaces in opposite order, each can hold its "fromNs" lock while waiting for the other's, and the server deadlocks. I verified this locally.
Typically we sort collection locks so that they're always acquired in the same order. Unfortunately, we don't always sort on the same property. For example, renameCollection sorts the collections on ResourceID before acquiring locks, which prevents it from deadlocking with itself. But cloneCollectionAsCapped doesn't sort at all, so the two commands may deadlock with each other (I haven't verified this).
This sort also differs from the one the ShardingDDLCoordinator uses, which sorts on the namespace string rather than on ResourceID.
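To illustrate why mixing sort keys is a problem, here is a minimal standalone sketch (not server code). `CollectionRef`, `orderByResourceId`, `orderByNamespace`, and the numeric id values are all hypothetical stand-ins; the point is only that two valid sort keys can produce opposite acquisition orders for the same pair of collections.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a collection: a namespace string plus an opaque
// numeric lock id. The id values used with this type are made up.
struct CollectionRef {
    std::string ns;
    std::uint64_t resourceId;
};

// Acquisition order when a caller sorts on the numeric id.
std::vector<std::string> orderByResourceId(std::vector<CollectionRef> colls) {
    std::sort(colls.begin(), colls.end(),
              [](const CollectionRef& a, const CollectionRef& b) {
                  return a.resourceId < b.resourceId;
              });
    std::vector<std::string> order;
    for (const auto& c : colls) order.push_back(c.ns);
    return order;
}

// Acquisition order when a caller sorts on the namespace string.
std::vector<std::string> orderByNamespace(std::vector<CollectionRef> colls) {
    std::sort(colls.begin(), colls.end(),
              [](const CollectionRef& a, const CollectionRef& b) {
                  return a.ns < b.ns;
              });
    std::vector<std::string> order;
    for (const auto& c : colls) order.push_back(c.ns);
    return order;
}
```

With the made-up collections {"test.a", id 42} and {"test.b", id 7}, the namespace sort locks "test.a" first while the id sort locks "test.b" first. Two commands that each sort consistently, but on different keys, can still acquire the same pair of locks in opposite orders.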
Other sort orders may be in use as well. We should ensure that every caller sorts the same way, perhaps by providing a Storage Execution API for acquiring multiple collection locks.
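As a rough sketch of what such an API could look like, the standalone example below uses plain `std::mutex` rather than the server's lock manager. `LockTable`, `MultiLock`, and `lockAll` are hypothetical names, and sorting on the namespace string is an arbitrary choice for illustration; what matters is that every caller goes through one helper that picks the order.

```cpp
#include <algorithm>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical lock table: one mutex per namespace string.
class LockTable {
public:
    // Owns the acquired locks and releases them on destruction.
    struct MultiLock {
        std::vector<std::string> order;  // order in which locks were taken
        std::vector<std::unique_lock<std::mutex>> held;
    };

    // Acquires every requested lock in one canonical order, regardless of
    // the order in which the caller listed the namespaces.
    MultiLock lockAll(std::vector<std::string> namespaces) {
        std::sort(namespaces.begin(), namespaces.end());
        // Drop duplicates so the same mutex is never locked twice.
        namespaces.erase(std::unique(namespaces.begin(), namespaces.end()),
                         namespaces.end());

        MultiLock result;
        for (const auto& ns : namespaces) {
            result.held.emplace_back(mutexFor(ns));  // blocks until acquired
            result.order.push_back(ns);
        }
        return result;
    }

private:
    // std::map never relocates its nodes, so the returned reference stays valid.
    std::mutex& mutexFor(const std::string& ns) {
        std::lock_guard<std::mutex> guard(_tableMutex);
        return _mutexes[ns];
    }

    std::mutex _tableMutex;
    std::map<std::string, std::mutex> _mutexes;
};
```

A caller would write something like `auto locks = table.lockAll({toNs, fromNs});` and get the same acquisition order as a caller passing `{fromNs, toNs}`, so the argument order of a command no longer decides the lock order.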
- related to SERVER-89732 Create multi-client fuzzer (Closed)