Queued table drops within the WiredTigerKVEngine can compete with each other.
When a workload creates and drops a large number of tables in quick succession (for example, multiple concurrent map/reduce jobs), MongoDB creates many tables and then drops them rapidly. Because the instance is still "hot" at that point, with many create and drop operations in flight at once, many of these drops cannot complete immediately and are queued to be retried later.
Processing that queue is triggered by a close of all sessions, at which point any number of threads can enter the code path that drops all queued collections, each starting from the top of the queue. With multiple threads in the drop code at once, we often see several attempts to drop the same collection before a single drop succeeds, and further collection drops are often queued in the meantime. This produces an almost quadratic case: a high rate of collection creates and drops causes more and more drops to be queued.
One condition needed to reproduce this is a small delay in processing the deletes. I suspect it occurs on OS X, where we must use the slower fsync, or on Linux under strace, which slows the system calls.
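The contention pattern above can be sketched with a toy model. This is not the actual WiredTigerKVEngine code; it is a deterministic simulation assuming that on each pass every thread rescans the whole drop queue from the top, only one attempt per pass does useful work, and new drops arrive while the scan is in flight. The function name and parameters are illustrative only.

```python
def wasted_work(initial_queued, threads, arrivals_per_pass, passes):
    """Toy model of queued-drop contention (not MongoDB code).

    Each pass, every thread rescans the entire drop queue and attempts
    every entry, but only one drop actually succeeds; meanwhile new
    drops keep being queued. Returns (total attempts, final queue size).
    """
    queued = initial_queued
    attempts = 0
    for _ in range(passes):
        attempts += queued * threads    # every thread tries every entry
        queued = max(queued - 1, 0)     # only one attempt does real work
        queued += arrivals_per_pass     # new drops queued in the meantime
    return attempts, queued

# With arrivals matching the drop rate, the queue never shrinks and
# attempts stay proportional to queue size times thread count per pass;
# once arrivals exceed one per pass, the queue grows each pass and total
# attempts grow quadratically with time.
print(wasted_work(10, 4, 1, 5))   # queue stays flat
print(wasted_work(10, 4, 2, 5))   # queue and attempts keep growing
```

The model only illustrates why a small per-drop delay is enough to tip the system over: any backlog that survives one pass inflates the amount of redundant scanning every thread does on the next one.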
- related to: WT-2646 "Split the lock_wait flag into two, adding a checkpoint_wait flag" (Closed)