-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.4.14
-
Component/s: Storage
-
None
-
ALL
-
Storage Engines 2019-02-25
-
1
Hi, sorry, but we had another occurrence today (still running 3.4.13), so there's still an issue here. We modified our collection-dropping code to sleep 10 sec between deletions (to give mongo some time to recover after the "short" global lock and not kill the platform), but unfortunately this wasn't enough and it still killed performance across the platform.
After investigation, I found that this was caused by collection deletions. I tried to upload the diagnostic.data, but the portal specified earlier no longer accepts files. I can upload it if you provide another portal.
Here is the log from the drop queries: mongo_drop_log.txt. It shows that the drops are spaced 10 sec apart (plus drop duration) and that each drop takes A LOT of time (all these collections were empty or had 5 records at most). They did have some indexes, which are not shown here but probably had to be destroyed at the same time. I don't know if it's a checkpoint global lock issue again, but it's definitely still not possible to drop collections in a big 3.4.13 mongo without killing it. For the record, we have ~40k namespaces; this has not changed much since the db.stats I reported above.
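For readers who want to reproduce the workaround, the throttled drop loop described in this comment could look roughly like the sketch below. It is not the reporter's actual code: the function name `drop_with_pause` and its parameters are hypothetical, and the drop operation is passed in as a callable so the loop can be exercised without a live server.

```python
import time
from typing import Callable, Iterable, List, Tuple


def drop_with_pause(
    names: Iterable[str],
    drop: Callable[[str], None],
    pause_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, float]]:
    """Drop collections one at a time, pausing between drops.

    Returns a list of (collection name, drop duration in seconds) so that
    slow drops can be spotted without digging through the server log.
    """
    timings: List[Tuple[str, float]] = []
    for i, name in enumerate(names):
        if i > 0:
            # Give the server time to recover before the next drop.
            sleep(pause_s)
        start = clock()
        drop(name)
        timings.append((name, clock() - start))
    return timings


# Assumed usage with pymongo (requires a running server, so left commented out):
#   from pymongo import MongoClient
#   db = MongoClient()["mydb"]  # "mydb" is a placeholder database name
#   for name, seconds in drop_with_pause(["tmp_a", "tmp_b"], db.drop_collection):
#       print(f"{name}: {seconds:.1f}s")
```

Recording the per-drop duration alongside the pause makes it easy to compare against the "10 sec + drop duration" spacing visible in the server log.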
And before you say this is probably fixed in a more recent version, we'll need better proof than last time considering the high risk of upgrading...
- is related to
-
SERVER-27700 WT secondary performance drops to near-zero with cache full
- Closed
-
SERVER-32890 Background index creation sometimes block whole server for dozen of seconds (even with empty collections)
- Closed
-
WT-1598 Remove the schema, table locks
- Closed
-
SERVER-32424 Use WiredTiger cursor caching
- Closed
-
SERVER-38779 Build a mechanism to periodically cleanup old WT sessions from session cache
- Closed