Type: Question
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 3.0.5, 3.0.6
Component/s: Admin
Labels: None

We have a large sharded cluster in which a secondary occasionally becomes very slow and needs a restart to recover. All of our reads go to secondaries. In the log of the affected secondary, I see the number of connections rise by many thousands over the course of a couple of seconds. The log then fills with slow queries (over 100 ms each), which appears to be every query hitting the replica. In normal operation these queries take only a few ms. After that, I see many entries like this interspersed among the slow queries:

[conn233831] killcursors  keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 52009 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 83ms
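
To make the lock wait in an entry like the one above easier to see, here is a minimal Python sketch that pulls out the connection, the operation, the total duration, and the summed `timeAcquiringMicros` values. The function name `lock_wait_summary` and the regular expressions are illustrative assumptions written against this one sample line, not part of any MongoDB tooling, and other log formats may need different patterns.

```python
import re

# Sample entry from the report, joined onto one line.
LINE = (
    "[conn233831] killcursors  keyUpdates:0 writeConflicts:0 numYields:0 "
    "locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, "
    "timeAcquiringMicros: { r: 52009 } }, Database: { acquireCount: { r: 1 } }, "
    "Collection: { acquireCount: { r: 1 } } } 83ms"
)

# Patterns are assumptions derived from the sample line only.
CONN_RE = re.compile(r"\[(conn\d+)\]\s+(\w+)")
WAIT_RE = re.compile(r"timeAcquiringMicros: \{([^}]*)\}")
MODE_RE = re.compile(r"(\w+): (\d+)")
DURATION_RE = re.compile(r"(\d+)ms\s*$")


def lock_wait_summary(line):
    """Return lock-wait figures for one log entry, or None if it does not match."""
    conn = CONN_RE.search(line)
    duration = DURATION_RE.search(line)
    if conn is None or duration is None:
        return None

    # Sum every lock mode (r, w, R, W) in every timeAcquiringMicros block.
    wait_micros = 0
    for block in WAIT_RE.findall(line):
        for _mode, micros in MODE_RE.findall(block):
            wait_micros += int(micros)

    total_ms = int(duration.group(1))
    wait_ms = wait_micros / 1000.0
    return {
        "conn": conn.group(1),
        "op": conn.group(2),
        "total_ms": total_ms,
        "lock_wait_ms": wait_ms,
        "lock_wait_share": wait_ms / total_ms if total_ms else 0.0,
    }


if __name__ == "__main__":
    print(lock_wait_summary(LINE))
```

For the sample entry, the reported lock acquisition time (52009 microseconds) accounts for roughly 52 ms of the 83 ms total, which suggests that most of the time went to waiting for the Global lock rather than to the operation itself.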

Putting the replica into maintenance mode (sometimes for many hours) and then returning it to service does not fix the issue; the node continues to serve very slowly. Restarting the mongod process, however, does fix the problem. We have experienced this in versions 3.0.5 and 3.0.6, with both the mmapv1 and wiredTiger storage engines.

Assignee: Unassigned

Reporter: Dai Shi

Participants: Dai Shi, Ramon Fernandez Marina

Votes: 0

Watchers: 5

Created: Sep 30 2015 09:52:22 PM UTC

Updated: Oct 02 2015 03:10:48 PM UTC

Resolved: Oct 02 2015 03:10:48 PM UTC
