Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 1.8.2
Component/s: None
None
ALL
One of our mongos instances started spewing out this message:
Mon Aug 29 00:40:36 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
It wrote 6.5 gigabytes of identical lines (same timestamp and all), which filled up the root partition of the server. All queries running through that mongos started failing at the same time, which caused the application to fail.
However, mongos is still running and holding 300 TCP connections open (according to lsof).
From what I can see, it started about an hour earlier, when mongos was unable to connect to the cluster. That went on until it could connect to all the nodes except one. Just minutes before it started spewing out the messages about too many open files, it managed to connect to the last one too. It then wrote the same message until the disk filled up.
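To illustrate the failure mode, here is a minimal Python sketch of an accept loop that backs off and rate-limits its log output when accept() fails with EMFILE or ENFILE, rather than retrying immediately and logging every failure. This is not how mongos is implemented (mongos is C++); `RateLimitedLogger`, `accept_loop`, and all their parameters are hypothetical names chosen for this example.

```python
import errno
import time


class RateLimitedLogger:
    """Emit a given message at most once per `interval` seconds.

    Hypothetical helper: repeats inside the interval are counted, not written.
    """

    def __init__(self, emit, interval=1.0, clock=time.monotonic):
        self._emit = emit          # callable that actually writes a line
        self._interval = interval
        self._clock = clock
        self._last = {}            # message -> time it was last written
        self._suppressed = {}      # message -> repeats dropped since then

    def log(self, message):
        now = self._clock()
        last = self._last.get(message)
        if last is not None and now - last < self._interval:
            self._suppressed[message] = self._suppressed.get(message, 0) + 1
            return False
        dropped = self._suppressed.pop(message, 0)
        suffix = f" (repeated {dropped} more times)" if dropped else ""
        self._emit(message + suffix)
        self._last[message] = now
        return True


def accept_loop(listener, handle, logger, backoff=0.05, max_backoff=1.0,
                sleep=time.sleep, should_stop=lambda: False):
    """Accept connections; on fd exhaustion, log once and back off."""
    delay = backoff
    while not should_stop():
        try:
            conn, addr = listener.accept()
        except OSError as exc:
            if exc.errno in (errno.EMFILE, errno.ENFILE):
                # Out of file descriptors: retrying at once would spin and
                # flood the log, so wait with exponential backoff instead.
                logger.log(f"accept() failed: errno:{exc.errno} {exc.strerror}")
                sleep(delay)
                delay = min(delay * 2, max_backoff)
                continue
            raise
        delay = backoff  # a successful accept resets the backoff
        handle(conn, addr)
```

With a real listening socket this would be called as `accept_loop(sock, handler, RateLimitedLogger(print))`. The backoff bounds CPU use while descriptors are exhausted, and the logger bounds disk use. Neither frees any descriptors, so the underlying fd leak or limit still has to be fixed separately.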
is related to
- SERVER-3707 Don't try to accept() if out of fds (Closed)
related to
- SERVER-3706 Mongos should allocate a lower percentage of fds to connTicketHolder (Backlog)
- SERVER-3708 Create a BackgroundJob that tracks available fds (Closed)