- Type: New Feature
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Workload Scheduling
- (copied to CRM)
Currently, when a mongod process runs out of disk space and fails to preallocate a file or write to the journal, it responds by terminating the server process.
This leaves the deployment in a difficult position, because a remove operation, which would reclaim space, will itself fail. Operations that write to disk temporarily, such as external sorts or temporary aggregation results, will have the same problem.
A more graceful approach would be to let us limit mongod disk utilization to some threshold short of filling the disk, so that cleanup and stabilization of the system remain possible.
Something like "stop accepting writes (other than removes) if less than 10% (or some number of GB) of disk space is available", or "if preallocation of the final datafile fails due to lack of space (2GB), stop accepting writes aside from removes", would be much more graceful. This would of course mean $out and external sorts would fail as well, but it would spare operators from dealing with all the other issues associated with a full disk.
Of course there are edge cases to consider. For example, if a secondary hits this threshold it can no longer replicate, so it should be marked as down or unavailable with respect to the quorum (which I believe already happens). But then how do we process cleanup if it can't replicate the removes? We would just have to increase capacity or do a full resync in situations where a secondary runs out of disk before the primary does.
But for the general case, this would be a huge win, whether the number is configurable or not.
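The proposed policy can be sketched as a pure admission check. This is an illustrative sketch only, not mongod code: the function name, the operation label "remove", and the exact thresholds (the 10% and 2GB figures are taken from the ticket text) are all assumptions.

```python
# Hypothetical sketch of the proposed low-disk write-admission policy.
# Thresholds mirror the figures suggested in the ticket; the rest is assumed.
MIN_FREE_FRACTION = 0.10      # "less than 10% disk space available"
MIN_FREE_BYTES = 2 * 1024**3  # "lack of space (2GB) for the final datafile"

def writes_allowed(free_bytes, total_bytes, op):
    """Return True if the operation may proceed under the sketched policy."""
    low_space = (free_bytes < MIN_FREE_BYTES
                 or free_bytes / total_bytes < MIN_FREE_FRACTION)
    # Below the threshold, only space-reclaiming operations are accepted.
    return (not low_space) or op == "remove"

# With ample free space, ordinary writes proceed:
assert writes_allowed(100 * 1024**3, 500 * 1024**3, "insert")
# Below the threshold, inserts are rejected but removes still run:
assert not writes_allowed(1 * 1024**3, 500 * 1024**3, "insert")
assert writes_allowed(1 * 1024**3, 500 * 1024**3, "remove")
```

In a real implementation the free/total figures would come from a periodic filesystem check (e.g. statvfs on the dbpath), and $out and external-sort stages would be treated as rejected writes.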
- is duplicated by
  - SERVER-15952 mongod hits assertion when run out of disk space (Closed)
  - SERVER-15959 Running out of disk space should not entirely crash server (Closed)
- is related to
  - SERVER-3759 filesystem ops may cause termination when no space left on device (Closed)