-
Type: Improvement
-
Resolution: Won't Fix
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
This new behavior would help in the HELP ticket incident. The Enterprise Watchdog monitors storage health, but a Community edition mongod primary can stay stuck on a faulty drive for hours without stepping down. The Watchdog detects this problem quickly, whereas the Community edition has no good answer for it at all.
The Enterprise Watchdog will continue to provide the premium service, while the Community edition will get a more generic, slower solution that still prevents a multi-hour outage. The reaction times differ by design, which maintains the service differentiation: the Watchdog can detect such an outage in as little as 10-30 seconds (depending on configuration), while the thread liveness monitor will achieve the same result after 5-10 minutes of outage.
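To illustrate the idea only, here is a minimal sketch of a deadline-based liveness check, written in Python for brevity. It is not the server's implementation: the class name ThreadLivenessMonitor, the helper check_once, the thread names, and the step-down callback are all hypothetical. The only value drawn from this ticket is the 5-10 minute reaction window.

```python
import threading
import time
from typing import Callable, Dict, List


class ThreadLivenessMonitor:
    """Hypothetical sketch: reports threads whose last progress mark is too old."""

    def __init__(self, deadline_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._deadline = deadline_seconds
        self._clock = clock  # injectable so the check can be tested without sleeping
        self._lock = threading.Lock()
        self._last_progress: Dict[str, float] = {}

    def mark_progress(self, thread_name: str) -> None:
        """Called by a monitored thread each time it completes a unit of work."""
        with self._lock:
            self._last_progress[thread_name] = self._clock()

    def stuck_threads(self) -> List[str]:
        """Return the names of threads silent for longer than the deadline."""
        now = self._clock()
        with self._lock:
            return sorted(name for name, seen in self._last_progress.items()
                          if now - seen > self._deadline)


def check_once(monitor: ThreadLivenessMonitor,
               on_stuck: Callable[[List[str]], None]) -> bool:
    """Run one check; invoke on_stuck (e.g. a step-down hook) if anything is stuck."""
    stuck = monitor.stuck_threads()
    if stuck:
        on_stuck(stuck)
    return bool(stuck)


if __name__ == "__main__":
    # Assumed deadline of 300 seconds, the low end of the 5-10 minute window.
    monitor = ThreadLivenessMonitor(deadline_seconds=300.0)
    monitor.mark_progress("journal_flusher")
    check_once(monitor, lambda names: print("would step down, stuck:", names))
```

A real monitor would run the check periodically from its own thread, so that it does not depend on the storage path it is guarding. The long deadline reflects the trade-off described above: slower detection than the Watchdog, but far short of a multi-hour outage.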
Assigning to shameek.ray so that this ticket can be marked as blocked on the PM ticket he is creating.