- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: 4.4.3, 5.0.0-rc0
- Component/s: None
- None
- Server Security
- ALL
- v5.0, v4.4
- Execution Team 2021-10-04
SERVER-30888 introduced the possibility that FTDC might be missing the serverStatus.wiredTiger, serverStatus.oplog, and/or local.oplog.rs.stats sections. SERVER-48221 may have introduced a similar issue for the serverStatus.oplogTruncation and serverStatus.encryptionAtRest sections. There might be other potentially omitted sections that I didn't find.
This can cause frequent schema changes that reduce FTDC compression efficiency and limit retention. For example, in one deployment FTDC retention was reduced to less than 2 days, compared to a typical retention of closer to a week. The missing data can also cause us to miss important events in FTDC.
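To illustrate why intermittently missing sections hurt retention, here is a simplified Python model. It assumes that any change in the field structure between consecutive samples forces a new compressed chunk to be started. `schema_of` and `count_chunks` are hypothetical names made up for this sketch, and the sample documents are invented; this is not the server's implementation.

```python
def schema_of(sample):
    """Return the nested key structure of a sample as a hashable tuple."""
    return tuple(
        (key, schema_of(value) if isinstance(value, dict) else None)
        for key, value in sample.items()
    )


def count_chunks(samples):
    """Count chunks, assuming a new chunk starts whenever the schema changes."""
    chunks = 0
    previous = None
    for sample in samples:
        current = schema_of(sample)
        if current != previous:
            chunks += 1
            previous = current
    return chunks


# Invented samples: one with the wiredTiger section, one without it.
full = {"serverStatus": {"connections": 1, "wiredTiger": {"cache": 2}}}
partial = {"serverStatus": {"connections": 1}}

stable_chunks = count_chunks([full] * 6)             # section always present
flapping_chunks = count_chunks([full, partial] * 3)  # section comes and goes
```

In this model, six samples with a stable schema fit in one chunk, while six samples in which the section alternates between present and missing need six chunks, so far less delta compression is possible for the same amount of data.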
It looks to me like the primary issue might be that we're using an extremely short timeout for acquiring the locks needed for collecting these sections, so it might be sufficient to increase the timeout to a substantial fraction of a second, although that needs verification.
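As a rough sketch of the suspected mechanism, the following Python example shows how a collector that gives up on a lock after a very short timeout omits its section, while a longer timeout lets it wait out a brief holder. `collect_section` and `hold_lock_briefly` are hypothetical helpers written for this illustration, and the timeout values are arbitrary; the actual server code and its timeouts may differ.

```python
import threading
import time


def collect_section(lock, read_stats, timeout_seconds):
    """Return the section's stats, or None if the lock was not acquired in time."""
    if not lock.acquire(timeout=timeout_seconds):
        return None  # the section is omitted from this sample
    try:
        return read_stats()
    finally:
        lock.release()


def hold_lock_briefly(lock, hold_seconds):
    """Start a thread that holds the lock for hold_seconds; return the thread."""
    acquired = threading.Event()

    def holder():
        with lock:
            acquired.set()
            time.sleep(hold_seconds)

    thread = threading.Thread(target=holder)
    thread.start()
    acquired.wait()  # make sure the lock is really held before returning
    return thread
```

With another thread holding the lock for 0.3 seconds, `collect_section(lock, read_stats, 0.001)` returns `None`, whereas the same call with a timeout of a few seconds waits for the holder to finish and returns the stats. Whether a longer timeout is acceptable for the real collection path still needs verification, as noted above.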
- depends on
  - SERVER-70031 Ensure WT is open when generating WiredTiger statistics. (Closed)
- is related to
  - SERVER-30888 Have FTDC code paths obtain locks with a timeout. (Closed)
  - SERVER-48221 Shut down ftdc after storage engine (Closed)
- related to
  - SERVER-60168 Allow serverStatus reading data during the RECOVERING member state (Open)
  - SERVER-33326 Remove use of applyOps/doTxn from sharding chunk operations (Closed)