- Type: Bug
- Resolution: Done
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Replication
- Fully Compatible
- v8.0
- Repl 2024-04-29, Repl 2024-05-13, Repl 2024-05-27, Repl 2024-06-10
As mentioned in the FCV README, it is only safe to check a feature flag while holding the global lock in IX or X mode. Holding the lock ensures that the FCV cannot transition all the way from upgraded to downgraded (or from downgraded to upgraded) during the lifetime of an operation.
This rule is neither enforced nor well known, so there are a few places in the code where we check a feature flag without holding the global lock in IX or X mode:
- In createCollection, when checking for collection options (this makes secondaries crash; SERVER-88964 will fix it)
- In bulkWrite
- In analyzeCmd
(There may be other cases as well; I haven't checked.)
This means that a node may be in the fully downgraded state yet still allow a command that is only executable in the upgraded state to run.
Note that the latter two examples don't seem harmful, because those commands don't persist data in a new format. We might therefore want a way to distinguish commands that cause data to be persisted in a new format from commands whose feature flag check is cosmetic (like bulkWrite).
It's also worth thinking about what may happen on a sharded cluster if some shards process the command while others reject it.
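To make the race above concrete, here is a minimal toy model in Python. It is not server code, and everything in it is an assumption made for illustration: `ToyNode`, `global_intent_lock`, `begin_transition`, `try_finish_transition`, `run_unguarded`, and `run_guarded` are hypothetical names, and the model assumes that setFCV first enters a transitional state and can only reach the final state once no operation holds the global lock. The concurrent setFCV is simulated by a callback that runs between the flag check and the point where the command executes.

```python
from contextlib import contextmanager
from enum import Enum


class Fcv(Enum):
    DOWNGRADED = "downgraded"
    DOWNGRADING = "downgrading"
    UPGRADED = "upgraded"


class ToyNode:
    """Toy, single-threaded stand-in for a node. Not MongoDB's real API."""

    def __init__(self, fcv):
        self.fcv = fcv
        self.intent_holders = 0  # operations "holding the global lock in IX / X"

    @contextmanager
    def global_intent_lock(self):
        # Hypothetical stand-in for taking the global lock in IX mode.
        self.intent_holders += 1
        try:
            yield
        finally:
            self.intent_holders -= 1

    def feature_enabled(self):
        # Assumption: the hypothetical flag is on only when fully upgraded.
        return self.fcv is Fcv.UPGRADED

    def begin_transition(self, transitional):
        # Assumption: entering the transitional state is not blocked.
        self.fcv = transitional

    def try_finish_transition(self, final):
        # Assumption: the final step must wait for lock holders to drain.
        if self.intent_holders > 0:
            return False
        self.fcv = final
        return True


def full_downgrade(node):
    """Simulated setFCV: try to go from upgraded all the way to downgraded."""
    node.begin_transition(Fcv.DOWNGRADING)
    return node.try_finish_transition(Fcv.DOWNGRADED)


def run_unguarded(node, concurrent_set_fcv):
    """Check the flag with no lock held, then 'execute' the command."""
    allowed = node.feature_enabled()
    concurrent_set_fcv()  # setFCV sneaks in between check and execution
    return allowed, node.fcv


def run_guarded(node, concurrent_set_fcv):
    """Check the flag and 'execute' the command under the lock."""
    with node.global_intent_lock():
        allowed = node.feature_enabled()
        concurrent_set_fcv()  # cannot complete while the lock is held
        return allowed, node.fcv


if __name__ == "__main__":
    node = ToyNode(Fcv.UPGRADED)
    allowed, fcv = run_unguarded(node, lambda: full_downgrade(node))
    print("unguarded:", allowed, fcv.value)

    node = ToyNode(Fcv.UPGRADED)
    allowed, fcv = run_guarded(node, lambda: full_downgrade(node))
    print("guarded:", allowed, fcv.value)
```

In this model the unguarded operation is admitted while upgraded but executes when the node is already fully downgraded, which is the situation described above. The guarded operation can still observe the transitional state, but never the complete upgraded -> downgraded transition within its own lifetime.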
- is related to
  - SERVER-88964 featureFlagRecordIdsReplicated check should be done while under the global lock in IX (Backlog)
  - SERVER-79269 Invariant that we don't check FCV in oplog application (Open)
- related to
  - SERVER-90971 Secondaries should call into lower level create collection API during oplog application (Backlog)
  - SERVER-91269 Make feature flag checks less racy with setFCV (Open)
- split to
  - SERVER-91213 Cluster Scalability: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91215 Query Optimization: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91216 Query Execution: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91217 Query Integration: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91218 Service Arch: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91219 Replication: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91220 Security: Audit feature flag checks for unsafe races with setFCV (Closed)
  - SERVER-91221 Catalog and Routing: Audit feature flag checks for unsafe races with setFCV (Closed)