There is an issue when using a PSSA architecture where one node is hidden with 0 votes and 0 priority. It occurs when the node with 0 votes goes down for some reason and the following write is issued:
db.test.insert({a:1},{writeConcern: {w: 3, wtimeout: 10000}})
This is expected to fail because there are not enough data-bearing nodes available to satisfy the writeConcern.
The write actually succeeds though:
WriteResult({ "nInserted" : 1 })
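For reference, a replica set configuration along the following lines reproduces the described topology (the set name, hostnames, and ports are hypothetical; member 2 is the hidden, non-voting node that is taken down before the write):

rs.initiate({
    _id: "rs0",
    members: [
        { _id: 0, host: "node0:27017" },                                      // primary-eligible, voting
        { _id: 1, host: "node1:27017" },                                      // secondary, voting
        { _id: 2, host: "node2:27017", priority: 0, votes: 0, hidden: true }, // hidden, non-voting secondary
        { _id: 3, host: "node3:27017", arbiterOnly: true }                    // arbiter
    ]
})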
In this architecture, only two nodes need to receive the write for it to be considered replicated to a majority of nodes (because only voting nodes count toward the majority). Once both the primary and the secondary apply the write, it is committed, and the arbiter is sent the new lastCommittedOpTime. To determine whether the writeConcern is satisfied, the topology coordinator looks at every node in the replica set to see if enough of them have replicated the write. This includes the arbiter, which reports the lastCommittedOpTime it was just sent as its lastAppliedOpTime. So even though the write was replicated to only 2 data-bearing nodes, the topology coordinator concludes it was replicated to 3 nodes and reports the writeConcern as satisfied.
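A rough JavaScript-style sketch of the check described above (this is not the actual TopologyCoordinator code, only an illustration of the counting mistake):

// Count how many members have applied the write's optime. The arbiter is
// counted too, and because it reports the lastCommittedOpTime it was just
// sent as its lastAppliedOpTime, it passes the comparison despite holding no data.
function writeConcernSatisfied(members, writeOpTime, w) {
    var count = 0;
    members.forEach(function (member) {
        if (member.lastAppliedOpTime >= writeOpTime) {
            count += 1;
        }
    });
    return count >= w;  // 3 >= 3 here, so w:3 is reported satisfied with only 2 data-bearing copies
}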
Causes: SERVER-40355 rs.config that contains an _id greater than the number of nodes will crash (Closed)